apache / age

Graph database optimized for fast analysis and real-time data processing. It is provided as an extension to PostgreSQL.
https://age.apache.org
Apache License 2.0

Request for comments on using Apache AGE #1523

Open dpdjvhxm opened 5 months ago

dpdjvhxm commented 5 months ago

This RFC seeks comments on the following aspects of Apache AGE:

MironAtHome commented 3 months ago

I think it would be fair to say that having something like a "Cypher Query Tool" in pgAdmin would be extremely helpful. Integrating it with AGE Viewer would certainly bring the product closer to the level expected of a well-behaved graph database product.

On the performance side, I recently used a plpgsql procedure with a for loop over a dataset with 4 properties to create approximately 120,000 records for a labeled graph node. It takes roughly 1 millisecond to create a record on a Windows 10 laptop with an Intel Core CPU and an SSD, so all 120,000 rows complete in roughly 2 minutes. In my humble opinion this is extremely slow. At that rate a 1-billion-row table would take about 1 million seconds to commit, which is over 11 days. A scalable way to commit a table's worth of records from a standard rowset Postgres store to the graph representation would be a nice improvement.

A fairly common speed to keep in mind is 25,000 rows/second for a simple, straightforward ad hoc script with just a "CREATE" statement. For a bulk load, a few million rows committed per second is more or less common, depending on how the connection is handled (for instance, a TCP-based connection vs. a shared memory connection, with shared memory generally thought of as significantly faster). Please note, I am purposefully avoiding any complex architecture such as clusters. Comparison charts of graph database record ingestion speeds are easy to find with your favorite search engine, and there are well-established "number sense" values to look for.
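For reference, the arithmetic in the comment above can be checked with a small sketch. It only reproduces the numbers quoted there (120,000 rows in about 2 minutes, 25,000 rows/second for an ad hoc script); the 2,000,000 rows/second figure is an assumed stand-in for "a few million rows/second", and the function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def rows_per_second(rows: int, seconds: float) -> float:
    """Throughput implied by loading `rows` records in `seconds`."""
    return rows / seconds


def load_time_days(rows: int, rate: float) -> float:
    """Days needed to load `rows` records at `rate` rows per second."""
    return rows / rate / SECONDS_PER_DAY


# Observed in the comment: 120,000 rows in roughly 2 minutes.
observed = rows_per_second(120_000, 120)

scenarios = [
    ("observed plpgsql loop", observed),
    ("ad hoc CREATE script", 25_000),
    # Assumption: 2M rows/s stands in for "a few million rows/second".
    ("bulk load (assumed)", 2_000_000),
]

for label, rate in scenarios:
    days = load_time_days(1_000_000_000, rate)
    print(f"{label}: {rate:,.0f} rows/s, {days:.3f} days for 1 billion rows")
```

Under these assumptions, 1 billion rows take about 11.6 days at the observed rate, about 11 hours at 25,000 rows/second, and under 10 minutes at the assumed bulk rate.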

markgomer commented 3 months ago

@MironAtHome, have you, by any chance, applied this same test to another graph database tool, such as Neo4j, for instance?

MironAtHome commented 3 months ago

@markgomer not really in a shareable way; the work I did with different graph DB engines is tied to specific projects. I do have an esoteric ETL benchmark in the works that I plan to use as a standard benchmark, and I will share it once I have completed the comparison across various graph engines. Here is its legacy overview; since then the FAA data has changed a lot, and the framework needs to be reworked to stay relevant to real-world data. I see it as a project in and of itself, so it's not something with a quick turnaround. I did follow up on your question and searched for graph engine performance comparisons, and found quite a few links. Unfortunately, none of them reported a "bulk load time" that I could share here. I will update if something comes my way.

markgomer commented 3 months ago

Thanks for your effort, @MironAtHome! Please do share any findings here when you have them!

github-actions[bot] commented 1 month ago

This issue is stale because it has been open 45 days with no activity. Remove "Abondoned" label or comment or this will be closed in 7 days.