aplbrain / grand

Your favorite Python graph libraries, scalable and interoperable. Graph databases in memory, and familiar graph APIs for cloud databases.
Apache License 2.0

no nodes persisted in sqlite SQLBackend #50

Closed spranger closed 3 months ago

spranger commented 4 months ago

If I run:

```python
from grand import Graph
from grand.backends import SQLBackend

G = Graph(backend=SQLBackend(db_url="sqlite:///demo.db"))
G.nx.add_node("A", foo="bar")
```

demo.db contains the tables (grand_Nodes, grand_Edges), but both are empty: none of the node data is persisted.

j6k4m8 commented 4 months ago

Confirming I can reproduce this error on the latest version; looks like we're creating a .journal file but not flushing it back down to the db file... Will address this asap!

acthecoder23 commented 4 months ago

I just found grand and grand-cypher today while looking for a graph database solution that doesn't require complex installation and such.

I am also interested in the status of this, and as an aside, I'd be interested in learning more about these projects/contributing.

acthecoder23 commented 3 months ago

It looks like backends.SQLBackend never calls commit on SQLBackend._connection. The fix should be as simple as adding SQLBackend.commit() and SQLBackend.rollback() methods that delegate to the SQLAlchemy connection object.
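The missing-commit failure mode is easy to demonstrate with just the standard library's sqlite3 module (grand itself goes through SQLAlchemy; the table name below mirrors the one from the report, but this script is a standalone sketch, not grand's code):

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

# Writer connection: create the schema and commit it so other
# connections can see the table.
writer = sqlite3.connect(db_path)
writer.execute("CREATE TABLE grand_Nodes (id TEXT PRIMARY KEY, attrs TEXT)")
writer.commit()

# Insert a node row but do NOT commit -- this is the bug: the write
# sits in an open transaction (the .journal file) and never reaches
# the db file.
writer.execute("INSERT INTO grand_Nodes VALUES ('A', '{\"foo\": \"bar\"}')")

# A second connection (like reopening demo.db later) sees an empty table.
reader = sqlite3.connect(db_path)
rows_before = reader.execute("SELECT * FROM grand_Nodes").fetchall()

# The missing call: flush the transaction to disk.
writer.commit()
rows_after = reader.execute("SELECT * FROM grand_Nodes").fetchall()

print(rows_before)  # []
print(rows_after)   # [('A', '{"foo": "bar"}')]
```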

acthecoder23 commented 3 months ago

submitted a PR for this issue

j6k4m8 commented 3 months ago

Fixed by the awesome @acthecoder23 in #51!

acthecoder23 commented 3 months ago

@j6k4m8 do you have a list of TODOs that I can tackle? I have some cycles I can burn on this.

j6k4m8 commented 3 months ago

wow @acthecoder23, that rocks! Would love to chat with you about some of your use-cases and how we can synergize. Otherwise, there are a few thoughts in https://github.com/aplbrain/grand/issues/41, including a Neo4j backend, a Memgraph backend, a Kuzu backend... DynamoDB and Neptune thoughts are also cool but probably require some cloud dollars. Neo4j+Dynamo could perhaps be combined into a single Cypher backend; just thinking out loud...

Alternatively, I think there's a ton of really high-impact work in performance -- making the SQL and DataFrame backends more performant in particular (those seem to be our highest-traffic backends). Scale (millions of edges to hundreds of millions of edges) is a key capability here for sure!

Also: do you have thoughts on other backends or dialects we should support? Those are definitely high-impact for helping the largest number of users possible!

Some other ideas: adding more "shortcut" methods to backends, like the get_edge_count call; these improve runtime by reducing the number of times we have to iterate over vertices/edges.
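As a sketch of the "shortcut method" idea (class names and schema below are hypothetical, not grand's actual backend API): a baseline backend that counts edges by iterating in Python, versus one that pushes the count down into the database:

```python
import sqlite3

class IterBackend:
    """Baseline: derive the edge count by iterating (O(E) Python-side)."""
    def __init__(self):
        self.edges = [("A", "B"), ("B", "C")]

    def all_edges(self):
        yield from self.edges

    def get_edge_count(self):
        # Generic fallback: walk every edge just to count them.
        return sum(1 for _ in self.all_edges())

class SQLCountBackend:
    """Shortcut: let the database do the counting with COUNT(*)."""
    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute("CREATE TABLE edges (u TEXT, v TEXT)")
        self._conn.executemany(
            "INSERT INTO edges VALUES (?, ?)", [("A", "B"), ("B", "C")]
        )

    def get_edge_count(self):
        # One round-trip, no Python-side iteration.
        (n,) = self._conn.execute("SELECT COUNT(*) FROM edges").fetchone()
        return n

print(IterBackend().get_edge_count())      # 2
print(SQLCountBackend().get_edge_count())  # 2
```

Both return the same answer, but the second avoids materializing every edge in Python, which is where the win comes from at millions of edges.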

Adding more benchmarks to codspeed is another big help; it will improve our ability to monitor when PRs help or hurt the library's speed, which can be really important for a lot of our users. I.e., "what are some use-cases that people will commonly do, on large enough graphs that we can test them at PR time?" (Examples: https://github.com/aplbrain/grand/pull/48)

spranger commented 3 months ago

Great work from all of you. Many thanks.