memgraph / gqlalchemy

GQLAlchemy is a library for writing and running queries on Memgraph. It supports a high-level connection to Memgraph as well as a modular query builder.
https://pypi.org/project/gqlalchemy/
Apache License 2.0

Add support for fetching a subgraph by Translator.get_instance() #319

Open mdkozlowski opened 4 days ago

mdkozlowski commented 4 days ago

Translator._parse_memgraph uses get_edges_from_db, which matches all edges in Memgraph. For large graphs this is slow and, in use cases like mine, unnecessary.

I've rewritten the Translator and DGLTranslator classes for my use case by adding a Match parameter to .get_instance, which is propagated down to the get_edges_from_db function. By providing a Match query on node labels and relationship properties, I can use the customised DGLTranslator to return subgraphs as part of a PyTorch/DGL Dataloader, which makes model training and prediction very convenient.
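To make the idea concrete, here is a rough, hypothetical sketch of the query-building part only. It does not use GQLAlchemy's actual API: `build_edge_query`, its parameters, and the generated Cypher are illustrative assumptions, and the real get_edges_from_db may be structured differently. It shows how an optional node label and optional relationship properties could narrow the edge match, and how leaving both out falls back to matching every edge.

```python
from typing import Any, Dict, Optional, Tuple


def build_edge_query(
    node_label: Optional[str] = None,
    rel_properties: Optional[Dict[str, Any]] = None,
) -> Tuple[str, Dict[str, Any]]:
    """Hypothetical helper: build a Cypher edge query, optionally limited to a subgraph.

    Returns the query string and a dict of query parameters.
    """
    # Labels and property names cannot be passed as Cypher parameters,
    # so reject anything that is not a plain identifier before interpolating.
    names = list(rel_properties or {})
    if node_label is not None:
        names.append(node_label)
    for name in names:
        if not name.isidentifier():
            raise ValueError(f"Unsafe identifier: {name!r}")

    # With no label, this matches edges between nodes of any label.
    label = f":{node_label}" if node_label else ""
    query = f"MATCH (n{label})-[r]->(m{label})"

    # Property values are passed as parameters rather than inlined.
    params: Dict[str, Any] = {}
    if rel_properties:
        conditions = []
        for i, (key, value) in enumerate(sorted(rel_properties.items())):
            conditions.append(f"r.{key} = $p{i}")
            params[f"p{i}"] = value
        query += " WHERE " + " AND ".join(conditions)

    return query + " RETURN n, r, m", params
```

Calling `build_edge_query()` with no arguments yields an unfiltered match over all edges, so existing callers would see no change in behaviour, while something like `build_edge_query("Paper", {"year": 2020})` (label and property made up for illustration) restricts the result to edges between `Paper` nodes with `r.year = 2020`.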

Is there a technical or conceptual reason the existing Translator class doesn't support optionally fetching a subgraph? My solution is specific to my use case and, as it stands, definitely isn't ready for a PR. But if there is interest in this kind of functionality, I'd be happy to contribute.

antejavor commented 3 days ago

Hi @mdkozlowski, thanks for bringing this to our attention. 🚀

There is no technical reason the Translator couldn't support this as an option. It has been more a matter of time, effort, and priority.

Since this is possible, we would gladly accept your contribution. Once you have something more general, feel free to open a PR so the whole community can benefit from it.