Open cmungall opened 2 years ago
See https://github.com/cmungall/relation-graph-py
Note it may not be necessary to wrap rdftab using PyO3; we can use any RDF library (we don't use the stanza field from rdftab)
Consider instead: https://github.com/balhoff/whelk-rs
@hrshdhgd @cmungall I was trying to get semsql to work today in order to troubleshoot some issues I'm having with trying to use SqlImplementation in OAK.
I had a lot of problems with version 0.1.7 of semsql, so I installed the latest version, 0.2.0, but now I'm getting this error: /bin/sh: relation-graph: command not found
For now, should I continue using semsql==0.1.* (resolves to 0.1.7)?
I think this is intentional on OAK's end because of the above error, but I just wanted to let you know that this came up as well:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
oaklib 0.1.34 requires semsql<0.2.0,>=0.1.6, but you have semsql 0.2.0 which is incompatible.
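The "command not found" message means the shell could not find a relation-graph executable on PATH. A minimal sketch of a preflight check that a build script could run before invoking the tool is shown here; the function names find_tool and require_tool are hypothetical and are not part of semsql or OAK.

```python
import shutil
from typing import Optional


def find_tool(command: str = "relation-graph") -> Optional[str]:
    """Return the full path of `command`, or None if it is not on PATH."""
    return shutil.which(command)


def require_tool(command: str = "relation-graph") -> str:
    """Fail early with a readable message instead of a shell error."""
    path = find_tool(command)
    if path is None:
        raise RuntimeError(
            f"'{command}' was not found on PATH; install it or adjust PATH "
            "before running the build."
        )
    return path
```

Calling require_tool() at the start of a build surfaces the missing dependency before any long-running step begins.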
Hi @joeflack4 - always use the latest version. If you are having issues with RG, file an issue here: https://github.com/balhoff/relation-graph/issues.
Did you get your issue resolved?
Using PyO3 for RDFTab is certainly possible, but I wasn't planning to do it because I'll be using LDTab going forward. We've used PyO3 for valve.py and wiring.py and are working on using it for LDTab (using horned-owl). We're happy to share our experience.
For this purpose, I think you're probably better off just porting RDFTab to Python.
@jamesaoverton - that makes sense.
the speed of rdflib is the main issue. even though we get very fast access once we have built the sqlite db, there are still cases where latency in the build is an issue. but certainly having this as an option seems reasonable.
I'm figuring medium term python bindings to horned-owl will solve a lot of use cases...
Please do be advised that you will encounter the following complex issues:
Do take these things into account while designing your build and deploy process. It took quite a while for us to figure out how to do this for Ensmallen.
Just linking the Slack thread that Chris opened: https://obo-communitygroup.slack.com/archives/C03D93DEALA/p1661527315827469
I agree with @LucaCappelletti94: Getting PyO3 to work has been the easy part, and cross-compiling binaries for packaging has been much trickier. With a lot of effort we have a workflow to compile for major architectures and push to PyPI using GitHub Actions. This has been tested but is not yet in production: https://github.com/ontodev/valve.py/blob/valve_rs_python_bindings/.github/workflows/build-and-publish-wheels.yml
Suggestions for improvements are welcome.
I have an experimental replacement for rdftab.rs:
https://github.com/INCATools/rdf-sql-bulkloader
this doesn't do any rust binding itself, it relies on https://github.com/ozekik/lightrdf for that part. If this is fruitful, we may want to coordinate with the devs of this to make sure they have best practice for releasing wheels etc
I am still doing perf tests (https://github.com/INCATools/rdf-sql-bulkloader/issues/1)
UPDATE: the bulkloader now uses pyoxigraph, which seems better supported
I added a general discussion for Rust dependencies in OAK here:
https://github.com/INCATools/ontology-access-kit/discussions/247
An alternative to wrapping rdftab is to load the statements table directly in Python. This will be slower, but it should be very straightforward if we skip loading the stanza field, which we don't use. It will also have the advantage that we don't need to do transformations to RDF/XML using riot or robot.
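A minimal sketch of what loading the statements table directly could look like, using only the standard library. The column layout (stanza, subject, predicate, object, value, datatype, language) is an assumption modelled on rdftab's output, and load_statements is a hypothetical helper. It expects rows already produced by whatever RDF parser is chosen, and it leaves stanza as NULL.

```python
import sqlite3
from typing import Iterable, Optional, Tuple

# (subject, predicate, object, value, datatype, language)
# `object` is set for IRI/blank-node objects; `value` is set for literals.
Row = Tuple[str, str, Optional[str], Optional[str], Optional[str], Optional[str]]

# Assumed schema, modelled on rdftab's statements table.
DDL = """
CREATE TABLE IF NOT EXISTS statements (
    stanza TEXT,
    subject TEXT,
    predicate TEXT,
    object TEXT,
    value TEXT,
    datatype TEXT,
    language TEXT
)
"""

INSERT = """
INSERT INTO statements (subject, predicate, object, value, datatype, language)
VALUES (?, ?, ?, ?, ?, ?)
"""


def load_statements(
    conn: sqlite3.Connection, rows: Iterable[Row], batch_size: int = 10_000
) -> int:
    """Bulk-insert rows in batches, skipping the stanza column; return the count."""
    conn.execute(DDL)
    cur = conn.cursor()
    batch = []
    total = 0
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            cur.executemany(INSERT, batch)
            total += len(batch)
            batch = []
    if batch:  # flush the final partial batch
        cur.executemany(INSERT, batch)
        total += len(batch)
    conn.commit()
    return total
```

Because the helper only consumes an iterable of tuples, the parsing step can be swapped between RDF libraries without touching the SQL side.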