INCATools / ontology-development-kit

Bootstrap an OBO Library ontology
http://incatools.github.io/ontology-development-kit/
BSD 3-Clause "New" or "Revised" License

Ensure that update-constraints.sh updates to the latest versions of certain tools #895

Closed (matentzn closed 5 months ago)

matentzn commented 1 year ago

Right now, https://github.com/INCATools/ontology-development-kit/blob/master/update-constraints.sh fails to update to the latest SSSOM version for some reason. It would be good to make it possible to force the latest versions of certain tools, maybe the ones listed in https://github.com/INCATools/ontology-development-kit/blob/master/requirements.txt.lite.

But for now:

Why is the pipeline unable to migrate to sssom?

Here is the log. It is trying, but somehow it's not succeeding: https://github.com/INCATools/ontology-development-kit/actions/runs/5529593049/jobs/10087870510 (maybe pandas dependency?)

gouttegd commented 1 year ago

kgx requires Pandas < 2.0.0 but the latest versions of sssom require Pandas >= 2.0.2.
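To make the conflict concrete, here is a minimal sketch in plain Python (no pip internals involved) that checks a handful of pandas versions against both requirements quoted above. The helper names and the candidate list are illustrative assumptions, not part of the ODK or of pip.

```python
def parse(version):
    """Turn '2.0.2' into (2, 0, 2) so that versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def kgx_allows(pandas_version):
    # Requirement quoted in the thread: kgx needs pandas < 2.0.0
    return parse(pandas_version) < (2, 0, 0)


def sssom_allows(pandas_version):
    # Requirement quoted in the thread: recent sssom needs pandas >= 2.0.2
    return parse(pandas_version) >= (2, 0, 2)


# A few pandas versions picked for illustration only.
candidates = ["1.5.3", "2.0.0", "2.0.2", "2.0.3"]

# A version is usable only if both packages accept it.
compatible = [v for v in candidates if kgx_allows(v) and sssom_allows(v)]
print(compatible)  # [] : the two ranges do not overlap
```

Since no pandas version can be both below 2.0.0 and at or above 2.0.2, a resolver that must install kgx and sssom together can only fall back to an older sssom.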

hrshdhgd commented 1 year ago

https://github.com/INCATools/ontology-development-kit/actions/runs/5529593049/jobs/10087870510 seems to have passed. Is this still an issue?

gouttegd commented 1 year ago

It passed, but it didn’t pick the latest version of sssom – that’s what Nico is complaining about.

But there’s nothing we can do about it in the ODK itself.

Either:

Until either one of those things happens, the ODK will be stuck with sssom 0.3.32, the last version which didn’t require pandas >= 2.0.

hrshdhgd commented 1 year ago

Made a PR in KGX : https://github.com/biolink/kgx/pull/462

matentzn commented 1 year ago

Can you chase them to make a release as well? I guess after the release, this should be solved, hopefully :D

gouttegd commented 1 year ago

We’ve had a discussion about this on Slack before, but it’s worth having it again in a place where it won’t disappear from view after 3 months:

In this instance, we were lucky that the conflict was easily fixable, because kgx is active and apparently Pandas 2.0 didn’t introduce enough breaking changes that kgx could not allow using it.

But with 43 Python packages that we directly use, resulting in a total (for now) of 302 packages (all dependencies included), we won’t always be that lucky. Sooner or later we’ll run into a version conflict that won’t be easily fixable.

When that happens, either we’ll need to accept that we can’t always stay on the bleeding edge and that some packages must stay a few versions behind at least until the conflict is fixed upstream, or we’ll need to put in place the infrastructure to break down the Python packages we use into several independent “silos” so that the dependencies in one silo don’t break the dependencies in another silo.

matentzn commented 1 year ago

You mean like having "runner" scripts for tools like sssom that always activate an environment before running? I guess that would be possible!
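A rough sketch of what such a runner could look like, assuming each silo is an ordinary virtual environment with a POSIX-style `bin` directory. The `/opt/silos/sssom` path and the function names are hypothetical; nothing like this exists in the ODK as far as this thread shows.

```python
import os
import subprocess


def silo_command(silo_dir, tool, args):
    """Build the command line for `tool` as installed in the silo at `silo_dir`.

    Calling the executable inside the environment's bin directory uses that
    environment's packages, so no global "activate" step is needed.
    """
    executable = os.path.join(silo_dir, "bin", tool)
    return [executable, *args]


def run_in_silo(silo_dir, tool, args):
    """Run the tool from its silo and return its exit code."""
    return subprocess.run(silo_command(silo_dir, tool, args)).returncode


# Hypothetical layout: one virtual environment per tool under /opt/silos.
command = silo_command("/opt/silos/sssom", "sssom", ["--help"])
print(command)
```

A wrapper named `sssom` on the `PATH` could then simply call `run_in_silo("/opt/silos/sssom", "sssom", sys.argv[1:])`, which keeps the silo invisible to anyone invoking the tool from a shell.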

gouttegd commented 1 year ago

Yes. And yes, that would be possible, there is no doubt about that.

The problems with such siloed environments would be that:

In the case of sssom-py for example, that would allow users to always get the latest sssom tool, if they need to invoke that tool somewhere in their workflows. But imagine that, instead of calling the sssom tool, I have a custom Python script that imports the sssom-py package (maybe because I want to do something with mappings that the sssom tool does not support, or maybe because I already have a workflow as a large Python script and I’d rather process my mappings from within that script than invoke a shell command like sssom). In order to work with the siloed latest sssom-py, that script would need to know where the siloed environment containing sssom-py is located.

matentzn commented 1 year ago

Oh yes.. Good points!

matentzn commented 5 months ago

I did once contemplate maintaining a list of "dependencies on edge", see https://github.com/INCATools/ontology-development-kit/pull/851, but I will abandon this for now, as the update process seems to have worked well enough recently.