Closed matentzn closed 5 months ago
kgx requires Pandas < 2.0.0 but the latest versions of sssom require Pandas >= 2.0.2.
https://github.com/INCATools/ontology-development-kit/actions/runs/5529593049/jobs/10087870510 seems to have passed. Is this still an issue?
It passed, but it didn’t pick the latest version of sssom – that’s what Nico is complaining about.
But there’s nothing we can do about it in the ODK itself.
Either sssom relaxes its dependency on pandas to accept versions lower than 2.0, or kgx starts supporting pandas >= 2.0.
Until one of those things happens, the ODK will be stuck with sssom 0.3.32, the last version which didn’t require pandas >= 2.0.
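To make the conflict concrete, here is a minimal sketch using only the standard library. The two constraints are the ones quoted above; the list of pandas releases is just an illustrative sample, and the tiny version parser is a simplification that handles plain numeric versions only (a real resolver such as pip's does far more).

```python
import operator

# Comparison operators allowed in the simple constraints handled here.
OPS = {
    "<=": operator.le,
    ">=": operator.ge,
    "==": operator.eq,
    "<": operator.lt,
    ">": operator.gt,
}


def parse_version(text):
    """Turn '2.0.2' into (2, 0, 2). Plain numeric releases only."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version, constraint):
    """Check a single constraint such as '<2.0.0' or '>=2.0.2'."""
    # Two-character operators are tried first so '<=' is not read as '<'.
    for symbol in ("<=", ">=", "==", "<", ">"):
        if constraint.startswith(symbol):
            bound = parse_version(constraint[len(symbol):])
            return OPS[symbol](parse_version(version), bound)
    raise ValueError(f"unsupported constraint: {constraint}")


def compatible(candidates, constraints):
    """Return the candidate versions that satisfy every constraint."""
    return [v for v in candidates if all(satisfies(v, c) for c in constraints)]


# Illustrative sample of pandas releases, not an exhaustive list.
pandas_versions = ["1.5.3", "2.0.0", "2.0.2", "2.0.3"]

kgx_needs = "<2.0.0"         # constraint reported for kgx
new_sssom_needs = ">=2.0.2"  # constraint reported for the latest sssom

# No pandas version satisfies both constraints at once.
print(compatible(pandas_versions, [kgx_needs, new_sssom_needs]))
# With kgx alone, only a pre-2.0 pandas is acceptable.
print(compatible(pandas_versions, [kgx_needs]))
```

Since the two ranges do not overlap, the only way for an installer to keep both packages in one environment is to fall back to an older sssom that predates the pandas >= 2.0 requirement.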
Made a PR in KGX: https://github.com/biolink/kgx/pull/462
Can you chase to make a release as well? I guess after the release, this should be solved, hopefully :D
We’ve had a discussion about this on Slack before, but it’s worth having it again in a place where it won’t disappear from view after 3 months:
In this instance, we were lucky that the conflict was easily fixable, because kgx is active and apparently Pandas 2.0 didn’t introduce enough breaking changes that kgx could not allow using it.
But with 43 Python packages that we directly use, resulting in a total (for now) of 302 packages (all dependencies included), we won’t always be that lucky. Sooner or later we’ll run into a version conflict that won’t be easily fixable.
When that happens, either we’ll need to accept that we can’t always stay on the bleeding edge and that some packages must stay a few versions behind at least until the conflict is fixed upstream, or we’ll need to put in place the infrastructure to break down the Python packages we use into several independent “silos” so that the dependencies in one silo don’t break the dependencies in another silo.
You mean like having "runner" scripts for tools like sssom that always activate an environment before running? I guess that would be possible!
Yes. And yes, that would be possible, there is no doubt about that.
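As a rough sketch of what such a runner could look like: console scripts installed in a virtual environment can be invoked directly by path, without activating the environment first, so a runner mostly needs to know where each silo lives. The `ODK_SILO_ROOT` variable, the `/opt/silos` default and the one-venv-per-tool layout are all hypothetical and not something the ODK currently provides.

```python
import os
import subprocess
from pathlib import Path

# Hypothetical location of the siloed virtual environments (an assumption).
SILO_ROOT = Path(os.environ.get("ODK_SILO_ROOT", "/opt/silos"))


def silo_command(silo, tool, args):
    """Build the command line for a tool installed in a given silo.

    The tool is addressed by its full path inside the silo's venv, so no
    'activate' step is needed before running it.
    """
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    return [str(SILO_ROOT / silo / bin_dir / tool), *args]


def run_in_silo(silo, tool, args):
    """Run the tool and return its exit code."""
    return subprocess.call(silo_command(silo, tool, args))


# Show the command a runner for the sssom tool would execute.
print(silo_command("sssom", "sssom", ["--help"]))
```

A wrapper named `sssom` placed on the `PATH` could then call `run_in_silo("sssom", "sssom", sys.argv[1:])`, so existing workflows would keep invoking the tool the same way.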
The problems with such siloed environments would be that:
In the case of sssom-py for example, that would allow users to always get the latest sssom tool, if they need to invoke that tool somewhere in their workflows. But imagine that, instead of calling the sssom tool, I have a custom Python script that imports the sssom-py package (maybe because I want to do something with mappings that the sssom tool does not support, or maybe because I already have a workflow as a large Python script and I’d rather process my mappings from within that script than invoke a shell command like sssom). In order to work with the siloed latest sssom-py, that script would need to know where the siloed environment containing sssom-py is.
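To illustrate the burden this puts on a custom script, here is a minimal sketch of what it would have to do before it could import from a silo. The silo path is hypothetical, and the `lib/pythonX.Y/site-packages` layout assumes a POSIX-style virtual environment built with the same Python version as the one running the script.

```python
import site
import sys
from pathlib import Path


def silo_site_packages(silo_root):
    """Guess where a POSIX-style venv keeps its installed packages."""
    version = f"python{sys.version_info.major}.{sys.version_info.minor}"
    return Path(silo_root) / "lib" / version / "site-packages"


def use_silo(silo_root):
    """Make the packages of a siloed environment importable in this process."""
    path = silo_site_packages(silo_root)
    if not path.is_dir():
        raise FileNotFoundError(f"no site-packages found at {path}")
    # addsitedir appends the directory to sys.path and processes .pth files.
    site.addsitedir(str(path))
    return path


# A script would need a line like this, with a hard-coded or configured
# location (hypothetical here), before it could import from the silo:
# use_silo("/opt/silos/sssom")
```

Even then, a single Python process can only load one version of pandas, so a script that pulls packages from two silos into the same interpreter can run into the very conflict the silos were meant to avoid.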
Oh yes.. Good points!
I did once contemplate maintaining a list of "dependencies on edge", see https://github.com/INCATools/ontology-development-kit/pull/851, but I will abandon this for now as the update process seems to have worked well enough recently.
Right now, https://github.com/INCATools/ontology-development-kit/blob/master/update-constraints.sh fails to update to the latest SSSOM version for some reason. It would be good to make it possible to force the latest versions of certain tools, maybe the ones listed in https://github.com/INCATools/ontology-development-kit/blob/master/requirements.txt.lite.
But for now:
Why is the pipeline unable to migrate to sssom?
Here is the log. It is trying, but somehow it's not succeeding: https://github.com/INCATools/ontology-development-kit/actions/runs/5529593049/jobs/10087870510 (maybe the pandas dependency?)
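One way to check the pandas suspicion from inside the environment is to list which installed packages declare a requirement on pandas, and with what constraint. The sketch below uses only the standard library's `importlib.metadata`; the helper names are made up for illustration, and the requirement-name parsing is deliberately simplistic.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a project name the way package indexes do (PEP 503)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement):
    """Extract the project name from a requirement string like 'pandas>=2.0.2'."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return normalize(match.group(1)) if match else ""


def who_requires(package):
    """Map each installed distribution to its declared requirements on `package`."""
    wanted = normalize(package)
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"] or "unknown"
        for req in dist.requires or []:
            if requirement_name(req) == wanted:
                found.setdefault(name, []).append(req)
    return found


# Print every installed package that constrains pandas, with its constraint.
for name, reqs in sorted(who_requires("pandas").items()):
    print(name, reqs)
```

If the output shows one package requiring pandas below 2.0 next to another requiring 2.0.2 or later, that would explain why the update cannot move to the newest sssom.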