Open joeflack4 opened 5 months ago
@matentzn @twhetzel FYI
Set this to low priority, though perhaps it should be medium.
If this happens in the latest ODK this is a high priority issue.
It does. High it is~
@joeflack4 - in the initial post it says "Right now we are stuck to v1.4.3". What do you mean "stuck" to v1.4.3?
This does not look like a problem with any code in this repository nor like a problem with the ODK.
It looks more like a bug in sssom-py
. When converting to OWL, it has to invert the mappings for some reason, and the inversion fails because sssom-py
seems to expect that the inverted dataset should contain a subject_label
column -- which is not the case here since the original dataset contains only a subject_label
column but not a object_label
column, so the inverted dataset conversely contains only a object_label
column but not a subject_column
.
I can replicate the issue with the latest sssom-py
code. Any attempt to convert to OWL a mapping set that does not have a object_label
column yields the same error as reported in this ticket.
Definitely a sssom-py
bug. @matentzn
A possible workaround until the bug is fixed upstream (and a new version of sssom-py
lands in the ODK) is to use the SSSOM plugin for ROBOT to do the OWL conversion:
robot sssom:inject --create --sssom mondo_exactmatch_omimps.sssom.tsv --direct --output mondo_exactmatch_omimps.sssom.owl
Be advised that it produces a slightly different OWL product than sssom-py convert
(because OWL serialisation of SSSOM is poorly specified): notably, it does not convert the mapping set metadata (things like mapping_set_id
).
Thanks for your help again @gouttegd! Yes, I agree with your assessment.
@twhetzel By stuck, I just mean that as a consequence of this not working with the latest version of ODK, we have to use an older version, v1.4.3
being that version.
Thanks all! Is there an issue for this bug in sssom-py
?
There is now. :) https://github.com/mapping-commons/sssom-py/issues/554
@gouttegd Thanks so much:
@matentzn @twhetzel I suppose now I'll upgrade the GH action and fix it to the current latest version and make sure it works.
@joeflack4 The bug is fixed in sssom-py
, but the ODK has not been updated (yet) to include the latest version of sssom-py
!
We will update the master branch of the ODK early next week (this will in fact be done automatically). Then, we will probably make a ODK 1.5.3 release. That’s when you’ll be able to fix your issue by upgrading to the latest ODK. :)
@joeflack4 I don't see a reason to update the GH Action at this time vs. updating to a new version of ODK when that is released.
FYI, version 1.5.3 of the ODK, which includes the fix to the sssom
Python package, has been released today.
@gouttegd Appreciate that! I'll give it a whirl momentarily.
Note to self to work on this later. The best fix may indeed just be adding a flag or manipulating the command.
Not sure why this error didn't appear in older versions of ODK, but currently getting:
Run make install
make install
shell: sh -e {0}
pip install -r requirements-unlocked.txt
WARNING: The directory '/github/home/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.
error: externally-managed-environment
× This environment is externally managed
╰─> To install Python packages system-wide, try apt install
python3-xyz, where xyz is the package you are trying to
install.
If you wish to install a non-Debian-packaged Python package,
create a virtual environment using python3 -m venv path/to/venv.
Then use path/to/venv/bin/python and path/to/venv/bin/pip. Make
sure you have python3-full installed.
If you wish to install a non-Debian packaged Python application,
it may be easiest to use pipx install xyz, which will manage a
virtual environment for you. Make sure you have pipx installed.
See /usr/share/doc/python3.1[2](https://github.com/monarch-initiative/omim/actions/runs/11646677953/job/32431010170#step:4:2)/README.venv for more information.
note: If you believe this is a mistake, please contact your Python installation or OS distribution provider. You can override this, at the risk of breaking your Python installation or OS, by passing --break-system-packages.
hint: See PEP 668 for the detailed specification.
make: *** [makefile:68: install] Error 1
pip install xyz --break-system-packages
on newer ODKs as with later pythons you cant just pip install into the global environment anymore
In the specific case of this repository (omim) the fix is easy: looking at your requirements-unlocked.txt file, the only Python package you are trying to install is python-dotenv
. That package is provided in ODK 1.5, so that install
step of yours is now completely unnecessary.
on newer ODKs as with later pythons you cant just pip install into the global environment anymore
To further explain:
Compared to ODK 1.4.x, two things have changed with the later ODK versions:
/usr/lib/python/...
.pip
by defaults flatly refuses (even if you do have administrative privilege) to install anything in the system directories — only the distribution packaging tools (e.g. apt
and the like) are allowed to do that, which means that the only packages you can install in the global (system-wide) environment are those that are provided by the distribution in the form of DEB packages.So if you do need Python packages that are not already provided in the ODK nowadays, what can you do?
(A) The fastest fix is to call pip
with --user --break-system-packages
; --user
to install the packages “user-wide” rather than “system-wide”, and --break-system-packages
to force pip
to install packages in the global environment (yes, this is needed even with --user
; the “global environment” comprises both the ”system” packages installed in /usr/lib/python/...
and the “user” packages installed in ~/.local/lib/python/...
, so by default pip
will even refuse to install anything in your own user’s directory with the --break-system-packages
option).
(B) Create a virtual environment and install packages inside that environment. That’s the Python recommended way, though it is completely ridiculous and overkill in the context of the ODK (which is already a virtual environment).
(C) Bribe ODK developers to convince them to add the packages you need in the next version of the ODK. :)
(D) See the light and realize your life would be much easier if you didn’t have to deal with anything related to Python, and switch to another language. :D
That package is provided in ODK 1.5, so that install step of yours is now completely unnecessary.
I thought of that, but for purposes of simplicity / future-proofing, as codebase changes and more packages may be added, ideally I just leave that step there and never think about touching it.
@matentzn I remember you having to use --break-system-packages
before, and I did notice it in the error messaging. Good to know that this should work.
@gouttegd Thanks for the background explanation. I figured something like that must have happened. I'll be sure to use --user --break-system-packages
!
Overall, if a new package is needed critically I'd go with (A) and also (C) to make requests to have these included in ODK.
Anything needed that is not part of ODK should also be added into the README (just saying for completeness).
Overview
As of now, I'm getting errors if I run on
obolibrary/odkfull:
latest
ordev
. Right now we are stuck tov1.4.3
.Sub-tasks
latest
latest
run.sh
, or perhaps there it might indeed be best to usedev
orlatest
Error details
Err:
KeyError: "['subject_label'] not in index"
Happens when: creatingmondo_exactmatch_omimps.sssom.owl
Details
``` sh run.sh make mondo_genes.robot.tsv; beep sssom convert mondo_exactmatch_omimps.sssom.tsv -O owl -o mondo_exactmatch_omimps.sssom.owl /usr/local/lib/python3.10/dist-packages/sssom/util.py:168: FutureWarning: Downcasting behavior in `replace` is deprecated and will be removed in a future version. To retain the old behavior, explicitly call `result.infer_objects(copy=False)`. To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)` df.replace("", np.nan, inplace=True) Traceback (most recent call last): File "/usr/local/bin/sssom", line 8, in
sys.exit(main())
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/sssom/cli.py", line 161, in convert
convert_file(input_path=input, output=output, output_format=output_format)
File "/usr/local/lib/python3.10/dist-packages/sssom/io.py", line 50, in convert_file
write_func(doc, output, serialisation=fileformat) # type:ignore
File "/usr/local/lib/python3.10/dist-packages/sssom/writers.py", line 162, in write_owl
graph = to_owl_graph(msdf)
File "/usr/local/lib/python3.10/dist-packages/sssom/writers.py", line 173, in to_owl_graph
msdf.df = invert_mappings(
File "/usr/local/lib/python3.10/dist-packages/sssom/util.py", line 1462, in invert_mappings
inverted_df = inverted_df[df.columns]
File "/usr/local/lib/python3.10/dist-packages/pandas/core/frame.py", line 4108, in __getitem__
indexer = self.columns._get_indexer_strict(key, "columns")[1]
File "/usr/local/lib/python3.10/dist-packages/pandas/core/indexes/base.py", line 6200, in _get_indexer_strict
self._raise_if_missing(keyarr, indexer, axis_name)
File "/usr/local/lib/python3.10/dist-packages/pandas/core/indexes/base.py", line 6252, in _raise_if_missing
raise KeyError(f"{not_found} not in index")
KeyError: "['subject_label'] not in index"
make: *** [makefile:23: mondo_exactmatch_omimps.sssom.owl] Error 1
```