Closed jonasscheid closed 8 months ago
It was decided in #82 not to bundle a copy of Unimod with Pyteomics. Instead, libraries that need Unimod to be available even when the Unimod web server is down can bundle a specific version with their code and use pyteomics.proforma.set_unimod_path to resolve Unimod.
Relevant piece from the documentation: https://pyteomics.readthedocs.io/en/latest/api/proforma.html#cv-disk-caching
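For downstream libraries, the approach described above might look roughly like the sketch below. The helper name configure_bundled_unimod and the idea of passing in the location of a bundled file are assumptions made for illustration; only pyteomics.proforma.set_unimod_path comes from this thread, and the file format it expects should be checked against the documentation linked above.

```python
from pathlib import Path


def configure_bundled_unimod(unimod_file):
    """Point Pyteomics at a locally bundled Unimod copy, if possible.

    `unimod_file` is a hypothetical path to a Unimod copy shipped with the
    downstream package. Returns the configured path, or None if the file is
    missing or Pyteomics is not installed.
    """
    unimod_file = Path(unimod_file)
    if not unimod_file.is_file():
        return None  # nothing bundled; leave the default behavior untouched
    try:
        from pyteomics import proforma
    except ImportError:
        return None  # Pyteomics is not installed in this environment
    # The only Pyteomics call used here; it is named in the comment above.
    proforma.set_unimod_path(str(unimod_file))
    return unimod_file
```

A library would call this once at import time, before parsing any ProForma strings that reference Unimod.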
Thank you @mobiusklein for the recap. Indeed, I was hesitant to include a copy of Unimod and increase the distribution size 10-fold, even though it's still not a lot.
I am still open to ideas though. I realize that the current solution puts some extra load on the users and it doesn't seem ideal. To me, an ideal solution would be to optionally install a fallback copy at installation time. However, it's not immediately clear to me how to do it cleanly in a way compatible with modern Python packaging tools.
One option would be to depend upon the caching and fallback behavior in psims when it is available, which would be an install-time option already.
Another would be to write a separate caching mechanism in pyteomics whereby a backup copy is downloaded and stored the first time Unimod is used, then updated periodically when a check of the version tag/digest shows that the version has changed. This could be done in around 300 lines or fewer, depending upon whether we aim for full XDG compatibility or just default to ~/.pyteomics_cache or something similar (or copy https://github.com/matplotlib/matplotlib/blob/main/lib/matplotlib/__init__.py#L510-L637).
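To make the second option more concrete, here is a minimal sketch of a digest-checked backup copy. It is not an existing Pyteomics feature: the function names, the cache location, and the fetch callable are all hypothetical. It is also simplified, since it downloads the full file on every call and compares SHA-256 digests locally rather than checking a remote version tag.

```python
import hashlib
import os
from pathlib import Path


def cache_dir(app="pyteomics"):
    """Return $XDG_CACHE_HOME/<app> if that variable is set, else ~/.cache/<app>."""
    base = os.environ.get("XDG_CACHE_HOME") or os.path.join(
        os.path.expanduser("~"), ".cache"
    )
    return Path(base) / app


def load_with_backup(fetch, cache_file):
    """Return (data, source), where source is "remote" or "cache".

    `fetch` is a zero-argument callable returning the remote file as bytes.
    It is expected to raise OSError (which includes urllib's URLError)
    when the server is unreachable.
    """
    cache_file = Path(cache_file)
    try:
        data = fetch()
    except OSError:
        if cache_file.is_file():
            # Server is down: fall back to the stored backup copy.
            return cache_file.read_bytes(), "cache"
        raise  # no backup yet, so the original error is the best we can do
    new_digest = hashlib.sha256(data).hexdigest()
    old_digest = (
        hashlib.sha256(cache_file.read_bytes()).hexdigest()
        if cache_file.is_file()
        else None
    )
    if new_digest != old_digest:
        # First use, or the remote content changed: refresh the backup.
        cache_file.parent.mkdir(parents=True, exist_ok=True)
        cache_file.write_bytes(data)
    return data, "remote"
```

In practice fetch would wrap something like urllib.request.urlopen against the Unimod URL, and cache_file would live under cache_dir(). A real implementation would also need to handle concurrent writers and partially written files, which is part of why this is a pain for a library to own.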
I think going the psims route is better because A) it's already there so there's no duplication of files or effort, B) it automatically uses the same caching that all the other CVs in pyteomics.proforma do, and C) managing config files and caches is a huge pain for a library.
Some applications might need to keep all their data in one place for easy deletion on uninstall, or they need to keep the reference file consistent so the program doesn't abruptly break because of a remote change.
@mobiusklein thank you for bearing with me. I looked at the psims CV code again and I want to make sure I understand how it works.
As far as I can tell, there are two different mechanisms in there, caching and fallback. While caching needs to be enabled and configured at runtime, fallback to bundled versions is always available and psims covers Unimod fallbacks seamlessly.
If that is the case and we can add a psims Unimod resolver to the current functionality, that would be absolutely awesome: with psims available, resolution would work even offline and without a local cache from a prior download, while ProForma parsing (with Unimod refs) would still be possible without necessarily installing psims dependencies.
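The resolver idea above amounts to trying several Unimod sources in order and using the first one that works. A rough sketch of that pattern follows; resolve_with_fallbacks and the loader names in the usage note are hypothetical and do not reflect the actual Pyteomics or psims code.

```python
def resolve_with_fallbacks(loaders):
    """Try each (name, loader) pair in order and return (name, result).

    Each loader is a zero-argument callable. Any exception (network error,
    missing optional dependency, parse failure) means "try the next one".
    """
    errors = []
    for name, loader in loaders:
        try:
            return name, loader()
        except Exception as exc:  # deliberately broad: every failure falls through
            errors.append(f"{name}: {exc!r}")
    # Nothing worked; report every failure so the cause is not hidden.
    raise LookupError("no Unimod source available: " + "; ".join(errors))
```

With this shape, the chain could be something like [("remote", load_remote), ("psims", load_from_psims)], where the second loader raises ImportError if psims is not installed. Users without psims then keep the current behavior, and users with it gain an offline fallback.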
Thank you! 🙏🏼
Thanks for the great tool!
I ran into this issue today.
Since Unimod is down, this error is thrown. This impacts lots of tools built on pyteomics (deeplc, ms2pip, etc.). Is there a way to implement a fallback option if Unimod is not accessible? Thanks!