Open Carovanandel opened 7 months ago
I am also currently running into this issue! I keep getting the error:
FileNotFoundError: [Errno 2] No such file or directory: '/home/arcas-hla-0.5.0-1/scripts/../dat/IMGTHLA/hla.dat'
even though hla.dat.zip exists.
I encountered the same issue. A workaround is to download the IMGTHLA database as usual, then manually unzip the zipped files under dat/IMGTHLA/. Replace reference.py in scripts/ with the attached script, then run 'arcasHLA reference --update_static' to write the necessary files for the analysis. reference.zip
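For anyone who prefers to script the manual unzip step, here is a minimal sketch in Python. The function name `unzip_in_place` is hypothetical and not part of arcasHLA. It assumes each archive (for example hla.dat.zip) stores its file at the top level of the zip, so extracting next to the archive yields dat/IMGTHLA/hla.dat.

```python
import zipfile
from pathlib import Path


def unzip_in_place(root):
    """Extract every .zip archive found under `root` next to the archive.

    Hypothetical helper, not part of arcasHLA. Assumes each archive holds
    its file(s) at the top level, e.g. hla.dat.zip -> hla.dat.
    """
    extracted = []
    for archive in sorted(Path(root).rglob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            # Extract into the directory that contains the archive.
            zf.extractall(archive.parent)
            extracted.extend(archive.parent / name for name in zf.namelist())
    return extracted
```

Calling `unzip_in_place("dat/IMGTHLA")` from the arcasHLA directory should leave the unzipped files beside the archives. The replacement reference.py and the --update_static step described above are still needed afterwards.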
Hi,
From version 3.56.0 onwards, the IMGT/HLA reference provides large files as zip archives, as stated on the IMGT/HLA GitHub page:
As of Release 3.56.0, due April 2024, all large files (>100MB) will be provided as compressed files rather than utilise Git LFS, which was previously required. This includes the hla.dat, xml/hla.xml and xml/hla_ambigs.xml in the next release. This has been done to simplify the cloning process and also due to escalating and unpredictable costs in providing the files using Git LFS from a public repository. All compressed files will use the [ZIP format](https://en.wikipedia.org/wiki/ZIP_(file_format)). This formatting change will be applied to all branches.
This breaks arcasHLA, because files such as hla.dat can no longer be found when they exist only as zip archives. Using IMGT/HLA versions up to and including 3.55.0 seems to work fine. I have created a pull request that adds the missing IMGT/HLA versions 3.47.0-3.56.0 to the reference list in parameters.json, so that the arcasHLA reference --version command accepts them. However, from 3.56.0 onwards it no longer works. Could you update your code to handle the zipped files?
Thanks in advance!
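To illustrate the kind of change being requested, here is a rough sketch of a fallback reader in Python. It is not taken from arcasHLA: the name `open_text_maybe_zipped` is hypothetical, and it assumes that the archive is named `<file>.zip`, that it contains a member named exactly like the file, and that the content is UTF-8 text.

```python
import io
import zipfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def open_text_maybe_zipped(path, encoding="utf-8"):
    """Yield a text handle for `path`, falling back to `<path>.zip`.

    Hypothetical helper, not arcasHLA code. Assumes the archive contains a
    member with the same name as the missing file (hla.dat in hla.dat.zip).
    """
    path = Path(path)
    if path.exists():
        # The plain file is present (releases up to 3.55.0, or unzipped by hand).
        with open(path, "r", encoding=encoding) as handle:
            yield handle
        return
    archive = path.with_name(path.name + ".zip")
    if not archive.exists():
        raise FileNotFoundError(f"neither {path} nor {archive} exists")
    with zipfile.ZipFile(archive) as zf:
        # Stream the member directly from the archive without extracting it.
        with io.TextIOWrapper(zf.open(path.name), encoding=encoding) as handle:
            yield handle
```

With something like this, code that currently opens hla.dat directly could use `with open_text_maybe_zipped(hla_dat_path) as fh:` instead and read line by line in both cases, where `hla_dat_path` stands for whatever path the script already builds.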