RabadanLab / arcasHLA

Fast and accurate in silico inference of HLA genotypes from RNA-seq
GNU General Public License v3.0

[IndexError] arcasHLA `reference` -- initial set up #116

Closed (jinhys closed 7 months ago)

jinhys commented 8 months ago

Hi,

I'm getting an IndexError when initializing arcasHLA with the reference version setup recommended in the README, as shown below. Could you please share a solution for this issue?

command

./arcasHLA reference --version 3.24.0 -v

error message

[reference] Processing IMGT/HLA database

Traceback (most recent call last):
  File "/home/user/tool/miniconda3/share/arcas-hla-0.5.0-1/scripts/reference.py", line 580, in <module>
    build_fasta()
  File "/home/user/tool/miniconda3/share/arcas-hla-0.5.0-1/scripts/reference.py", line 432, in build_fasta
    utrs, exons, final_exon_length) = process_hla_dat()
  File "/home/user/tool/miniconda3/share/arcas-hla-0.5.0-1/scripts/reference.py", line 223, in process_hla_dat
    length = get_mode(lengths)
  File "/home/user/tool/miniconda3/share/arcas-hla-0.5.0-1/scripts/reference.py", line 77, in get_mode
    return stats.mode(lengths)[0][0]

IndexError: invalid index to scalar variable.

Thanks in advance for your help!

ahmadalajami commented 7 months ago

I am experiencing the same. Have you figured it out? @abuendia

jinhys commented 7 months ago

@ahmadalajami Unfortunately, not yet - I also tried re-installing this tool several times using different approaches, but it doesn't seem to work in my case. I was wondering whether we have to use the Docker environment.

jinhys commented 7 months ago

@abuendia Hi Alejandro, could you please take a closer look at this issue? And is there any possibility that this tool can be installed using Conda?

Thanks!

aniketbroad2604 commented 7 months ago

Getting the same error.

AGImkeller commented 7 months ago

@roseorenbuch @gitliver @abuendia Looks like something in the HLA reference has changed. Is there someone from your team who can fix it? For example, by reverting to an earlier version of https://github.com/ANHIG/IMGTHLA?

abuendia commented 7 months ago

Acknowledged. Will push a fix within the next day.

jinhys commented 7 months ago

@abuendia Greatly appreciated! I'm looking forward to running this tool soon :)

abuendia commented 7 months ago

Hi @jinhys @ahmadalajami @aniketbroad2604 @AGImkeller

Fixed in #120 and #121. Please create the conda env specified in the environment.yml using the instructions here.

This pins the correct version of scipy to avoid the bug mentioned. You can also run the tool using the updated Dockerfile which builds the same conda env.
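Note for readers on why the SciPy version matters: the traceback fails at stats.mode(lengths)[0][0]. Starting with SciPy 1.11, scipy.stats.mode defaults to keepdims=False, so for 1-D input the returned mode is a scalar rather than a length-1 array, and the second [0] raises "invalid index to scalar variable". The sketch below is not the fix applied in #120 and #121 (which pins SciPy). It is a hypothetical, dependency-free get_mode, assuming lengths is a flat sequence of integers, that breaks ties by returning the smallest value, as older SciPy releases document for stats.mode.

```python
from collections import Counter


def get_mode(lengths):
    """Return the most common value in a flat sequence of lengths.

    Hypothetical stand-in for the helper in reference.py (an assumption,
    not the maintainers' code). Ties are broken by returning the
    smallest of the most frequent values.
    """
    if len(lengths) == 0:
        raise ValueError("lengths must not be empty")
    counts = Counter(lengths)
    highest = max(counts.values())
    return min(value for value, count in counts.items() if count == highest)


if __name__ == "__main__":
    # Made-up example lengths, for illustration only.
    print(get_mode([270, 276, 276, 270, 276]))
```

Because this version never indexes into a SciPy result, it behaves the same regardless of which SciPy release is installed.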

jinhys commented 7 months ago

Thank you so much @abuendia for the great news! I will try re-installing arcasHLA using the conda environment and let you know if it works successfully! Many thanks!! :)

jinhys commented 7 months ago

@abuendia I successfully re-installed the tool; however, I'm still getting the same IndexError when I run the same command as above. I've included the command and output again for your reference:

arcasHLA reference --version 3.24.0

error message from running this step:

[reference] Processing IMGT/HLA database
Traceback (most recent call last):
  File "/home/user/tool/miniconda3/share/arcas-hla-0.5.0-1/scripts/reference.py", line 580, in <module>
    build_fasta()
  File "/home/user/tool/miniconda3/share/arcas-hla-0.5.0-1/scripts/reference.py", line 432, in build_fasta
    utrs, exons, final_exon_length) = process_hla_dat()
  File "/home/user/tool/miniconda3/share/arcas-hla-0.5.0-1/scripts/reference.py", line 223, in process_hla_dat
    length = get_mode(lengths)
  File "/home/user/tool/miniconda3/share/arcas-hla-0.5.0-1/scripts/reference.py", line 77, in get_mode
    return stats.mode(lengths)[0][0]
IndexError: invalid index to scalar variable.

I also tried these two commands to rebuild and/or update the reference, but they did not work:

arcasHLA reference --rebuild -v
arcasHLA reference --update -v

error message from running arcasHLA reference --rebuild:

Traceback (most recent call last):
  File "/home/user/tool/miniconda3/share/arcas-hla-0.5.0-1/scripts/reference.py", line 574, in <module>
    build_fasta()
  File "/home/user/tool/miniconda3/share/arcas-hla-0.5.0-1/scripts/reference.py", line 432, in build_fasta
    utrs, exons, final_exon_length) = process_hla_dat()
  File "/home/user/tool/miniconda3/share/arcas-hla-0.5.0-1/scripts/reference.py", line 223, in process_hla_dat
    length = get_mode(lengths)
  File "/home/user/tool/miniconda3/share/arcas-hla-0.5.0-1/scripts/reference.py", line 77, in get_mode
    return stats.mode(lengths)[0][0]
IndexError: invalid index to scalar variable.

error message from running arcasHLA reference --update:

[reference] Processing IMGT/HLA database
Traceback (most recent call last):
  File "/home/user/tool/miniconda3/share/arcas-hla-0.5.0-1/scripts/reference.py", line 569, in <module>
    build_fasta()
  File "/home/user/tool/miniconda3/share/arcas-hla-0.5.0-1/scripts/reference.py", line 432, in build_fasta
    utrs, exons, final_exon_length) = process_hla_dat()
  File "/home/user/tool/miniconda3/share/arcas-hla-0.5.0-1/scripts/reference.py", line 223, in process_hla_dat
    length = get_mode(lengths)
  File "/home/user/tool/miniconda3/share/arcas-hla-0.5.0-1/scripts/reference.py", line 77, in get_mode
    return stats.mode(lengths)[0][0]
IndexError: invalid index to scalar variable.

In my case, only arcasHLA extract worked; the next step (i.e., arcasHLA genotype) failed with this FileNotFoundError:

error message from running arcasHLA genotype:

Traceback (most recent call last):
  File "/home/user/tool/miniconda3/share/arcas-hla-0.5.0-1/scripts/genotype.py", line 707, in <module>
    with open(hla_json, 'r') as file:
FileNotFoundError: [Errno 2] No such file or directory: '/home/user/tool/miniconda3/share/arcas-hla-0.5.0-1/scripts/../dat/ref/hla.p.json'

Can you please give some advice on solving these errors? Could they be resolved if we were able to set up the specific reference version (i.e., v3.24.0) mentioned in the arcasHLA README?

Please let me know if I missed any important points on running these commands or if you need further details on these.

Thanks!
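Note for readers: the FileNotFoundError above is most likely a consequence of the failed reference step rather than a separate bug. The traceback shows genotype.py opening dat/ref/hla.p.json, which would not exist if the reference build never completed. This is an inference from the tracebacks, not something stated by the maintainers. A minimal pre-flight check, assuming the directory layout shown in the error path; reference_is_built and install_root are hypothetical names, not part of arcasHLA:

```python
import sys
from pathlib import Path


def reference_is_built(install_root):
    """Return True if the reference JSON that genotype.py opens exists.

    install_root is the directory containing scripts/ and dat/
    (layout assumed from the path in the error message).
    """
    return (Path(install_root) / "dat" / "ref" / "hla.p.json").is_file()


if __name__ == "__main__":
    # Default to the current directory if no install root is given.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    if reference_is_built(root):
        print("reference found")
    else:
        print("reference missing: run the reference step first")
```

Running a check like this before the genotype step separates "the reference was never built" from genuine genotyping errors.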

abuendia commented 7 months ago

Hi @jinhys - Could you please create the conda env through the environment.yml in master? First clone this repo and pull to get the latest changes. In your base conda env, run the following at the root of this repo:

conda env create -f environment.yml
conda activate arcas-hla

Then you can run the scripts within this env, e.g. ./arcasHLA reference --version 3.24.0. Note that conda install arcas-hla -c bioconda points to the last release and should not be used at the moment. I will push a release to bioconda next week with a few other fixes.
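Note for readers: to confirm which SciPy the active environment exposes to reference.py, a quick version check can help. The sketch below assumes the IndexError comes from the scipy.stats.mode default (keepdims=False) that took effect in SciPy 1.11. The exact version pinned in environment.yml is not stated in this thread, so the threshold is an assumption, and mode_returns_scalar is a hypothetical helper name.

```python
import re
from importlib import metadata


def mode_returns_scalar(scipy_version):
    """True if this SciPy version defaults stats.mode to keepdims=False.

    Assumes the default changed in 1.11, per the SciPy release notes.
    """
    match = re.match(r"(\d+)\.(\d+)", scipy_version)
    if match is None:
        raise ValueError(f"unrecognised version string: {scipy_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (1, 11)


if __name__ == "__main__":
    try:
        installed = metadata.version("scipy")
    except metadata.PackageNotFoundError:
        print("scipy is not installed in this environment")
    else:
        print(installed, "-> affected:", mode_returns_scalar(installed))
```

Run it with the arcas-hla env activated; if it reports the environment as affected, the env was probably not created from the updated environment.yml.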

jinhys commented 7 months ago

Hi @abuendia, thanks for your quick response and the helpful tips! I forgot to mention in my earlier reply that I had re-installed arcasHLA using the environment.yml file, as you described.

As you suggested, I re-installed using git clone https://github.com/RabadanLab/arcasHLA.git. And great news! Now I can successfully run these steps as follows:

./arcasHLA reference --version 3.24.0 -v
./arcasHLA extract
./arcasHLA genotype
./arcasHLA partial
./arcasHLA reference --update -v

Since it works well from the GitHub source, it should probably work with conda as well once you push the release next week. If possible, could you please leave another reply when you update the bioconda package?

I appreciate your big help! 👍

abuendia commented 7 months ago

The bioconda update is pending approval by the bioconda team of PR https://github.com/bioconda/bioconda-recipes/pull/45718. I'll close out this issue now, but please feel free to open new ones. Thanks!