Error running against the test data set

williamdlees commented 3 years ago

Hello, I have installed IgDiscover on a fresh installation of Ubuntu, using conda as specified in the documentation. Installation went smoothly with no reported errors. I am now getting the error below when running against the test data set (http://docs.igdiscover.se/en/stable/testing.html#test).

Thanks for your help!

William

igdiscover germlinefilter --whitelist=database/V.fasta --unique-cdr3s=5 --cluster-size=50 --unique-J=3 --cross-mapping-ratio=0.02 --clonotype-ratio=0.12 --exact-ratio=0.12 --cdr3-shared-ratio=0.8 --unique-D-ratio=0.3 --unique-D-threshold=10 --annotate=iteration-01/annotated_V_germline.tab --fasta=iteration-01/new_V_germline.fasta iteration-01/candidates.tab  2> >(tee iteration-01/new_V_germline.log >&2)  > iteration-01/new_V_germline.tab
INFO: 44 unique sequences in whitelist
Traceback (most recent call last):
  File "/root/anaconda3/envs/igdiscover/bin/igdiscover", line 10, in <module>
    sys.exit(main())
  File "/root/anaconda3/envs/igdiscover/lib/python3.8/site-packages/igdiscover/__main__.py", line 95, in main
    to_run()
  File "/root/anaconda3/envs/igdiscover/lib/python3.8/site-packages/igdiscover/__main__.py", line 93, in <lambda>
    to_run = lambda: subcommand(args)
  File "/root/anaconda3/envs/igdiscover/lib/python3.8/site-packages/igdiscover/cli/germlinefilter.py", line 389, in main
    table.insert(i+1, 'closest_whitelist', pd.Series('', index=table.index))
  File "/root/anaconda3/envs/igdiscover/lib/python3.8/site-packages/pandas/core/series.py", line 314, in __init__
    data = sanitize_array(data, index, dtype, copy, raise_cast_failure=True)
  File "/root/anaconda3/envs/igdiscover/lib/python3.8/site-packages/pandas/core/internals/construction.py", line 712, in sanitize_array
    subarr = construct_1d_arraylike_from_scalar(value, len(index), dtype)
  File "/root/anaconda3/envs/igdiscover/lib/python3.8/site-packages/pandas/core/dtypes/cast.py", line 1233, in construct_1d_arraylike_from_scalar
    subarr = np.empty(length, dtype=dtype)
TypeError: Cannot interpret '<attribute 'dtype' of 'numpy.generic' objects>' as a data type
[Mon Feb 15 09:45:59 2021]
Error in rule igdiscover_germlinefilter:
    jobid: 10
    output: iteration-01/new_V_germline.tab, iteration-01/new_V_germline.fasta, iteration-01/annotated_V_germline.tab
    log: iteration-01/new_V_germline.log (check log file(s) for error message)
    shell:
        igdiscover germlinefilter --whitelist=database/V.fasta --unique-cdr3s=5 --cluster-size=50 --unique-J=3 --cross-mapping-ratio=0.02 --clonotype-ratio=0.12 --exact-ratio=0.12 --cdr3-shared-ratio=0.8 --unique-D-ratio=0.3 --unique-D-threshold=10 --annotate=iteration-01/annotated_V_germline.tab --fasta=iteration-01/new_V_germline.fasta iteration-01/candidates.tab  2> >(tee iteration-01/new_V_germline.log >&2)  > iteration-01/new_V_germline.tab
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

williamdlees commented 3 years ago

This seems to be caused by a change in NumPy 1.20.0 (https://github.com/numpy/numpy/issues/18355). Downgrading with the following command fixed it for me:

conda install -c conda-forge numpy=1.19
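
In case it helps anyone else hitting the same traceback, here is a rough sketch of how one might check whether the active environment has a NumPy at or above 1.20. The helper names (version_tuple, installed_version, numpy_is_1_20_or_newer) are my own and are not part of IgDiscover. The 1.20 threshold is simply the version mentioned above; which pandas releases are affected is not something I have verified, so the script only reports the pandas version.

```python
import re
from importlib import metadata  # standard library since Python 3.8


def version_tuple(version):
    """Turn '1.20.0' or '2.0.0rc1' into a tuple of leading integers, e.g. (1, 20, 0)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first component that has no leading digits
        parts.append(int(match.group()))
    return tuple(parts)


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def numpy_is_1_20_or_newer(version):
    """Assumption: 1.20 is the first NumPy release that triggers the error."""
    return version_tuple(version) >= (1, 20)


if __name__ == "__main__":
    numpy_version = installed_version("numpy")
    pandas_version = installed_version("pandas")
    print("numpy:", numpy_version, "pandas:", pandas_version)
    if numpy_version is not None and numpy_is_1_20_or_newer(numpy_version):
        print("NumPy is 1.20 or newer; if the TypeError above appears, "
              "pinning numpy=1.19 is the workaround that worked here.")
```

Run it with the python from the igdiscover conda environment so that it reports the versions IgDiscover actually sees.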

marcelm commented 3 years ago

Great, thanks for investigating that yourself and getting back here. I will fix this or change the required NumPy version. (Please leave the issue open until then.)

marcelm commented 3 years ago

This should be fixed now. (By using a newer Pandas version.)

NBISweden / IgDiscover-legacy

Error running against the test data set #111