vrmarcelino / CCMetagen

Microbiome classification pipeline
GNU General Public License v3.0

TypeError: 'NoneType' object is not iterable #48

Closed by fernanarr 1 year ago

fernanarr commented 1 year ago

Hi,

I've indexed the SILVA database for KMA and I've successfully obtained results from the KMA run.

When I try to run CCMetagen with the following command, I get the error shown below:

CCMetagen.py -i kma/salida_kma/sample_out_kma_Matriz_sample_BC11_2.res -o kma/resultados/results_sample_out_kma_Matriz_sample_BC11_2

Reading file kma/salida_kma/sample_out_kma_Matriz_sample_BC11_2.res

Traceback (most recent call last):
  File "/home/fer/anaconda3/envs/py36/bin/CCMetagen.py", line 274, in <module>
    df = fParseKMA.populate_w_tax(df, ref_database, st, gt, ft, ot, ct, pt)
  File "/home/fer/anaconda3/envs/py36/lib/python3.6/site-packages/ccmetagen/fParseKMA.py", line 104, in populate_w_tax
    match_info = fNCBItax.lineage_extractor(match_info.TaxId, match_info, taxfile)
  File "/home/fer/anaconda3/envs/py36/lib/python3.6/site-packages/ccmetagen/fNCBItax.py", line 25, in lineage_extractor
    ranks = ncbi.get_rank(lineage)
  File "/home/fer/anaconda3/envs/py36/lib/python3.6/site-packages/ete3/ncbi_taxonomy/ncbiquery.py", line 196, in get_rank
    all_ids = set(taxids)
TypeError: 'NoneType' object is not iterable

I can't figure out what might be happening, because the "sample_out_kma_Matriz_sample_BC11_2.res" file has the same structure as other files I've already used without problems.

Could you please help me with this issue?

Thanks in advance

vrmarcelino commented 1 year ago

Hi!

Do your reference sequences from Silva have taxids in the sequence header? If you send the .res file I can have a look.

(I'm travelling so will be a bit slow to reply).

fernanarr commented 1 year ago

Sure!!

Here is our .res file. We have taxids in the sequence header.

Thanks a lot for your help

sample_out_kma_Matriz_sample_BC11_2.zip

vrmarcelino commented 1 year ago

Hi!

I think the issue is that you have reference sequences with a taxid of '0'.

If you remove those:

grep -v '^0|' sample_out_kma_Matriz_sample_BC11_2.res > temp && mv temp sample_out_kma_Matriz_sample_BC11_2.res

Then CCMetagen works without issues.

I would probably remove all reference sequences without a taxonomy for this analysis (I'd also remove the 'unk_taxid' ones).
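
For anyone who prefers to do this filtering in Python rather than with grep, here is a minimal sketch. It assumes that every data line in the .res file starts with "<taxid>|", as the grep pattern above implies, and that the unknown entries start with "unk_taxid|". The function name drop_untaxed and the BAD_TAXIDS set are made up for illustration and are not part of CCMetagen or KMA.

```python
from typing import Iterable, List, Set

# Assumption: template names look like "<taxid>|<rest of header>".
# These two prefixes are the ones mentioned in this thread.
BAD_TAXIDS: Set[str] = {"0", "unk_taxid"}


def drop_untaxed(lines: Iterable[str], bad_taxids: Set[str] = BAD_TAXIDS) -> List[str]:
    """Return the lines of a .res file without hits to references lacking a usable taxid."""
    kept = []
    for line in lines:
        # Keep the header line (KMA starts it with '#') untouched.
        if line.startswith("#"):
            kept.append(line)
            continue
        # Text before the first '|' is treated as the taxid (assumed layout).
        taxid, sep, _ = line.partition("|")
        if sep and taxid in bad_taxids:
            continue  # skip hits to references with taxid '0' or 'unk_taxid'
        kept.append(line)
    return kept


def filter_res_file(in_path: str, out_path: str) -> None:
    """Read in_path, drop the unwanted hits, and write the result to out_path."""
    with open(in_path) as handle:
        kept = drop_untaxed(handle)
    with open(out_path, "w") as handle:
        handle.writelines(kept)
```

Writing to a separate output file (rather than overwriting the input) keeps the original KMA result around in case the assumed header layout turns out to be different.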

Let me know how it goes. We haven't extensively tested CCMetagen with amplicon sequences yet.

fernanarr commented 1 year ago

Thanks a lot @vrmarcelino. After removing taxid '0' and 'unk_taxid' CCMetagen worked perfectly.

We have had satisfactory results for our sequences.

The only problem is that it doesn't generate the .html file. The on-screen message says that everything went fine and prints the path to the file, but the file is not there.

Do you have any idea of what might be happening?

CCMetagen.py -i kma/salida_kma/sample_out_kma_concatenado_04052023.res -o kma/resultados/results_sample_out_kma_04052023

Reading file kma/salida_kma/sample_out_kma_concatenado_04052023.res

csv file saved as kma/resultados/results_sample_out_kma_04052023.ccm.csv

/bin/sh: 1: ktImportText: not found
krona file saved as kma/resultados/results_sample_out_kma_04052023.html

Thanks again for your quick and useful responses.

Regards

vrmarcelino commented 1 year ago

Hi!

Yes, this generally happens because CCMetagen is not finding Krona - which is either not installed or not accessible in your $PATH.

To install krona:

wget https://github.com/marbl/Krona/releases/download/v2.7/KronaTools-2.7.tar
tar xvf KronaTools-2.7.tar
cd KronaTools-2.7
./install.pl --prefix .

You then need to make it "findable" by adding the ktImportText script to your $PATH (either temporarily or permanently). Let me know if you have trouble doing it. I am travelling at the moment so might be slower to reply but we'll get there =)
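
A quick way to check whether ktImportText is findable before re-running CCMetagen is sketched below. It only uses the Python standard library. The helper name find_tool and its extra_dir parameter are hypothetical, and the directory that actually contains ktImportText after installation depends on the --prefix you used (it is probably a bin folder under that prefix, but please check on your system).

```python
import os
import shutil
from typing import Optional


def find_tool(name: str = "ktImportText", extra_dir: Optional[str] = None) -> Optional[str]:
    """Return the full path to `name` if it can be found on PATH, else None.

    extra_dir (hypothetical convenience parameter) is searched first, which
    lets you test a candidate directory before adding it to PATH for real.
    """
    search_path = os.environ.get("PATH", "")
    if extra_dir:
        search_path = extra_dir + os.pathsep + search_path
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    location = find_tool()
    if location is None:
        print("ktImportText was not found on PATH; the Krona .html file will not be created.")
    else:
        print("ktImportText found at", location)
```

If the check only succeeds when you pass extra_dir, that directory is the one to add to your $PATH.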

fernanarr commented 1 year ago

Hi :-) You are right (of course). It was all because of the path. Now everything works perfectly. Thanks a lot for your help and for the tool!!

vrmarcelino commented 1 year ago

Happy to hear that it works =)