Robaina / MetaTag

metaTag: functional and taxonomical annotation of metagenomes through phylogenetic tree placement
https://robaina.github.io/MetaTag/
Apache License 2.0

ValueError in makedatabase.py #76

Closed micronuria closed 2 years ago

micronuria commented 2 years ago

I am trying to build a database from a mix of HMMs: one Pfam model, which needs no additional arguments, and two KOfam models, which do. This is part of a test for something else, so I have changed the hmmsearch call in wrappers.py, but only the type of output file, from tblout to domtblout. I do not think this is causing the error.

However, I get a ValueError: at some point the pipeline tries to convert the Pfam accession to a float. I am probably making a mistake in the module call. Here are the call and the error (ignore the trailing \ in the call):

(traits) nfernandez@elbrus:/data/mcm/nfernandez/TRAITS$  python3 ./code/makedatabase.py \
>  --in data/databases/final_ref_database.faa \
>  --outdir genes/prueba/results \
>  --hmms genes/prueba/hmms/PF18582.hmm genes/prueba/hmms/K00367.hmm genes/prueba/hmms/K00370.hmm \
>  --hmmsearch_args "None","-T 1033.77","-T 1075.37"\
> 
* Making peptide-specific reference database...
 * Processing hmm PF18582.hmm with additional arguments: --cut_nc
Running Hmmer...
Parsing Hmmer output file...
Traceback (most recent call last):
  File "/data/mcm/nfernandez/opt/envs/traits/lib/python3.9/site-packages/Bio/File.py", line 72, in as_handle
    with open(handleish, mode, **kwargs) as fp:
TypeError: expected str, bytes or os.PathLike object, not TextIOWrapper

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/mcm/nfernandez/TRAITS/./code/makedatabase.py", line 194, in <module>
    main()
  File "/data/mcm/nfernandez/TRAITS/./code/makedatabase.py", line 132, in main
    filterFASTAByHMM(
  File "/data/mcm/nfernandez/TRAITS/code/phyloplacement/database/manipulation.py", line 114, in filterFASTAByHMM
    hmmer_hits = parseHMMsearchOutput(hmmer_output)
  File "/data/mcm/nfernandez/TRAITS/code/phyloplacement/database/manipulation.py", line 62, in parseHMMsearchOutput
    for queryresult in SearchIO.parse(handle, 'hmmer3-tab'):
  File "/data/mcm/nfernandez/opt/envs/traits/lib/python3.9/site-packages/Bio/SearchIO/__init__.py", line 306, in parse
    yield from generator
  File "/data/mcm/nfernandez/opt/envs/traits/lib/python3.9/site-packages/Bio/SearchIO/HmmerIO/hmmer3_tab.py", line 33, in __iter__
    yield from self._parse_qresult()
  File "/data/mcm/nfernandez/opt/envs/traits/lib/python3.9/site-packages/Bio/SearchIO/HmmerIO/hmmer3_tab.py", line 98, in _parse_qresult
    cur = self._parse_row()
  File "/data/mcm/nfernandez/opt/envs/traits/lib/python3.9/site-packages/Bio/SearchIO/HmmerIO/hmmer3_tab.py", line 51, in _parse_row
    hit["evalue"] = float(cols[4])  # evalue (full sequence)
ValueError: could not convert string to float: 'PF18582.4'
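
For context, the ValueError above is what you would expect if a file in `--domtblout` layout is read with the `--tblout` column positions. In the HMMER tabular formats, the fifth column of `--tblout` is the full-sequence E-value, while the fifth column of `--domtblout` is the query accession. The sketch below only illustrates that shift; the sample row is made up for illustration (the target name, query name and scores are not taken from this run), and `parse_evalue_as_tblout` is a hypothetical helper, not Biopython's code.

```python
# Leading columns of the two HMMER tabular layouts (later columns omitted).
TBLOUT_COLS = [
    "target name", "target accession",
    "query name", "query accession",
    "E-value (full sequence)",
]
DOMTBLOUT_COLS = [
    "target name", "target accession", "tlen",
    "query name", "query accession", "qlen",
    "E-value (full sequence)",
]

# Made-up row in --domtblout shape; only the accession PF18582.4 comes
# from the error message in this issue.
row = "seq_001 - 412 hypothetical_query PF18582.4 250 1.2e-30 105.3 0.1"
cols = row.split()


def parse_evalue_as_tblout(columns):
    """Read the E-value where the --tblout layout puts it (index 4)."""
    return float(columns[4])


try:
    parse_evalue_as_tblout(cols)
    error_message = None
except ValueError as exc:
    # With a --domtblout row, index 4 holds the query accession instead.
    error_message = str(exc)

print(DOMTBLOUT_COLS[4], "->", cols[4])
print(error_message)
```

In the `--domtblout` layout the full-sequence E-value sits two columns further right (index 6), which is why a parser written for `--tblout` picks up the accession instead.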
micronuria commented 2 years ago

OK, so it might be related to my TRAITS installation. I tried on a different machine and it seems to work. I will update or close the issue when I am done with the tests.

micronuria commented 2 years ago

So the problem is that when you change the HMMER3 output from --tblout to --domtblout, you also have to change the Biopython format used to parse it. I can change the option in the HMMER call in the function runHMMsearch in wrappers.py, but then I also have to change line 62 of database/manipulation.py, in the parseHMMsearchOutput function, from this:

for queryresult in SearchIO.parse(handle, 'hmmer3-tab'): 

to this:

for queryresult in SearchIO.parse(handle, 'hmmsearch3-domtab'):

and it works...
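
One way to keep the hmmsearch output flag and the parser format from drifting apart is to derive the format name from the flag. This is a minimal sketch under stated assumptions, not the MetaTag implementation: `SEARCHIO_FORMAT_BY_FLAG`, `searchio_format_for` and `parse_hmmsearch_output` are hypothetical names. The two format strings are the ones used in this thread, and `SearchIO.parse` is the Biopython call shown in the traceback.

```python
# Hypothetical lookup from hmmsearch output flag to Biopython SearchIO format.
SEARCHIO_FORMAT_BY_FLAG = {
    "--tblout": "hmmer3-tab",
    "--domtblout": "hmmsearch3-domtab",
}


def searchio_format_for(output_flag):
    """Return the SearchIO format name matching an hmmsearch output flag."""
    try:
        return SEARCHIO_FORMAT_BY_FLAG[output_flag]
    except KeyError:
        raise ValueError(f"Unsupported hmmsearch output flag: {output_flag!r}")


def parse_hmmsearch_output(path, output_flag="--tblout"):
    """Parse an hmmsearch table with the format that matches its flag."""
    # Imported here so the lookup above works without Biopython installed.
    from Bio import SearchIO

    return list(SearchIO.parse(path, searchio_format_for(output_flag)))


print(searchio_format_for("--domtblout"))
```

With something like this, the wrapper that runs hmmsearch and the function that parses its output could share a single flag value, so switching to --domtblout would only need to happen in one place.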