khyox / recentrifuge

Recentrifuge: robust comparative analysis and contamination removal for metagenomics
http://www.recentrifuge.org

divided by zero error - fault by using krakenuniq? #18

Closed BioNij closed 4 years ago

BioNij commented 5 years ago

Bug report

Bug summary

I am trying to run "rextract" on my data and it always fails with a "division by zero" error.

Running Recentrifuge

Command line

rextract command

/mnt/sfb900nfs/groups/tuemmler/erik/rare_validation/recentrifuge/recentrifuge/rextract -f ${file}_output.txt -i 28132 -q $file -n /mnt/sfb900nfs/groups/tuemmler/erik/rare_validation/recentrifuge/recentrifuge/taxdump

rcf command

/mnt/sfb900nfs/groups/tuemmler/erik/rare_validation/recentrifuge/recentrifuge/rcf -k $input -n db/taxonomy/

Data

I used the output generated by krakenuniq. The error occurs with every file I have tried.

Actual outcome

=-= /mnt/sfb900nfs/groups/tuemmler/erik/rare_validation/recentrifuge/recentrifuge/rextract =-= v0.28.13 - October 2019 =-= by Jose Manuel Martí =-=

Loading NCBI nodes... OK! 
Loading NCBI names... OK! 
Building dict of parent to children taxa... OK! 
List of taxa (and below) to be explicitly included:
        Id  Scientific Name
        28132   Prevotella melaninogenica
Building taxonomy tree... OK!
Filtering taxa... OK!
  5 taxid selected in 2 different taxonomical levels:
  Number of different SPECIES: 1
  Number of different NO_RANK: 4
Loading output file TrackCF_01_S1_R1.fastq_output.txt... OK!
  Load elapsed time: 0.33 sec
Traceback (most recent call last):
  File "/mnt/sfb900nfs/groups/tuemmler/erik/rare_validation/recentrifuge/recentrifuge/rextract", line 347, in <module>
    main()
  File "/mnt/sfb900nfs/groups/tuemmler/erik/rare_validation/recentrifuge/recentrifuge/rextract", line 241, in main
    print(f'  \033[90mMatching reads: \033[0m{len(records):_d} \033[90m\t'
ZeroDivisionError: division by zero
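The traceback stops at the summary line that prints the number of matching reads, which suggests (this is an assumption, since the rest of that statement is not shown) that a ratio is computed there with a denominator of zero, for example because no reads were parsed from the classifier output. A minimal sketch of how such a summary could be guarded follows; `format_match_summary` is a hypothetical helper and not part of Recentrifuge.

```python
def format_match_summary(matched: int, total: int) -> str:
    """Summarize matched reads without dividing by zero.

    Hypothetical helper for illustration; not Recentrifuge code.
    """
    if total == 0:
        # Nothing was loaded, so a percentage would be undefined.
        return f"Matching reads: {matched:_d} (no reads loaded; ratio undefined)"
    return f"Matching reads: {matched:_d} ({matched / total:.2%} of {total:_d})"


print(format_match_summary(0, 0))
print(format_match_summary(25, 1000))
```

A message like the first one would at least point to an empty or unparsed input rather than ending in a bare ZeroDivisionError.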

Versions

khyox commented 5 years ago

Thanks for the report, Erik. Recentrifuge does not currently support Krakenuniq. I have no personal experience with that classifier, but if you can direct it to generate its output files in the typical Kraken format, you could use rcf to parse the files straightforwardly. Another solution could be to use Recentrifuge's generic parser with your (already generated) Krakenuniq files. Please see "running Recentrifuge for a generic classifier" for more details. If you need further help with this, feel free to send me (or copy and paste here) the head and some representative lines of your output files so that I can figure it out.

ganiatgithub commented 4 years ago

Hi,

Thanks for the suggestion. This relates to the issue I posted on the Centrifuge repository: https://github.com/DaehwanKimLab/centrifuge/issues/190

My commands for running rextract are:

conda activate recentrifuge
rextract -f centrifuge-hvc.txt -i 694009 -q Run02_filtered.fastq -n ~/miniconda/envs/recentrifuge/bin/taxdump/

I got the same zero division error:

Traceback (most recent call last):
  File "/home/Staff/uqgni1/miniconda2/envs/recentrifuge/bin/rextract", line 347, in <module>
    main()
  File "/home/Staff/uqgni1/miniconda2/envs/recentrifuge/bin/rextract", line 241, in main
    print(f'  \033[90mMatching reads: \033[0m{len(records):_d} \033[90m\t'
ZeroDivisionError: division by zero

When I do head on my centrifuge output, it looks like:

name                               taxID  taxRank  genomeSize  numReads  numUniqueReads  abundance
Homo sapiens                       9606   species  3272089205  11164     8884            0.0
Human alphaherpesvirus 2           10310  species  154675      1         1               0.0
Cercopithecine alphaherpesvirus 2  10317  species  150715      1         0               0.0
Bovine alphaherpesvirus 1          10320  species  135301      8         0               0.0

This should be fine, right?

Any tips on troubleshooting?

khyox commented 4 years ago

Hi @ganiatgithub,

Please open a new issue so that you can fill in all the required data, which will help me troubleshoot the problem.

In addition, the Centrifuge output that you show is different from what Recentrifuge expects, so when you open the new issue, please also include the complete Centrifuge command line that you used to produce that output.
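For readers hitting the same problem: the header shown above appears to match the columns of Centrifuge's summary report rather than its per-read classification output, whose header starts with readID and seqID. Assuming rextract needs the per-read file (an assumption, since it extracts individual reads), a quick check of the header line can tell the two apart. This is a minimal sketch; `classify_header` is a hypothetical helper, not part of Centrifuge or Recentrifuge.

```python
# Column names of the two tab-separated files Centrifuge writes.
CENTRIFUGE_READ_COLUMNS = [
    "readID", "seqID", "taxID", "score",
    "2ndBestScore", "hitLength", "queryLength", "numMatches",
]
CENTRIFUGE_REPORT_COLUMNS = [
    "name", "taxID", "taxRank", "genomeSize",
    "numReads", "numUniqueReads", "abundance",
]


def classify_header(header_line: str) -> str:
    """Guess which Centrifuge file a header line belongs to (hypothetical helper)."""
    columns = header_line.rstrip("\n").split("\t")
    if columns == CENTRIFUGE_READ_COLUMNS:
        return "per-read output"
    if columns == CENTRIFUGE_REPORT_COLUMNS:
        return "summary report"
    return "unknown"


# Header taken from the post above, with tabs as separators.
header = "name\ttaxID\ttaxRank\tgenomeSize\tnumReads\tnumUniqueReads\tabundance\n"
print(classify_header(header))
```

If the check reports a summary report, the per-read file from the same Centrifuge run is probably the one to pass with -f.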

Thank you!