fbreitwieser / pavian

🌈 Interactive analysis of metagenomics data
https://doi.org/10.1093/bioinformatics/btz715

Kraken2: The following files did not conform the report format #91

Closed: sum732 closed this issue 2 years ago

sum732 commented 2 years ago

Hello Pavian Team,

Thanks for writing this tool.

I am having trouble loading Kraken2 output into Pavian. I am using Pavian 1.2.0 and keep getting the following error message: The following files did not conform the report format:

The Kraken2 output in its native form looks like this (first records shown):

C A01130:30:HLJTCDRXY:2:2110:10628:4773 Toxoplasma gondii ME49 (taxid 508771) 200|200 0:37 508771:19 0:29 508771:9 0:17 508771:5 0:7 508771:43 |:| 508771:8 0:5 508771:5 0:15 508771:2 0:4 508771:79 0:48
C A01130:30:HLJTCDRXY:2:2108:12662:3443 Homo sapiens (taxid 9606) 191|191 2759:4 9606:6 2759:2 0:5 2759:11 0:13 9606:5 0:11 9606:4 0:14 2759:3 0:5 2759:12 0:5 2759:1 0:11 2759:5 0:24 9606:5 0:11 |:| 0:11 9606:5 0:24 2759:5 0:11 2759:1 0:5 2759:12 0:5 2759:3 0:14 9606:4 0:11 9606:5 0:13 2759:11 0:5 2759:2 9606:6 2759:4
C A01130:30:HLJTCDRXY:1:2165:4255:29027 Moraxella osloensis (taxid 34062) 200|200 0:20 34062:37 468:4 34062:34 475:5 34062:47 0:5 34062:3 0:5 34062:1 0:5 |:| 34062:16 0:6 34062:5 0:7 34062:1 0:5 34062:3 0:5 34062:47 475:5 34062:24 0:31 34062:11

I also tried the MetaPhlAn-style output, which results in the same error:

d__Bacteria 77837
d__Bacteria|p__Proteobacteria 58254
d__Bacteria|p__Proteobacteria|c__Gammaproteobacteria 27241
d__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Pseudomonadales 15845
d__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Pseudomonadales|f__Pseudomonadaceae 15830
d__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Pseudomonadales|f__Pseudomonadaceae|g__Pseudomonas 15776
d__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Pseudomonadales|f__Pseudomonadaceae|g__Pseudomonas|s__Pseudomonas putida 12856
d__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Pseudomonadales|f__Pseudomonadaceae|g__Pseudomonas|s__Pseudomonas oryzihabitans 10
d__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Pseudomonadales|f__Pseudomonadaceae|g__Pseudomonas|s__Pseudomonas monteilii 9
d__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Pseudomonadales|f__Pseudomonadaceae|g__Pseudomonas|s__Pseudomonas fulva 4

The output was generated using the following command:

kraken2 --db path/to/Complete_Genomes_RefSeq --threads 12 --classified-out Classified_Sample#.fastq --use-names --unclassified-out UnClassified_Sample#.fastq --output Kraken2_Sample_Classification.txt --report aggregrate_counts_clade --gzip-compressed --use-mpa-style --report Kraken2_MetaPhlAnType.txt --paired S10_R1.fastq.gz S10_R2.fastq.gz

Any suggestions for fixing this issue would be much appreciated.

Many thanks in advance,
Deep

sum732 commented 2 years ago

Thanks to the response of another user, MardahlM (https://github.com/DerrickWood/kraken2/issues/444), I was able to figure it out.
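
For anyone landing here with the same error, it can help to check which of the three Kraken2 layouts a file actually contains before loading it. The snippet below is only a rough sketch, not part of Pavian or Kraken2: the function name `guess_kraken_layout` and the sample lines are made up for illustration. It assumes tab-separated files and the standard six-column report written by `--report` (percentage, clade reads, taxon reads, rank code, taxid, name). It says nothing about which layouts Pavian accepts.

```python
def guess_kraken_layout(line: str) -> str:
    """Guess the layout of one tab-separated line from a Kraken2 file.

    Hypothetical helper for illustration only; it inspects column counts
    and simple field patterns, nothing more.
    """
    fields = line.rstrip("\n").split("\t")

    # Per-read classification (--output): C/U flag, read ID, taxon,
    # read length(s), LCA k-mer mapping.
    if len(fields) == 5 and fields[0] in ("C", "U"):
        return "per-read output"

    # mpa-style report (--use-mpa-style): lineage string, read count.
    if len(fields) == 2 and "__" in fields[0] and fields[1].strip().isdigit():
        return "mpa-style report"

    # Standard report (--report without --use-mpa-style): percentage,
    # clade reads, taxon reads, rank code, taxid, indented name.
    if len(fields) == 6 and fields[4].strip().isdigit():
        try:
            float(fields[0])
        except ValueError:
            return "unknown"
        return "standard report"

    return "unknown"


if __name__ == "__main__":
    # Sample lines are invented for the demo, not taken from real data.
    samples = [
        "C\tread1\tHomo sapiens (taxid 9606)\t191|191\t2759:4 9606:6",
        "d__Bacteria\t77837",
        " 50.00\t100\t10\tS\t9606\t  Homo sapiens",
    ]
    for sample in samples:
        print(guess_kraken_layout(sample))
```

Running it on the first line of each file (for example `Kraken2_Sample_Classification.txt`) shows quickly whether the file is a per-read listing rather than a report.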