smdabdoub / kraken-biom

Create BIOM-format tables (http://biom-format.org) from Kraken output (http://ccb.jhu.edu/software/kraken/, https://github.com/DerrickWood/kraken).
MIT License

kraken2? #10

Open raw937 opened 6 years ago

raw937 commented 6 years ago

Does it work with kraken2 outputs?

Cheers and many thanks Rick

smdabdoub commented 6 years ago

Hi Rick, thanks for the question!

From what I've read in the new manual, it should work with the v2 report output. However, I haven't had a chance to build a new database and run any samples, so I can't say for sure.

If you have a few files you've generated that you'd like to share, I would be happy to do some testing.

raw937 commented 6 years ago

Hello Shareef,

YES! YES! I have the perfect data for you. We are having trouble resolving short-read 16S data between two dominant taxa that are similar but belong to different genera. I would like to use kraken2 against the most current databases (Greengenes, SILVA, RDP) as well as NCBI's targeted loci sequences, and also include the full-length 16S sequences from our genomes.

Qiime has been unable to resolve this.

email me - raw937@gmail.com

I can send you the data to play with.

Cheers and many thanks Rick

misazaa commented 6 years ago

Hi!

I'm running kraken2 to do the taxonomic classification of some microbiome data (assembly of PE reads) and I'm using the Greengenes database. I want to get an OTU table in the biom format, but when I run kraken-biom output_kraken_report.txt I get the error

AttributeError: 'NoneType' object has no attribute 'strip'

So I don't know if there is a problem with the lines in kraken2 or the kraken-biom is not compatible with kraken2

Have you tried your data yet?

Camila

smdabdoub commented 6 years ago

Hi Camila,

I have not yet created a kraken2 database for myself, so I don't have any example files to test it with. But I would be happy to take a look at the file you're having problems with if you can share it.

Shareef

MaryoHg commented 6 years ago

Hi, all.

I already tried kraken-biom on my kraken2 report-style files, and it worked. What I haven't understood is the meaning of the value shown in the biom file for each report added. Is it the number of reads assigned to a taxonomy_id or to the taxonomic level itself?

################################################################################

@misazaa

I tried to convert a 'non-indented' file with the kraken-biom script. Is this the same error you get?

Traceback (most recent call last):
  File "/usr/local/bin/kraken-biom", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/dist-packages/kraken_biom.py", line 373, in main
    min_rank=args.min)
  File "/usr/local/lib/python2.7/dist-packages/kraken_biom.py", line 161, in process_samples
    min_rank=min_rank)
  File "/usr/local/lib/python2.7/dist-packages/kraken_biom.py", line 105, in parse_kraken_report
    erank = entry['rank'].strip()
AttributeError: 'NoneType' object has no attribute 'strip'
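The traceback shows that entry['rank'] is None when .strip() is called. One way this can happen, assuming the parser reads the report with Python's csv.DictReader and six tab-separated column names (an assumption about kraken-biom's internals, not something confirmed in this thread), is a line with fewer than four fields: DictReader fills the missing columns with None. A minimal sketch that flags such lines; the column names and the find_short_rows helper are hypothetical:

```python
import csv
import io

# Hypothetical column names for a standard six-column Kraken report.
FIELDS = ["pct_reads", "clade_reads", "taxon_reads", "rank", "ncbi_tax", "sci_name"]


def find_short_rows(handle):
    """Return (line_number, raw_fields) for rows where 'rank' would be None."""
    problems = []
    reader = csv.DictReader(handle, fieldnames=FIELDS, delimiter="\t")
    for row in reader:
        # DictReader fills columns missing from a short row with None.
        if row["rank"] is None:
            raw = [value for value in row.values() if value is not None]
            problems.append((reader.line_num, raw))
    return problems


# Made-up in-memory example: line 2 is a header-like line without tabs.
sample = io.StringIO(
    " 50.00\t10\t10\tU\t0\tunclassified\n"
    "#perc tot_all tot_lvl rank taxid name\n"
    " 50.00\t10\t0\tR\t1\troot\n"
)
print(find_short_rows(sample))
```

Running it on a real report (with open("output_kraken_report.txt") as fh: print(find_short_rows(fh))) lists the lines that would need a closer look, such as header lines or space-separated rows.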

MaryoHg.

smdabdoub commented 6 years ago

Hi MaryoHg,

The counts recorded in the biom table are counts assigned to a particular taxonomy ID (column 3 in the report file), with one exception. When assigning reads for the lowest taxonomic level (species by default) it uses the number of reads assigned to that taxonomic level and those below (column 2 in the report file).
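That rule can be sketched in a few lines. This is only an illustration, not kraken-biom's actual code: it assumes the standard six-column report layout (percentage, clade reads, directly assigned reads, rank code, NCBI taxonomy ID, name), and count_for_row is a hypothetical helper.

```python
def count_for_row(line, min_rank="S"):
    """Pick the count described above for one tab-separated report line."""
    fields = line.rstrip("\n").split("\t")
    clade_reads = int(fields[1])  # column 2: reads in this taxon and below
    taxon_reads = int(fields[2])  # column 3: reads assigned directly to this taxon
    rank = fields[3].strip()
    tax_id = fields[4].strip()
    # At the lowest requested rank, fold in the reads from everything beneath it.
    count = clade_reads if rank == min_rank else taxon_reads
    return tax_id, count


# Made-up rows; the counts and IDs are illustrative only.
genus_row = "  1.20\t120\t20\tG\t1001\t        Examplegenus\n"
species_row = "  1.00\t100\t90\tS\t1002\t          Examplegenus examplespecies\n"
print(count_for_row(genus_row))    # ('1001', 20)
print(count_for_row(species_row))  # ('1002', 100)
```

In this made-up case the genus row contributes only its 20 directly assigned reads, while the species row contributes all 100 reads at or below it.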

Shareef

MaryoHg commented 6 years ago

Thank you, @smdabdoub.

Have you ever come across Bracken? This software is useful for estimating relative abundance based on Kraken outputs. It would be great if your script worked on Bracken outputs too.

Just saying.

With regards,

MaryoHg.

louiejtaylor commented 5 years ago

Testimonial--it works great with kraken2 output :D

mdtorohernando commented 5 years ago

For me, it works with Kraken2 output, but I cannot retrieve taxonomic information (only TaxID). Does somebody know how to solve it?

casperp commented 5 years ago

What do you mean by not being able to retrieve taxonomic information? The program produces a biom file. The "metadata" of every TaxID should contain information about the taxonomy, such as a name per taxonomic level.

You can check if this information is correct by importing the biom table in python and checking :

from biom import load_table
biom_table = load_table("table.biom")
biom_table.metadata_to_dataframe(axis='observation')  

It would also be useful if you could provide a kreport from kraken2. I only have access to krakenuniq reports and no access to a cluster for kraken2 at the moment.

mdtorohernando commented 5 years ago

I mean I obtain the biom file (I convert it to CSV to view it) and I only see the TaxID column and one column per sample, nothing more.

casperp commented 5 years ago

Okay, you can look at the manual for biom conversion. When you use

biom convert -i otu_table.biom -o otu_table.txt --to-tsv --header-key taxonomy

you get an extra column with the full lineage:

k__Bacteria; p__Proteobacteria; c__Alphaproteobacteria; o__Rhizobiales; f__Bradyrhizobiaceae; g__Bradyrhizobium; s__diazoefficiens
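If the lineage column needs to be split back into ranks, a small standard-library sketch could look like this. It assumes the TSV layout that biom convert --to-tsv --header-key taxonomy writes (a comment line, then a "#OTU ID" header with taxonomy as one of the columns) and rank prefixes of the form k__, p__ and so on; split_lineage and read_taxonomy are hypothetical helpers, and the sample row is made up.

```python
import csv
import io


def split_lineage(lineage):
    """Turn 'k__Bacteria; p__Proteobacteria' into {'k': 'Bacteria', ...}."""
    ranks = {}
    for part in lineage.split(";"):
        part = part.strip()
        if "__" in part:
            prefix, name = part.split("__", 1)
            ranks[prefix] = name
    return ranks


def read_taxonomy(handle):
    """Map each OTU/TaxID in the TSV to its parsed lineage."""
    result = {}
    header = None
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        if row[0] == "#OTU ID":
            header = row  # column names, including 'taxonomy'
            continue
        if row[0].startswith("#") or header is None:
            continue  # skip the leading comment line
        result[row[0]] = split_lineage(row[header.index("taxonomy")])
    return result


# Made-up table with one sample and one observation.
sample = io.StringIO(
    "# Constructed from biom file\n"
    "#OTU ID\tsampleA\ttaxonomy\n"
    "12345\t12.0\tk__Bacteria; p__Proteobacteria; g__Bradyrhizobium; s__diazoefficiens\n"
)
print(read_taxonomy(sample)["12345"]["g"])  # Bradyrhizobium
```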

I hope this is helpful. Otherwise, if you can wait a few weeks, I'm working on a program for visualizing and testing biom tables. Maybe its visualizations will provide the information you need.

mdtorohernando commented 5 years ago

Thank you!! I completely forgot the --header-key option! Thanks!

kyoreth commented 4 years ago

I'm having trouble with read numbers in Bracken reports: they seem to differ from the original report. For example, the report lists 292478 reads for taxon 817, whereas the converted biom file, imported into R via phyloseq, shows 827894 reads for taxon 817... Any ideas what could be going wrong?

smdabdoub commented 4 years ago

@kyoreth Do you have an example file I could take a look at? I haven't actually tested kraken-biom with Bracken output. Others have said it works, but it would be very helpful if you could share an example so I could make any changes to the program if needed.

kyoreth commented 4 years ago

@smdabdoub Sorry about my earlier comment, and thanks for the reply! After some more internal testing kraken-biom is working fine with Bracken reports.

Midnighter commented 4 years ago

Hi @casperp, where are you developing your visualization tool? Do you have a repo to look at?

SooChing commented 3 years ago

Hi,

I have two questions:

@louiejtaylor @MaryoHg

  1. May I know how to solve the error below? I first combined all my kraken2 reports (from all samples) using 'combine_kreports.py', then ran kraken-biom on the output using 'kraken-biom all_kraken2.kreport2 -o all_kraken2.biom --fmt json', but it failed.

Traceback (most recent call last):
  File "/nethome/lees51/.conda/envs/shotgun_ana3_py3.7.0/bin/kraken-biom", line 10, in <module>
    sys.exit(main())
  File "/nethome/lees51/.conda/envs/shotgun_ana3_py3.7.0/lib/python3.7/site-packages/kraken_biom.py", line 373, in main
    min_rank=args.min)
  File "/nethome/lees51/.conda/envs/shotgun_ana3_py3.7.0/lib/python3.7/site-packages/kraken_biom.py", line 161, in process_samples
    min_rank=min_rank)
  File "/nethome/lees51/.conda/envs/shotgun_ana3_py3.7.0/lib/python3.7/site-packages/kraken_biom.py", line 105, in parse_kraken_report
    erank = entry['rank'].strip()
AttributeError: 'NoneType' object has no attribute 'strip'

@kyoreth

  2. May I know how you combined the Bracken reports from all samples before running kraken-biom? I have Bracken reports but I don't know how to combine them.

Thanks.

Regards, Soo Ching

UmaJan commented 1 year ago

@SooChing, have you figured out how to do this yet?

MaryoHg commented 1 year ago

@SooChing @UmaJan

I didn't merge the kraken2 outputs before using kraken-biom (I swapped the kraken2 reports for the Bracken ones, i.e. those with re-estimated values). The script can take many reports as arguments at the same time and produces a single biom file with one sample per input report.

My command, AFAIR (as far as I remember):

$ kraken-biom --kraken_reports_fp dir_reports/ --otu_fp file --max D --min S --output_fp mytable.biom

dir_reports: contains all the reports from Bracken (or kraken2) I want to merge into a single biom file for later use.
--output_fp: the biom output file.
--otu_fp: optional; creates a file containing just the (NCBI) OTU IDs for use with a service such as phyloT.
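The same idea, passing each report as its own argument instead of pre-merging them, can be scripted. A minimal sketch, assuming the reports sit in one directory and share a file suffix (the suffix default and the build_command and run_kraken_biom helpers are hypothetical); the -o and --fmt json options are the ones used earlier in this thread:

```python
import subprocess
from pathlib import Path


def build_command(report_dir, output_fp, suffix=".txt"):
    """Build a kraken-biom call with one positional argument per report."""
    reports = sorted(str(p) for p in Path(report_dir).glob(f"*{suffix}"))
    if not reports:
        raise FileNotFoundError(f"no *{suffix} reports found in {report_dir}")
    return ["kraken-biom", *reports, "-o", str(output_fp), "--fmt", "json"]


def run_kraken_biom(report_dir, output_fp, suffix=".txt"):
    """Run the command; requires kraken-biom to be installed and on PATH."""
    subprocess.run(build_command(report_dir, output_fp, suffix), check=True)


# Example call (not executed here):
# run_kraken_biom("dir_reports", "mytable.biom")
```

Each report then becomes one sample column in the resulting table, so no separate combining step is needed.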

Sorry for the late reply @SooChing. I hope you're doing just fine and made it through that error.

Mar.

Midnighter commented 1 year ago

It might be of interest to folks here: we recently published TAXPASTA, a tool that can convert many different profiles into various standardized tabular formats, the BIOM format among them.