DerrickWood / kraken2

The second version of the Kraken taxonomic sequence classification system
MIT License

Pear (pyrus) genomes classified as apple (malus) #889

Closed CeciliaDeng closed 5 days ago

CeciliaDeng commented 1 week ago

Hi,

I used Kraken2 to classify sequences from pear genomes downloaded from GDR. However, the majority of the sequences (>99%) were classified as Malus (apple) rather than Pyrus (pear). How should I interpret this result? Could it be that Pyrus genomes are not included in the Kraken2 database? I tried two database versions, k2_pluspfp_20240904 and k2_pluspfp_20230314, and both gave the same results. Thank you.

CeciliaDeng commented 1 week ago

Hi @ch4rr0, could you please help?

ch4rr0 commented 1 week ago

Hello, I checked the seqid2taxid.map file for pluspfp and could not find the pear accession, GCF_963583255.1. It was also missing from the assembly summary file that we fetch from NCBI. However, I happened to download the plant library recently, and the accession is present there again (see the grep output below). My guess is that NCBI temporarily removed the accession, so the pear genome was not downloaded during the database build. With no Pyrus reference in the database, your sequences are most likely being assigned to the closest relative that is present, Malus, which is why the results look incorrect. You may need to build a database with just the plant library in order to classify your reads. We would be glad to help if you require assistance.

grep -n -H GCF_963583255.1 assembly_summary.txt
assembly_summary.txt:82:GCF_963583255.1 PRJNA1060720    SAMEA111431178  CAUZVO000000000.1   reference genome    23211   23211   Pyrus communis  na  na  latest  Chromosome  Major   Full    2023/11/05  drPyrComm1.1    WELLCOME SANGER INSTITUTE   GCA_963583255.1 different   https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/963/583/255/GCF_963583255.1_drPyrComm1.1   na  na  na  haploid plant   487337232   487308232   37.500000   17  31  31  NCBI RefSeq GCF_963583255.1-RS_2024_09  2024-09-18  42167   34369   3725    na
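
The lookup above can also be scripted so that it returns a usable exit status and prints only the fields of interest. This is a minimal sketch, not part of Kraken2: the function name `find_accession` is hypothetical, and it assumes the standard tab-delimited NCBI assembly_summary.txt layout (accession in column 1, taxid in column 6, organism name in column 8), which matches the row shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: look up an assembly accession in an NCBI
# assembly_summary.txt file.
# Assumed layout (tab-delimited): column 1 = assembly accession,
# column 6 = taxid, column 8 = organism name.
find_accession() {
    local accession="$1" summary="$2"
    awk -F'\t' -v acc="$accession" '
        # Exact match on the accession column, not a substring match.
        $1 == acc { print $1 "\t" $6 "\t" $8; found = 1 }
        # Exit 0 if the accession was found, 1 otherwise.
        END { exit(found ? 0 : 1) }
    ' "$summary"
}

# Example usage (the file path is an assumption):
# find_accession GCF_963583255.1 assembly_summary.txt \
#     || echo "accession not listed in this assembly summary"
```

Unlike a plain `grep`, the exact match on column 1 will not report a hit when the accession only appears in another column, such as the paired GenBank assembly field.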

CeciliaDeng commented 6 days ago

Thank you for the confirmation, @ch4rr0.

ch4rr0 commented 5 days ago

Sure! Here's a link to a custom kraken2 database with just the pear accession: https://drive.google.com/file/d/11rw3lc5oQ9XLxFzAoAY4rmVyyBr52PHy/view?usp=sharing
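
For readers who would rather build such a database themselves, the steps might look roughly like the sketch below. It is not necessarily how the linked database was made. The wrapper functions `run` and `build_single_genome_db`, the `DRY_RUN` switch, and the paths are all hypothetical; the `kraken2-build` options used (`--download-taxonomy`, `--add-to-library`, `--build`, `--db`, `--threads`) are the ones documented in the Kraken 2 manual. The FASTA headers are assumed to carry NCBI accessions, or `kraken:taxid|...` tags, so that sequences can be mapped to taxa.

```shell
#!/usr/bin/env bash
# Sketch: build a small Kraken2 database from a single reference genome.
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Arguments: database directory, genome FASTA, optional thread count.
build_single_genome_db() {
    local db="$1" fasta="$2" threads="${3:-4}"
    # Fetch the NCBI taxonomy and accession-to-taxid maps (large download).
    run kraken2-build --download-taxonomy --db "$db"
    # Add the genome; headers must contain accessions or kraken:taxid tags.
    run kraken2-build --add-to-library "$fasta" --db "$db"
    # Build the database files from the taxonomy and the library.
    run kraken2-build --build --db "$db" --threads "$threads"
}

# Example usage (paths are assumptions):
# DRY_RUN=1 build_single_genome_db pear_db pear_genome.fna 8
```

A database that contains only the pear genome cannot tell pear from apple, because there is no Malus reference to compete for the shared k-mers. If that distinction matters, replacing the `--add-to-library` step with `kraken2-build --download-library plant --db "$db"` should bring in the plant library suggested earlier in the thread, at the cost of a much larger download and build.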