Closed CeciliaDeng closed 5 days ago
Hi @ch4rr0, could you please help?
Hello, I checked the seqid2taxid.map file for pluspfp and could not find the accession for Pear, GCF_963583255.1.
I also checked the assembly summary file that we fetch from NCBI and the accession did not appear there either.
I happened to download the plant library recently, and the accession is present once again. In summary, the accession may have been temporarily removed by NCBI (I am guessing here), so it was not downloaded during the database build, which is why your queries are coming back empty or incorrect. You may need to build a database with just the plant library in order to classify your reads. We would be glad to help if you require assistance.
grep -n -H GCF_963583255.1 assembly_summary.txt
assembly_summary.txt:82:GCF_963583255.1 PRJNA1060720 SAMEA111431178 CAUZVO000000000.1 reference genome 23211 23211 Pyrus communis na na latest Chromosome Major Full 2023/11/05 drPyrComm1.1 WELLCOME SANGER INSTITUTE GCA_963583255.1 different https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/963/583/255/GCF_963583255.1_drPyrComm1.1 na na na haploid plant 487337232 487308232 37.500000 17 31 31 NCBI RefSeq GCF_963583255.1-RS_2024_09 2024-09-18 42167 34369 3725 na
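The same check can be scripted for anyone who wants to test several accessions at once. This is only a minimal sketch: `find_accession` is a hypothetical helper, and it assumes the file is tab-separated with the assembly accession in the first column and the organism name in the eighth, which matches the row shown above.

```python
def find_accession(lines, accession):
    """Return (line_number, organism_name) for rows whose first column is `accession`.

    Assumes tab-separated rows laid out like NCBI's assembly_summary.txt:
    column 1 = assembly accession, column 8 = organism name.
    """
    hits = []
    for line_number, line in enumerate(lines, start=1):
        if line.startswith("#"):  # skip header/comment lines
            continue
        fields = line.rstrip("\n").split("\t")
        if fields[0] == accession:
            organism = fields[7] if len(fields) > 7 else ""
            hits.append((line_number, organism))
    return hits


# Shortened illustrative rows; only the first eight columns are kept.
sample = [
    "#assembly_accession\tbioproject\tbiosample\twgs_master\t"
    "refseq_category\ttaxid\tspecies_taxid\torganism_name\n",
    "GCF_963583255.1\tPRJNA1060720\tSAMEA111431178\tCAUZVO000000000.1\t"
    "reference genome\t23211\t23211\tPyrus communis\n",
]

print(find_accession(sample, "GCF_963583255.1"))
```

To run it on a real file, pass an open file handle instead of `sample`, for example `find_accession(open("assembly_summary.txt"), "GCF_963583255.1")`. An empty list means the accession was not in that snapshot of the summary file.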
Thank you for the confirmation, @ch4rr0.
Sure! Here's a link to a custom kraken2 database with just the pear accession: https://drive.google.com/file/d/11rw3lc5oQ9XLxFzAoAY4rmVyyBr52PHy/view?usp=sharing
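For anyone who would rather build a plant-only database locally, as suggested in this thread, the usual `kraken2-build` sequence can be sketched as below. This is not the exact procedure used to make the linked database. The database name `pear_plant_db`, the thread count, and the helper functions are assumptions for illustration. The script only prints the commands unless `dry_run=False` is passed, because the downloads are large.

```python
import shlex
import subprocess


def plant_db_commands(db_name, threads=4):
    """Return the kraken2-build steps for a database holding only the plant library."""
    return [
        # 1. Fetch the NCBI taxonomy used to label sequences.
        ["kraken2-build", "--download-taxonomy", "--db", db_name],
        # 2. Fetch the RefSeq plant library.
        ["kraken2-build", "--download-library", "plant", "--db", db_name],
        # 3. Build the database from what was downloaded.
        ["kraken2-build", "--build", "--threads", str(threads), "--db", db_name],
    ]


def run_plan(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "pear_plant_db" is a placeholder name, not one used in this thread.
    run_plan(plant_db_commands("pear_plant_db"))
```

After a build, it is worth repeating the accession check on the new database's seqid2taxid.map to confirm that GCF_963583255.1 was actually picked up.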
Hi,
I used Kraken2 to classify sequences in pear genomes downloaded from GDR. However, the majority of the sequences (>99%) were classified as Malus (apple) rather than Pyrus (pear). How should I interpret this result? Could it be that Pyrus genomes are not included in the Kraken2 database? The DB versions I used are k2_pluspfp_20240904 and k2_pluspfp_20230314, and both give me the same results. Thank you.
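One way to put numbers on the Malus/Pyrus split described here is to tally genus-level rows from the Kraken2 report file (the one written with `--report`). The sketch below assumes the default six-column, tab-separated report layout: percentage, clade read count, direct read count, rank code, taxid, indented name. `genus_counts` is a hypothetical helper, and the sample rows are made-up numbers for illustration, not results from this issue.

```python
def genus_counts(report_lines):
    """Map genus name -> reads assigned to that clade, from a Kraken2 report.

    Assumes the default six-column report; rows with rank code "G" are genera.
    """
    counts = {}
    for line in report_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        if fields[3] == "G":
            # Names are indented with spaces to show tree depth, so strip them.
            counts[fields[5].strip()] = int(fields[1])
    return counts


# Made-up rows for illustration only.
sample_report = [
    " 99.20\t9920\t15\tG\t3749\t          Malus\n",
    " 98.90\t9890\t9890\tS\t3750\t            Malus domestica\n",
    "  0.30\t30\t30\tG\t3766\t          Pyrus\n",
]

print(genus_counts(sample_report))
```

If Pyrus is missing from the output entirely, or shows only a handful of reads, that is consistent with the pear reference being absent from the database, so reads fall to the closest available relative.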