Open jacodela opened 6 years ago
Thanks for your interest in panDB! We do have a copy of the database built for Kraken; if you use Dropbox I can upload the database and share the containing folder with you (I am out of town at the moment, but I will do that once I get back). However, do note that since Kraken assigns reads to the LCA and panDB contains a large collection of species, many of the classifications will be nonspecific.
Wei
Thanks for your quick response!
I'm interested in classifying reads from stool mWGS samples, with a particular emphasis on a group of low-abundance methanogens with available genomes in PATRIC but no cultured isolates. In your microbiome paper you state that the proportion of classified reads is higher when using panDB compared to repDB. Do you think it would be better to use repDB rather than panDB?
I would appreciate it if you could send me both databases; the Dropbox option sounds good.
Thanks!
panDB will classify more reads than reprDB, but not with greater taxonomic specificity. For example, given S. aureus reads, panDB may classify 100 reads and reprDB 90. But panDB will assign only 50 of its 100 to S. aureus, with the rest assigned to Staphylococcus, while reprDB will assign 80 of its 90 to S. aureus. This is because when we included more S. aureus genomes, we found more sequences that are similar to those of other Staphylococcus species, and Kraken cannot tell which species such a read actually came from. I would generally recommend using reprDB for Kraken, and panDB for PathoScope pipelines.
I will upload both when I am back.
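To make the S. aureus example above concrete, here is a minimal Python sketch of the LCA effect. It is a toy model, not Kraken's actual implementation: the three-node taxonomy, the k-mer, and the `small_db`/`large_db` contents are all made up for illustration. It only shows that once a sequence is found in genomes of more than one species, its LCA moves up to the genus.

```python
# Toy taxonomy as child -> parent. Names are illustrative, not NCBI taxids.
PARENT = {
    "S. aureus": "Staphylococcus",
    "S. epidermidis": "Staphylococcus",
    "Staphylococcus": "root",
}

def lineage(taxon):
    """Return the path from a taxon up to the root, taxon first."""
    path = [taxon]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def lca(taxa):
    """Lowest common ancestor of a non-empty collection of taxa."""
    taxa = list(taxa)
    common = set(lineage(taxa[0]))
    for taxon in taxa[1:]:
        common &= set(lineage(taxon))
    # Walking up from the first taxon, the first shared node is the deepest one.
    return next(node for node in lineage(taxa[0]) if node in common)

# Hypothetical k-mer -> species whose genomes contain it.
small_db = {"ACGTACGT": ["S. aureus"]}                    # few genomes
large_db = {"ACGTACGT": ["S. aureus", "S. epidermidis"]}  # many genomes

for name, db in (("small", small_db), ("large", large_db)):
    print(name, "->", lca(db["ACGTACGT"]))
```

In the small database the k-mer is species-specific; in the large one the same k-mer can only be placed at the genus level, which is the loss of specificity described above.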
Just checked: the Kraken indexes won't fit into my Dropbox space... We do have the flat databases (FASTA) hosted at ftp://ftp.jax.org/zhouw/referenceDB/. Have you considered building the Kraken indexes from there? Let me know if you run into any difficulties.
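For anyone taking the build-it-yourself route, the sketch below assembles the usual `kraken-build` steps for a custom database (download taxonomy, add sequences to the library, build) as a list of commands. It only prints the commands and does not run them. The FASTA file name and the database directory are hypothetical placeholders, not the actual names on the FTP server, and it assumes the sequence headers carry identifiers that Kraken can map to taxonomy IDs, which this thread does not confirm.

```python
import shlex

def kraken_build_commands(db_dir, fasta_files, threads=8):
    """Return the kraken-build invocations, in order, as argument lists."""
    db = str(db_dir)
    # Step 1: fetch the NCBI taxonomy into the database directory.
    cmds = [["kraken-build", "--download-taxonomy", "--db", db]]
    # Step 2: register each FASTA file in the database library.
    for fasta in fasta_files:
        cmds.append(["kraken-build", "--add-to-library", str(fasta), "--db", db])
    # Step 3: build the index from everything in the library.
    cmds.append(["kraken-build", "--build", "--threads", str(threads), "--db", db])
    return cmds

# Hypothetical paths; substitute the files downloaded from the FTP site.
fastas = ["referenceDB/reprDB.fa"]
for cmd in kraken_build_commands("kraken_reprDB", fastas, threads=16):
    print(shlex.join(cmd))
```

The printed lines can be pasted into a shell or into a cluster job script; the build step is the memory- and time-intensive one.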
Thanks! I'll give it a try
I'm interested in running panDB to use with Kraken; however, I only have access to a file server or a compute cluster running SGE. Are you considering a more agnostic implementation of panDB? Alternatively, can you provide access to a built database (such as the one you used in Zhou et al. 2018)?
Thanks.