LomanLab / mockcommunity

Long-read mock community experiments
https://lomanlab.github.io/mockcommunity/
WebLink for Bacillus subtilis in sources.txt leads to an entry for Enterococcus faecium #16

Closed kevinxchan closed 5 years ago

kevinxchan commented 5 years ago

Hi, I'm interested in using the R10 dataset for some benchmarking, and one thing I wanted to do was to download all the reference genomes provided in sources.txt.

However, the web link for Bacillus subtilis (line 6) leads to an entry for Enterococcus faecalis strain sorialis instead.

Is this an incorrect entry in sources.txt, something to be updated in the metadata, or something else?

SamStudio8 commented 5 years ago

Hi Kevin, sources.txt is a legacy file (sorry about that) that is not actually used as part of our analysis. We have Illumina draft assemblies of the isolates. Additionally, McIntyre et al. have recently made PacBio draft assemblies of the isolates available.

SamStudio8 commented 5 years ago

Bad file removed in 02137f47c495af131aafe3c82365b18b6352f8b3

kevinxchan commented 5 years ago

Thanks for the quick reply. I'm still interested in getting the annotated genomes used in this mock community, though. Do you have a mapping from the strains used to their NCBI entries, or something similar?

SamStudio8 commented 5 years ago

Hi Kevin, we have corresponding NRRL accessions for the strains (see our preprint), but NCBI does not necessarily hold a fully annotated genome for each particular strain, which led us to build and use draft references in our work.