PATRIC3 / patric3_website

Legacy PATRIC Website (JBoss Portal Version)
MIT License

Contamination of metadata fields-Geographic location and site where genome was sequenced #442

Closed ARWattam closed 9 years ago

ARWattam commented 9 years ago

I have found that PATRIC has a lot of data with Boston, MA as the geographic location. Since I know that is the site of the Broad Institute, and that the sequencing was done there, I started checking BioSample to see whether the data was consistent. It's not.

Here's what we have: [screenshot, 2015-09-22 8:00:33 AM]

Here's what BioSample has: [screenshot, 2015-09-22 8:00:07 AM]

I'm not sure how we fix this, but we need to do it.

mshukla1 commented 9 years ago

I believe we have the correct metadata and NCBI is missing it. Here is why:

https://app.box.com/files/0/f/2663838781/1/f_23723024391

https://www.broadinstitute.org/science/projects/gscid/white-papers

Carbapenem Resistant bacteria: Whole Genome Sequence Analysis of Carbapenem-resistant Enterobacteriaceae Isolated from 3 Boston Area Medical Institutions

Furthermore, in the white paper, it says:

https://www.broadinstitute.org/files/shared/genomebio/Carbapenem_resistant_bacteria_WP.pdf

"In phase I, we will compare the genomic sequences of prospectively sampled carbapenem-resistant enterobacteria in three major medical institutions in Boston, and will monitor changes in sequenced isolates over a six to nine month period within each institution. We will also sequence a number of retrospectively collected isolates of carbapenem-resistant and susceptible strains from each institution for comparative purposes."

Let me know if you find any evidence that suggests it is NOT from one of the Boston area hospitals.

-Maulik

ARWattam commented 9 years ago

Terrific! I'm delighted that we're ahead of NCBI on this one! I'll cross that one off my anxiety list.

Rebecca
