waldronlab / curatedMetagenomicDataCuration

Sample Metadata Curation for curatedMetagenomicData
https://waldronlab.io/curatedMetagenomicDataCuration/

VincentC_2016 Antibiotics Metadata do not match published paper #68

Closed cmirzayi closed 1 year ago

cmirzayi commented 2 years ago

antibiotics_family and antibiotics_current_use appear to be wrong for many of the samples for the VincentC_2016 study.

All 229 samples are coded as yes for antibiotics_current_use but that is not true according to the original paper. A handful of participants never received antibiotics over the course of the study and many did not begin the study on antibiotics.

Additionally there are issues with the classes of antibiotics. For instance, the original paper states that 81 samples in the control group alone were exposed to cephalosporins and yet the data in cMD seem to only have 5 samples with cephalosporins exposure:

                                                                                  study_name
antibiotics_family                                                                 VincentC_2016
  aminoglycosides;beta_lactamase_inhibitors;laxatives;penicillin                               1
  beta_lactamase_inhibitors;carbapenems;cephalosporins;fluoroquinolones;penicillin             2
  beta_lactamase_inhibitors;carbapenems;cephalosporins;penicillin                              1
  beta_lactamase_inhibitors;carbapenems;fluoroquinolones;penicillin                            2
  beta_lactamase_inhibitors;macrolides;penicillin                                              2
  beta_lactamase_inhibitors;penicillin                                                         6
  carbapenems                                                                                  4
  carbapenems;fluoroquinolones                                                                 7
  carbapenems;fluoroquinolones;laxatives                                                       1
  carbapenems;laxatives                                                                        1
  cephalosporins                                                                               2
  fluoroquinolones                                                                             2
  macrolides                                                                                   2
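As a cross-check on the figure of 5, the table can be tallied per antibiotic class by splitting each semicolon-delimited combination. This is only a sketch: the counts are copied by hand from the table above, and `family_counts` and `samples_per_class` are hypothetical names, not part of cMD.

```python
from collections import Counter

# Counts copied by hand from the antibiotics_family table above (VincentC_2016).
family_counts = {
    "aminoglycosides;beta_lactamase_inhibitors;laxatives;penicillin": 1,
    "beta_lactamase_inhibitors;carbapenems;cephalosporins;fluoroquinolones;penicillin": 2,
    "beta_lactamase_inhibitors;carbapenems;cephalosporins;penicillin": 1,
    "beta_lactamase_inhibitors;carbapenems;fluoroquinolones;penicillin": 2,
    "beta_lactamase_inhibitors;macrolides;penicillin": 2,
    "beta_lactamase_inhibitors;penicillin": 6,
    "carbapenems": 4,
    "carbapenems;fluoroquinolones": 7,
    "carbapenems;fluoroquinolones;laxatives": 1,
    "carbapenems;laxatives": 1,
    "cephalosporins": 2,
    "fluoroquinolones": 2,
    "macrolides": 2,
}


def samples_per_class(counts):
    """Sum sample counts per class across all semicolon-delimited combinations."""
    per_class = Counter()
    for combo, n_samples in counts.items():
        for drug_class in combo.split(";"):
            per_class[drug_class] += n_samples
    return per_class


per_class = samples_per_class(family_counts)
print(per_class["cephalosporins"])   # 5 samples list cephalosporins
print(sum(family_counts.values()))   # 33 samples appear in the table at all
```

Only 33 of the 229 samples appear in the table, so most samples seem to have no antibiotics_family value even though all are coded as yes for antibiotics_current_use.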

The culprit here seems to be the incomplete metadata available on SRA, but the numbers on SRA don't quite match what's in cMD either: the SRA data, for instance, list 4 unique samples with cephalosporins exposure.

Where did the data for antibiotics_family come from?

I've already emailed the corresponding author of the original paper to see if she can provide some clarifications/a more comprehensive metadata table but haven't heard anything back yet. However, it would be helpful to have someone more familiar with the curation of these data look into this as well.

cmirzayi commented 1 year ago

I have heard back from the corresponding author and the corrected metadata and accompanying data dictionary are attached. data_dictionary_09062022.xlsx jgh_09062022.csv

paolinomanghi commented 1 year ago

Thanks Chloe Mirzayi, this is great! I will add it asap!

On Fri, Oct 21, 2022 at 12:15 AM Chloe Mirzayi wrote:

> I have heard back from the corresponding author and the corrected metadata and accompanying data dictionary are attached.
> data_dictionary_09062022.xlsx: https://github.com/waldronlab/curatedMetagenomicDataCuration/files/9834260/data_dictionary_09062022.xlsx
> jgh_09062022.csv: https://github.com/waldronlab/curatedMetagenomicDataCuration/files/9834261/jgh_09062022.csv

azenuser commented 1 year ago

This is solved, I'm closing.