Closed danicahelb closed 1 year ago
Also, in the 16S rRNA (V4) assay download files, the key identifier matches the sample ID key identifier in the mbio sample file, but the files do NOT contain a column for the ClinEpiDB sample ID.
If a user wants to analyze the 16S mbio data in the context of ClinEpi participant-level data, the only way for them to map the 16S data to the ClinEpi participant data is to:
Discussed with @dpbisme on 12/15/2022.
To do:
[x] MALED 2yr:
update PID and SID identifiers in mbio to match what is used on ClinEpi
remove "ClinEpiDB sample ID" variable
[x] MALED diarrhea:
update PID and SID identifiers in mbio to match what is used on ClinEpi
remove "ClinEpiDB sample ID" variable
[x] GEMS1:
update PID and SID identifiers in mbio to match what is used on ClinEpi
remove "ClinEpiDB sample ID" variable
[ ] MORDOR:
update PID and SID identifiers in mbio to match what is used on ClinEpi (identifiers may not match...)
remove "ClinEpiDB sample ID" variable
Confirmed fixed for GEMS, MAL-ED healthy, and MAL-ED diarrhea.
MORDOR does not use the same samples on ClinEpi and mbio.
We previously added linkages on the ClinEpi and mbio sites for MAL-ED samples with microbiome data (see https://github.com/VEuPathDB/EdaLoadingIssues/issues/44).
The idea here is that users can download data from both sites and merge them together on their own
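A minimal sketch of such a merge, assuming both downloads expose the participant and sample identifiers under the same column names and in the same format. The column names (`participant_id`, `sample_id`, `age_months`, `shannon_diversity`) and all values are hypothetical placeholders, not taken from the actual download files.

```python
import csv
import io

# Hypothetical download excerpts; real files use different columns and IDs.
CLINEPI_CSV = """participant_id,sample_id,age_months
P001,S001,12
P002,S002,18
"""

MBIO_CSV = """participant_id,sample_id,shannon_diversity
P001,S001,2.31
P003,S003,1.87
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_on_keys(left, right, keys=("participant_id", "sample_id")):
    """Inner-join two row lists on the shared key columns."""
    # Index the right-hand rows by their key tuple for constant-time lookup.
    index = {tuple(row[k] for k in keys): row for row in right}
    merged = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:
            # Combine columns from both rows; key columns are identical.
            merged.append({**row, **match})
    return merged


merged_rows = merge_on_keys(read_rows(CLINEPI_CSV), read_rows(MBIO_CSV))
for row in merged_rows:
    print(row)
```

The join only returns rows whose PID and SID are identical strings in both files, so any formatting difference between the two sites silently drops samples from the merged result.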
Currently, the key identifiers (i.e., PID and SID) differ between the two sites. This can cause confusion, as users will not understand why the samples are attached to different participant IDs.
In the future, I expect this will also cause issues when we implement cross-silo queries on the websites.
For example, here is what ClinEpi data from MAL-ED diarrhea samples look like:
And here is what mbio data from MAL-ED diarrhea samples look like:
When a study appears in both ClinEpi and mbio, we should ensure that the key identifiers are in the same format across both sites.
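One way to verify this for a given study is to compare the identifier sets from the two downloads and list any mbio IDs with no exact counterpart on ClinEpi. The sketch below is illustrative only: the helper name `unmatched_ids` and the example identifiers are hypothetical, and it assumes the IDs have already been read from each site's download file.

```python
def unmatched_ids(clinepi_ids, mbio_ids):
    """Return mbio IDs that have no exact string match among the ClinEpi IDs."""
    return sorted(set(mbio_ids) - set(clinepi_ids))


# Hypothetical identifiers; the second mbio ID uses a different format
# and the third refers to a sample that is absent from ClinEpi.
clinepi_sample_ids = ["S001", "S002", "S003"]
mbio_sample_ids = ["S001", "s-002", "S004"]

missing = unmatched_ids(clinepi_sample_ids, mbio_sample_ids)
print(missing)
```

An empty result means every mbio identifier appears verbatim on ClinEpi. A non-empty result does not by itself distinguish a formatting mismatch from a sample that genuinely exists on only one site (as with MORDOR), so the listed IDs still need manual review.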