Open ByroneCole-SageBionetworks opened 1 year ago
@ByroneCole-SageBionetworks , @kjflynn . Thanks for setting this up, do you know who I should reach out to to find out the new genomic data types that are being supported in INCLUDE?
Hi @thomasyu888, do you mean for V3 or generally? Probably for both, actually. Start with @lopierra.
@kjflynn For V3 and just generally. Thanks!
Did you mean new data types, or just new data? I don't think we have new genomic data types, just WGS and RNAseq as last time. We will have WGS from de Smith, Hakonarson, and HTP, and RNAseq from Hakonarson and HTP.
Thanks @lopierra . I meant new data types.
Had a discussion internally, and this is the summary:
I have some questions.
Hey Tom,
For your second question, these are three different immunological data types.
Flow cytometry identifies and counts fluorescently labeled cells (and can sort them). CyTOF, often called mass cytometry, is a higher-dimensional form of flow cytometry that uses heavy-metal labels instead of fluorescent ones to identify and count cells. Cytokine profiling measures secreted immune-signaling proteins.
I’ll let Pierrette answer the other two bullets.
On Sun, Feb 5, 2023 at 4:59 PM Thomas Yu @.***> wrote:
I have some questions.
- Is there a difference between R01 metabolomics and metabolomics? If so, what is it?
- What is the difference between Flow Cytometry, CyTOF and Cytokine profiles?
- Did we have a list of the "other sequencing?"
Thanks! I took what was in this spreadsheet and ran unique counts on the data_type column (excluding the cognitive and clinical data_type_short values) and got these counts.
Data Type | # cohorts |
---|---|
Other sequencing (targeted, GWAS, DNA methylation, etc.) | 6 |
Neuroimaging | 4 |
Metabolomics/R01 metabolomics | 4 |
Cytokine profiles | 3 |
Proteomics | 3 |
CyTOF | 2 |
EEG | 1 |
Head/neck MRI | 1 |
Flow Cytometry | 1 |
Pulse wave velocity | 1 |
Sleep - summary, saturation, PSG, etc | 1 |
Home & lab sleep apnea test (Nox A1) | 1 |
sleep - Actigraphy, PSG | 1 |
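For anyone who wants to regenerate these numbers, here is a minimal sketch of the counting step, assuming the spreadsheet tab is exported to CSV. The `data_type` and `data_type_short` column names come from the comment above; the `cohort` column name and the sample rows are made up for illustration and would need to match the real export.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows standing in for a CSV export of the spreadsheet tab.
# The "cohort" column name and every value below are assumptions.
SAMPLE = """cohort,data_type,data_type_short
CohortA,Proteomics,proteomics
CohortB,Proteomics,proteomics
CohortA,CyTOF,immune
CohortC,Cognitive testing,cognitive
CohortB,Clinical labs,clinical
"""

# data_type_short values left out of the count, as described above.
EXCLUDED_SHORT = {"cognitive", "clinical"}


def cohorts_per_data_type(rows, excluded=EXCLUDED_SHORT):
    """Count distinct cohorts per data_type, skipping excluded short types."""
    cohorts = defaultdict(set)
    for row in rows:
        if row["data_type_short"].strip().lower() in excluded:
            continue
        # A set keeps a cohort from being counted twice for one data type.
        cohorts[row["data_type"].strip()].add(row["cohort"].strip())
    counts = {data_type: len(names) for data_type, names in cohorts.items()}
    # Highest count first, then alphabetical for stable output.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for data_type, n in cohorts_per_data_type(rows):
    print(f"{data_type} | {n} |")
```

Counting distinct cohorts rather than rows matters if a cohort lists the same data type more than once.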
I'm thinking we could try to find Nextflow or CWL workflows for the data types that don't have defined workflows, which we could then run on the data as Cavatica applications. Some questions:
The Assays tab in that same spreadsheet is a little more granular on different types of sequencing, etc.
"R01 metabolomics" just means metabolomics data from a previous R01 grant.
I'm not sure we have enough metadata currently to start setting up workflows. Like there are numerous types of proteomics, and I'm not sure what each cohort has, and probably wouldn't ask for the details until they're actually ready to send data.
I'm also not sure if the plan is to import and harmonize all the actual data, or just make the files available. I guess it depends on the data type and whether multiple cohorts are doing comparable assays that could be analyzed together.
Thanks @lopierra - this is very helpful!
This ticket is specifically to determine data types we would want processed along with whether or not there are existing bioinformatics workflows. I'll take a look at the assays tab and regenerate some numbers.
thanks for doing that! I just added a couple more assays for the Aldinger cohort (we just talked last week and I haven't had a chance to update her info in the other tabs yet). She will have single-cell RNAseq and genotyping of fetal tissue.
Sorry for the long delay. I took a look at the Assays sheet, and it would be helpful to have a dictionary of assays that cohorts could choose from. That said, here are all the assays reported by more than one cohort (aside from RNAseq and WGS, which already have workflows):
Assay | Number of Cohorts |
---|---|
Neuroimaging (volumetric MRI, fMRI, fNIRS, DTI, DSI) | 6 |
Metabolomics/NMR Metabolomics/P4C mass spec metabolomics / R01 metabolomics | 5 |
cytokine / MSD cytokine | 3 |
SOMAscan proteomics / proteomics | 3 |
CyTOF | 2 |
amyloid-PET | 2 |
tau-PET | 2 |
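The grouping in this table (for example, "cytokine" and "MSD cytokine" counted together) is the kind of thing an assay dictionary would make repeatable. A rough sketch of that idea follows; the mapping, the canonical names, and the cohort entries are all hypothetical and not an agreed INCLUDE vocabulary.

```python
from collections import defaultdict

# Hypothetical controlled vocabulary: free-text label -> canonical assay name.
ASSAY_DICTIONARY = {
    "cytokine": "Cytokine profiling",
    "msd cytokine": "Cytokine profiling",
    "proteomics": "Proteomics",
    "somascan proteomics": "Proteomics",
    "cytof": "CyTOF",
    "rnaseq": "RNAseq",
    "wgs": "WGS",
}

# Assays that already have workflows and are left out of the table.
HAS_WORKFLOW = {"RNAseq", "WGS"}


def shared_assays(entries, min_cohorts=2):
    """Map labels to canonical assays and keep those with enough cohorts."""
    cohorts = defaultdict(set)
    for cohort, label in entries:
        canonical = ASSAY_DICTIONARY.get(label.strip().lower())
        # Unmapped labels are skipped here; a real pass should review them.
        if canonical is None or canonical in HAS_WORKFLOW:
            continue
        cohorts[canonical].add(cohort)
    return {a: len(c) for a, c in cohorts.items() if len(c) >= min_cohorts}


# Made-up (cohort, free-text assay label) pairs for illustration only.
entries = [
    ("CohortA", "MSD cytokine"),
    ("CohortB", "cytokine"),
    ("CohortA", "SOMAscan proteomics"),
    ("CohortB", "RNAseq"),
    ("CohortC", "RNAseq"),
    ("CohortC", "CyTOF"),
]
print(shared_assays(entries))
```

With a shared dictionary, cohorts would pick the canonical name up front and the lookup step would no longer be needed.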
Are these still true:
> I'm not sure we have enough metadata currently to start setting up workflows. Like there are numerous types of proteomics, and I'm not sure what each cohort has, and probably wouldn't ask for the details until they're actually ready to send data.
>
> I'm also not sure if the plan is to import and harmonize all the actual data, or just make the files available. I guess it depends on the data type and whether multiple cohorts are doing comparable assays that could be analyzed together.
We have not gotten any more assay data since the Oct 2022 release. However, Korenberg is getting ready to send us data - they have RNAseq, methylation, MRI imaging, cognitive tests, and lab data. I still don't know about harmonization vs. making files available - we should bring this up at Data Implementers at some point. An additional complication is that ABC-DS will not allow their assay data to be displayed in the portal, so I'm not sure we should even count those in the number of cohorts.
Ah, I see. Thanks for the update. I will bring this up at the Data Implementers meeting soon!
Edit: Since we aren't so sure about the metadata available, this might be more helpful:
Internal JIRA tickets: