PankratzLab / GenScorePipeline

Genetic Score Pipeline, Merge Extract Pipeline

Create a dummy data set to be used as an example #6

Open kbeutel opened 1 year ago

kbeutel commented 1 year ago
  1. Include our published MDS/AML .meta files in the example data set
  2. Run the 1000G Illumina Omni 2.5 data set through Genvisis to create imputation-ready VCFs
  3. Impute to the TOPMed reference panel
  4. Trim down the resulting VCFs to just those variants used in the example .meta files
  5. Generate a dummy phenotype or two:
    • @jlanej Was LTL TelSeq data computed for the 1000G samples in TOPMed?
    • We could simulate something pretty easily in Excel
  6. Include a script to run this, and probably add it as a formal integration test
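
Steps 4 and 5 above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the pipeline's actual code: it assumes the variants in the .meta files can be matched against the VCF ID column (the third tab-delimited field), and it draws the dummy phenotype from a seeded normal distribution rather than simulating it in Excel. The function names `filter_vcf_lines` and `simulate_phenotype` are hypothetical.

```python
import random


def filter_vcf_lines(vcf_lines, keep_ids):
    """Yield VCF header lines plus records whose ID column is in keep_ids.

    Assumption: variants are matched on the VCF ID column (third field).
    """
    for line in vcf_lines:
        if line.startswith("#"):
            # Keep all meta-information and column header lines.
            yield line
            continue
        fields = line.split("\t", 3)
        if len(fields) > 2 and fields[2] in keep_ids:
            yield line


def simulate_phenotype(sample_ids, mean=0.0, sd=1.0, seed=42):
    """Return {sample_id: value} with values drawn from a normal distribution.

    A fixed seed keeps the dummy phenotype reproducible between runs,
    which matters if the example doubles as an integration test.
    """
    rng = random.Random(seed)
    return {sid: rng.gauss(mean, sd) for sid in sample_ids}
```

A fixed seed means the example data set yields the same dummy phenotype on every run, so an integration test built on it can compare against stored expected output.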

jlanej commented 1 year ago

@jlanej Was LTL TelSeq data computed for the 1000G samples in TOPMed?

I don't believe it was, but we could compute it either in TOPMed or, since this is an example dataset, from the publicly available 1000G WGS data: https://www.internationalgenome.org/data-portal/sample