Open jmcmurry opened 7 years ago
@kshefchek I think you are assembling datasets (right?). Here is a link for some that might be relevant to cancers that develop after people are treated for an initial cancer. Some of the competency questions will need these types of datasets. https://dceg.cancer.gov/research/what-we-study/second-cancers#Treatment
@mellybelly check this out. I would like to contact her and see if we can explore collaboration. https://dceg.cancer.gov/about/staff-directory/biographies/K-N/morton-lindsay
@jmcmurry @mellybelly @pnrobinson @kshefchek The new gnomAD dataset was posted today. Peter and I will be working on a letter to the PI about collaboration. https://macarthurlab.org/2017/02/27/the-genome-aggregation-database-gnomad/
Datasets useful for basic science CQs? [@mbrush here are some things I'm working on, FYI]
Reactome http://www.reactome.org
dbGaP (germline cancer) https://www.ncbi.nlm.nih.gov/gap
[ ] BioBank (MEH: Nancy Cox at Vanderbilt is a contact)
[ ] TGEN? (MEH contact Sampath Rangasamy from Keystone Rare Disease meeting)
[ ] HNSCC dataset at NIH (MEH will contact; ask for the friend of contact Sampath Rangasamy from Keystone Rare Disease meeting)
[ ] Nazneen.Rahman@icr.ac.uk: MEH contact for a dataset with PTEN mutations. What is the relationship between PTEN and FA? Cowden syndrome. Heard a talk by colleague Katrina Tatton-Brown at GRD17 on "overgrowth genes". Note that PTEN has mutations that prevent nuclear import, and there is a connection with FA; see e.g. https://www.ncbi.nlm.nih.gov/pubmed/27819275
FA
[x] Blanche Alter. Out of country till March 21. Reminded March 27, reconnected April 5, date to reconnect when I get back. Need to have a plan for what data is needed and in what format.
[x] Steve Myen, SickKids: agreed to share at Keystone Rare Disease Meeting. Reminded April 6, cc to Tricia. No response to multiple asks.
[ ] Hans Joenje. Will meet and discuss in person April 11. Summary: Lab has a massive FA dataset. On paper, in Dutch. The computer program had a bug, and the primary and backup copies were lost. Needs to be redone. High priority.
[x] Takata emailed March 27. He responded. MEH needs to follow up.
[x] http://www2.rockefeller.edu/fanconi/ Barry Coller [collerb@mail.rockefeller.edu]. contacted by @mellybelly and followed up 3/27. Phone conf w @mellybelly and Rockefeller group indicated (1) ontology needs to start over from scratch (2) Agata agrees that curation funding is needed.
[ ] Jay Shendure has a dataset in yeast with all possible variants tested in certain key domains of BRCA1 and BRCA2. MEH invited him for a seminar at OHSU and wrote suggesting we explore a collaboration. http://krishna.gs.washington.edu/contact.html
Aldehyde
BMF Registries to check
General normal, better than normal, or unknown set
[x] Wellderly dataset https://genomics.scripps.edu/browser/
[ ] Utah dataset
[ ] Regeneron? Seemed interested. Regeneron and GSK have a new collaboration to sequence 500K people. Will need Translator.
[ ] TGEN? (contact Sampath Rangasamy from Keystone Rare Disease meeting)
[ ] Grand Opportunity datasets. https://esp.gs.washington.edu/drupal/
Cancer
Here I will be listing interesting datasets, and ideas for them, from the Wellcome Genomics of Rare Disease Conference. Datasets from certain countries may prioritize greater good over privacy. @jmcmurry @mellybelly @mbrush @kshefchek @pnrobinson
Danish newborn screening biobank, blood spots. Every Danish baby since 1982 (contact Benjamin Neale, Broad). https://www.ncbi.nlm.nih.gov/pubmed/17632694
MIGen
METSIM
FINRISK
T2D-GENES/GoT2D/SIGMA
IBD consortium (Sek K's dataset, which I didn't catch)
GTEx, the Genotype-Tissue Expression project (contact Beryl Cummings, Broad. Note that Cummings et al. have an isoform-level correction for polyA tail bias). Transcriptome.
DDD dataset (developmental disorders), 8000 patients (contact Matthew Hurles, Wellcome Trust Sanger).
ENIGMA consortium, 13,171 people (population and case-control neuroimaging genetics data). Also CHARGE consortium (12,000 people). Some GWAS? Can we use GWAS data, or do we need WGS or exome only?
Nijmegen (contact Hans Brunner).
These are already on our list, right?: Reactome, ClinVar, ClinGen, 1000 Genomes, dbSNP, HGMD Public, LOVD, UniProt, Database of Genomic Variants (DGV), DECIPHER, OMIM, EVS and ExAC.
PanelApp and the 100,000 Genomes Project https://panelapp.extge.co.uk
https://broadinstitute.org.cmap Paul's work on Grey Team. Cancer cells, but use them as little bags of cellular processes. (Steps: create a synthetic path for FA. 1. Map the FA gene network. 2. See which genes in the network are upregulated by drugs. 3. See which are downregulated. 4. Ensure this works in a normal cell. Could lead to repurposing drugs.)
Hi-C/5C
COSMIC
ESP
DIDA http://dida.ibsquare.be
(poster 44) Daniel Greene, MRC Cambridge. Dataset of 5815 patients, WGS, diverse rare diseases. Statistical tool not published yet.
DDD8K: 7833 trios in Deciphering Developmental Disorders, includes about 100 South Asians. Hilary Martin at Wellcome Trust Sanger is a contact. https://www.ddduk.org
https://www.humancellatlas.org This might have info on gene expression in cells of interest, in particular hematopoietic cells.
Also collecting cells from pregnancy terminations to have an atlas during development. http://www.hdbr.org
New NIH clinical trial on the effects of alcohol. Contact and ask about the data they plan to collect. Could this be useful for us, relevant to FA-alcohol use and outcomes? https://www.nytimes.com/2017/07/03/well/eat/alcohol-national-institutes-of-health-clinical-trial.html?mcubz=0&_r=0
Shannon Mc said she has an HNSCC dataset. However, she said: "if emphasis is LOH, I dont think our data would be helpful as that was not our emphasis". We need to discuss options, perhaps with @pnrobinson
New release of UK Biobank http://www.ukbiobank.ac.uk/about-biobank-uk/
Variant Validator https://variantvalidator.org/
MutationTaster
BeviMed: rare variant association inference, bonding and non-coding loci; improvement vs existing methods (SKAT, ADA, CAST)
Opentargets: GSK collab w EMI (Andrew Nightingale. Has ingested quite a bit, then gave up because it was too complex).
neXtProt: human expression (does this one show RNA and protein expression by tissue?).
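For discussion, here is a rough sketch of steps 1 to 3 of the CMap repurposing idea noted above (map the FA gene network, then see which network genes each drug pushes up or down). Everything here is an assumption for illustration: the function names are hypothetical, the drug names and fold-change values are toy placeholders and not real CMap data, and the threshold is arbitrary. It does not use any CMap API, and step 4 (checking the effect in a normal cell) is not covered.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical signature format: drug -> {gene symbol: log2 fold change}.
Signature = Dict[str, float]


def split_by_direction(
    network: Set[str], signature: Signature, threshold: float = 1.0
) -> Tuple[List[str], List[str]]:
    """Return (upregulated, downregulated) network genes for one drug."""
    up = sorted(g for g in network if signature.get(g, 0.0) >= threshold)
    down = sorted(g for g in network if signature.get(g, 0.0) <= -threshold)
    return up, down


def rank_drugs(
    network: Set[str], signatures: Dict[str, Signature], threshold: float = 1.0
) -> List[Tuple[str, List[str], List[str]]]:
    """Rank drugs by how many network genes they upregulate (ties by name)."""
    rows = []
    for drug, signature in signatures.items():
        up, down = split_by_direction(network, signature, threshold)
        rows.append((drug, up, down))
    rows.sort(key=lambda row: (-len(row[1]), row[0]))
    return rows


# Toy inputs only: a three-gene stand-in for the FA network and made-up values.
fa_network = {"FANCA", "FANCD2", "BRCA2"}
toy_signatures = {
    "drug_A": {"FANCA": 1.5, "FANCD2": -2.0, "TP53": 3.0},
    "drug_B": {"FANCA": 1.2, "BRCA2": 1.1},
}

ranking = rank_drugs(fa_network, toy_signatures)
for drug, up, down in ranking:
    print(drug, "up:", up, "down:", down)
```

Genes outside the network (TP53 in the toy input) are ignored, so the ranking only reflects the mapped FA network. Whether to rank by upregulated or downregulated genes would depend on the CQ being asked.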
CQ matrix is here
List of databases is here; please prioritize and add.