Data Commons datasets code to subsample based off the Data Commons project.
For each dataset:
Scarpa et al and also Sean Grimmond from cBioPortal
https://www.cbioportal.org/study/summary?id=panet_arcnet_2017
This is a WGS with 98 samples/patients. Accession EGAS00001001732 is here: - https://ega-archive.org/studies/EGAS00001001732?order=file_types&sort=asc - Please use the ICGC website for applying for access to the data: https://ega-archive.org/dacs/EGAC00001000010
It also has the non-identifiable clinical data available too.
Need to subsample so that all 98 patients/samples take up about 500MB for all FASTQ, CRAM, and vcf files.
Run this in public_wgs folder: python create_placeholder_files.py panet_arcnet_2017_clinical_data.tsv
https://www.cbioportal.org/study/summary?id=panet_jhu_2011
Jiao et al from cBioPortal
Ludwig Center for Cancer Genetics and Howard Hughes Medical Institutions, Johns Hopkins Kimmel Cancer Center, Baltimore, MD 21231, USA.
This has 10 samples.
Need to subsample so that all 10 patients/samples take up about 50MB for all FASTQ, CRAM, and vcf files.
Run this in private_wgs folder: python ../public_wgs/create_placeholder_files.py panet_jhu_2011_clinical_data.tsv
Need to subsample the dataset so that it takes up about 50MB for all FASTQ, CRAM, and count files.
Need to subsample the dataset so that it takes up about 50MB for all raw, processed, and summary files.
Need to subsample the dataset so that it takes up about 500MB for all raw, processed, and summary files.
Need to subsample the dataset so that it takes up about 500MB for all raw, processed, and summary files.
Integrating spatial gene expression and breast tumour morphology via deep learning
https://www.nature.com/articles/s41551-020-0578-x
https://aquila.cheunglab.org/view/9ae7dd552df4fc33c918695e9e9f49cc
Follow information from this tutorial - https://pachterlab.github.io/voyager/articles/visium_10x.html
https://github.com/bryanhe/ST-Net
https://data.mendeley.com/datasets/29ntw7sh4r/5
Triplicates from 23 breast cancer patients with luminal a, luminal b, triple-negative, and HER2-positive subtypes. Please refer to metadata.csv for patient, replicate, and subtype associations.
Using space ranger - files needed for space ranger is explained here - https://www.10xgenomics.com/support/software/space-ranger/analysis/inputs/input-overview
Can use this dataset from 10x Genomics - but you do need to give them your email address: https://www.10xgenomics.com/datasets/pancreatic-cancer-with-xenium-human-multi-tissue-and-cancer-panel-1-standard