Open yafeng opened 6 years ago
http://hgdownload.cse.ucsc.edu/goldenPath/hg38/phastCons100way/hg38.phastCons100way.bw
https://data.broadinstitute.org/compbio1/PhyloCSFtracks/hg38/latest/PhyloCSF+1.bw
https://data.broadinstitute.org/compbio1/PhyloCSFtracks/hg38/latest/PhyloCSF+2.bw
https://data.broadinstitute.org/compbio1/PhyloCSFtracks/hg38/latest/PhyloCSF+3.bw
https://data.broadinstitute.org/compbio1/PhyloCSFtracks/hg38/latest/PhyloCSF-1.bw
https://data.broadinstitute.org/compbio1/PhyloCSFtracks/hg38/latest/PhyloCSF-2.bw
https://data.broadinstitute.org/compbio1/PhyloCSFtracks/hg38/latest/PhyloCSF-3.bw
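The six PhyloCSF tracks differ only in strand (+ or -) and frame (1 to 3), so the URL list can be generated rather than typed out. A minimal sketch, assuming the base URL quoted above is still valid; the function name `phylocsf_urls` is made up for illustration:

```shell
# Print the six PhyloCSF bigWig URLs (strands + and -, frames 1 to 3).
# The base URL is the one quoted in this thread; the function name is hypothetical.
phylocsf_urls() (
    base="https://data.broadinstitute.org/compbio1/PhyloCSFtracks/hg38/latest"
    for strand in + -; do
        for frame in 1 2 3; do
            printf '%s/PhyloCSF%s%s.bw\n' "$base" "$strand" "$frame"
        done
    done
)
```

Piping the output into `wget -i -` should then fetch all six files, since `-i -` makes wget read its URL list from standard input.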
Get the COSMIC database
sftp 'your_email_address@example.com'@sftp-cancer.sanger.ac.uk
Download the data
sftp> get cosmic/grch38/cosmic/v85/CosmicMutantExport.tsv.gz
sftp> exit
wget hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.chromFaMasked.tar.gz
tar -xzf hg38.chromFaMasked.tar.gz
for chr in {1..22} X Y M; do cat chr$chr.fa.masked >> hg38.chr1-22.X.Y.M.fa.masked; done
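Because the loop above appends with `>>`, running it twice would write every chromosome into the output a second time, and a missing chromosome file would only produce an error message partway through. A more defensive sketch of the same concatenation, assuming the extracted files are named `chrN.fa.masked` as in the loop above; the function name and its two arguments are hypothetical:

```shell
# Concatenate per-chromosome masked FASTA files in the fixed order
# chr1..chr22, chrX, chrY, chrM.
# Hypothetical helper: $1 = directory holding chr*.fa.masked, $2 = output file.
concat_masked_chroms() (
    dir=$1
    out=$2
    chroms="$(seq 1 22) X Y M"
    # Check every input first so a partial output file is never produced.
    for chr in $chroms; do
        if [ ! -f "$dir/chr$chr.fa.masked" ]; then
            echo "missing: $dir/chr$chr.fa.masked" >&2
            exit 1
        fi
    done
    : > "$out"   # truncate, so a rerun does not append a second copy
    for chr in $chroms; do
        cat "$dir/chr$chr.fa.masked" >> "$out"
    done
)
```

For example, `concat_masked_chroms . hg38.chr1-22.X.Y.M.fa.masked` would reproduce the output file of the loop above from files in the current directory.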
The VarDB2.0 database can be downloaded from:
wget http://lehtiolab.se/Supplementary_Files/VarDB2.zip
Add a command-line option, --hg19 or --hg38, so that the workflow can be run under different genome assemblies. The following processes need to be modified accordingly:
BLATnovel, phastcons, phyloCSF, annovar
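As a rough illustration of the proposed switch (not how ipaw implements it), a wrapper could read the flag once and let each affected process look up its assembly-specific resource. Both function names below are hypothetical, the default of hg19 is an assumption based on the current pipeline being hg19-based, and only the hg38 phastCons file name comes from this thread; the hg19 entry is a placeholder to fill in:

```shell
# Pick the genome assembly from the command line; other arguments are ignored.
# Defaults to hg19 (assumption: the current pipeline is hg19-based).
# If both flags are given, the last one wins.
parse_genome() (
    genome=hg19
    for arg in "$@"; do
        case "$arg" in
            --hg19) genome=hg19 ;;
            --hg38) genome=hg38 ;;
        esac
    done
    echo "$genome"
)

# Map an assembly to its phastCons track file name.
# Only the hg38 name appears in this thread; the hg19 entry is a placeholder.
phastcons_file() (
    case "$1" in
        hg38) echo "hg38.phastCons100way.bw" ;;
        hg19) echo "PATH_TO_HG19_PHASTCONS_BW" ;;
        *) echo "unknown assembly: $1" >&2; exit 1 ;;
    esac
)
```

The same lookup pattern would apply to the BLAT, PhyloCSF, and annovar resources: one table per process, keyed by the assembly name.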
I made a copy of ipaw for the hg38 genome. https://github.com/yafeng/proteogenomics-analysis-workflow/commit/839f18545053145ffca8b36723811f691beb6578
Could you provide a copy of ipaw for the hg38 genome? Or is the latest version already for hg38?
Hi Yafeng,
Could you please also upload varDB2.0 somewhere else? I could not download it using the link you suggested above.
Thx.
@TnakaNY try this link for VarDB2.0 https://drive.google.com/open?id=1G20qIF60xdJ5zrSbt8a8sd0RKutYxQMC
You need to use the ipaw hg38 version, which I uploaded to my GitHub repo. You also need to use conda to set up a local environment so that all executables can be found. It takes some effort to set up. Otherwise, I suggest you continue to use hg19, which is better maintained.
https://github.com/yafeng/proteogenomics-analysis-workflow/blob/master/ipaw.local.hg38.nf
Thank you, let me try!
The current IPAW pipeline uses hg19-based databases, and the reported coordinates of novel peptides and SAAV peptides are hg19 genomic coordinates. The goal is to make IPAW compatible with the latest hg38 genome assembly.