Illumina / canvas

Canvas - Copy number variant (CNV) calling from DNA sequencing data

Unable to download reference #126

Closed by nicokrez 5 years ago

nicokrez commented 5 years ago

Hi,

The link http://canvas-cnv-public.s3.amazonaws.com/ leads to an XML listing; no references can be downloaded from there. I therefore constructed my own reference from hg38, but it is not clear to me how to build the XML. How do I generate the md5 value? And why is there a difference between "knownBases" and "totalBases"?

Thank you for your help and kind regards, Nicolas

eroller commented 5 years ago

You can download the files directly by appending the file path to the bucket URL, like this:

wget http://canvas-cnv-public.s3.amazonaws.com/hg19/WholeGenomeFasta/GenomeSize.xml

md5 is a hash of the sequence string. It can be used for data integrity checks, but Canvas does not use it, so you can safely put a dummy value there. knownBases counts the non-N bases, which is why it can differ from totalBases. Canvas does not use that metadata either, so you can safely put a dummy value there as well.
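If you would rather fill in real values than dummies, the three numbers can be computed from the FASTA with a few lines of Python. This is only a minimal sketch: the function name `fasta_stats` is made up for illustration, and hashing the uppercased sequence with line breaks removed is an assumption, since the thread does not say which exact string the original md5 values were computed from.

```python
import hashlib
import io


def fasta_stats(handle):
    """Yield (name, total_bases, known_bases, md5_hex) per FASTA record.

    Assumption: the md5 is taken over the uppercased sequence without
    line breaks. The original files may have used a different convention.
    """
    name, total, known, digest = None, 0, 0, None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, total, known, digest.hexdigest()
            parts = line[1:].split()
            name = parts[0] if parts else ""
            total, known, digest = 0, 0, hashlib.md5()
        elif name is not None:
            seq = line.upper()
            total += len(seq)                   # every base, including N
            known += len(seq) - seq.count("N")  # non-N bases only
            digest.update(seq.encode("ascii"))
    if name is not None:
        yield name, total, known, digest.hexdigest()


if __name__ == "__main__":
    # Tiny in-memory example; pass open("genome.fa") for a real file.
    demo = io.StringIO(">chrDemo example\nACGTNN\nnnAC\n")
    for name, total, known, md5_hex in fasta_stats(demo):
        print(name, total, known, md5_hex)
```

For the demo record this gives 10 total bases and 6 known bases, because four of the ten bases are N. The hg19 GenomeSize.xml from the wget command above can serve as a template for where these values go.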