cancerit / BRASS

Breakpoints via assembly - Identifies breaks and attempts to assemble rearrangements in whole genome sequencing data.
GNU Affero General Public License v3.0

Centromere/telomere file doesn't agree with example #69

Closed by edawson 6 years ago

edawson commented 6 years ago

I've written a gist that generates the centromere/telomere file for BRASS, but the data I get from UCSC doesn't match the example file provided. The only difference I can see is the genome build: hg19 vs GRCh37.

My code is here: https://gist.github.com/edawson/83381e49ac8e5140d9f5519c91eccf24

Happy to fix and donate the script to the repo if it'd be useful.

keiranmraine commented 6 years ago

@edawson thanks for the gist, we'd happily accept a PR to pull it into the code base.

The documentation does indicate that the format isn't exactly the same, but a script that covers the bulk of cases would be a good idea.

FYI you can get a pre-constructed set of reference files from here:

ftp://ftp.sanger.ac.uk/pub/cancer/dockstore/human/CNV_SV_ref_GRCh37d5_brass6+.tar.gz

(note these do NOT have a chr prefix)
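
For anyone comparing UCSC-derived output against these files, here is a minimal sketch of stripping the `chr` prefix. It assumes a tab-separated file with the contig name in the first column; that layout and the helper names (`strip_chr_prefix`, `normalize_lines`) are illustrative assumptions, not taken from BRASS or the gist.

```python
def strip_chr_prefix(name: str) -> str:
    """Return the contig name without a leading 'chr' (e.g. 'chr1' -> '1')."""
    return name[3:] if name.startswith("chr") else name


def normalize_lines(lines):
    """Yield tab-separated lines with 'chr' removed from the first column.

    Assumption: column 1 holds the contig name. Blank lines and comment
    lines (starting with '#') are passed through unchanged.
    """
    for line in lines:
        if not line.strip() or line.startswith("#"):
            yield line
            continue
        first, sep, rest = line.partition("\t")
        yield strip_chr_prefix(first) + sep + rest
```

This only renames contigs. It does not account for any other difference between hg19 and GRCh37, so the coordinates should still be checked against the reference files.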

edawson commented 6 years ago

Sure! I've created PR https://github.com/cancerit/BRASS/pull/70 for this. Thank you for the reference files; I'll test on those as well.