bw2 opened this issue 6 months ago (status: Open)
Thank you for bringing this up! We will add the loci you mentioned and start working on evaluating / improving reference coordinates of known pathogenic repeats.
I've been using the STRchive loci for this. We automate generation of these when the database gets updated. https://github.com/dashnowlab/STRchive/blob/main/data/hg38.STRchive-disease-loci.TRGT.bed
Comparing https://github.com/PacificBiosciences/trgt/blob/main/repeats/pathogenic_repeats.hg38.bed to https://github.com/broadinstitute/str-analysis/blob/main/str_analysis/variant_catalogs/variant_catalog_without_offtargets.GRCh38.json, I see some differences in the start and end coordinates.
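For anyone who wants to reproduce the comparison, here is a rough sketch of how it could be scripted. It assumes the BED file carries `ID=...` in its fourth column and that the catalog is a JSON list with `LocusId` and `ReferenceRegion` (`chrom:start-end`) fields. It also assumes both files use the same coordinate convention, which is exactly the thing in question here. The helper names and the two sample loci (`LOCUS_A`, `LOCUS_B`) are made-up placeholders, not real entries from either file.

```python
import json


def parse_trgt_bed(text):
    """Map locus ID -> (chrom, start, end) from TRGT-style BED text."""
    loci = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        chrom, start, end, info = line.split("\t")[:4]
        # The 4th column is assumed to look like "ID=...;MOTIFS=...;STRUC=..."
        fields = dict(kv.split("=", 1) for kv in info.split(";") if "=" in kv)
        loci[fields["ID"]] = (chrom, int(start), int(end))
    return loci


def parse_catalog(text):
    """Map LocusId -> (chrom, start, end) from a variant catalog JSON string."""
    loci = {}
    for record in json.loads(text):
        region = record["ReferenceRegion"]
        if isinstance(region, list):
            continue  # this sketch skips loci that list several regions
        chrom, span = region.rsplit(":", 1)
        start, end = span.split("-")
        loci[record["LocusId"]] = (chrom, int(start), int(end))
    return loci


def coordinate_differences(bed_loci, catalog_loci):
    """Return {locus_id: (bed_coords, catalog_coords)} for shared IDs that differ."""
    shared = sorted(bed_loci.keys() & catalog_loci.keys())
    return {
        locus_id: (bed_loci[locus_id], catalog_loci[locus_id])
        for locus_id in shared
        if bed_loci[locus_id] != catalog_loci[locus_id]
    }


# Placeholder data for illustration only; these are not real loci or coordinates.
BED_TEXT = (
    "chr1\t100\t130\tID=LOCUS_A;MOTIFS=CAG;STRUC=(CAG)n\n"
    "chr2\t200\t260\tID=LOCUS_B;MOTIFS=GCC;STRUC=(GCC)n\n"
)
CATALOG_TEXT = json.dumps([
    {"LocusId": "LOCUS_A", "ReferenceRegion": "chr1:100-130"},
    {"LocusId": "LOCUS_B", "ReferenceRegion": "chr2:203-260"},
])

diffs = coordinate_differences(parse_trgt_bed(BED_TEXT), parse_catalog(CATALOG_TEXT))
for locus_id, (bed_coords, catalog_coords) in diffs.items():
    print(locus_id, "BED:", bed_coords, "catalog:", catalog_coords)
```

To run it against the real files, read each one from disk and pass its contents to the two parsers. Matching is by locus ID only, so loci that are named differently in the two files would be missed.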
Assuming TRGT input format is 0-based for the start coordinate, would it make sense to change the coordinates in pathogenic_repeats.hg38.bed as follows?
Also, it might be worth adding these loci: