quinlan-lab / STRling

Detect novel (and reference) STR expansions from short-read data
MIT License

Merge tid problems #78

Closed hdashnow closed 3 years ago

hdashnow commented 3 years ago

To Do

Some hg38 builds include extra contigs (e.g. phix) where others do not, which means bin files can differ by that single tid. Here, we try this:

  1. The user specifies a fasta file indicating the minimal genome (without phix). If none is passed to merge (via --fasta), the first bin file determines the expected genome content.
  2. If a bin file contains a chromosome not found in the minimal genome (e.g. it has phix), any read with that chrom is mapped to tid=-1 (unmapped).
  3. In addition, all tids are remapped to the ordering of the expected target genome.

All of this takes place in unpack_file.
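
The three steps above can be illustrated with a minimal Python sketch. This is not the code in unpack_file; the function names (`choose_target`, `build_tid_remap`, `remap_tid`) and the representation of a genome as a plain list of chromosome names are assumptions made purely for illustration.

```python
from typing import List, Optional

UNMAPPED_TID = -1  # tid used for reads on chromosomes absent from the target genome


def choose_target(fasta_chroms: Optional[List[str]],
                  bin_chroms: List[List[str]]) -> List[str]:
    """Step 1: use the fasta chromosome list if given, else the first bin file's."""
    if fasta_chroms is not None:
        return fasta_chroms
    return bin_chroms[0]


def build_tid_remap(file_chroms: List[str], target_chroms: List[str]) -> List[int]:
    """Steps 2 and 3: map each tid in a bin file to its tid in the target ordering.

    Chromosomes missing from the target (e.g. phix) map to UNMAPPED_TID.
    """
    target_index = {name: i for i, name in enumerate(target_chroms)}
    return [target_index.get(name, UNMAPPED_TID) for name in file_chroms]


def remap_tid(tid: int, remap: List[int]) -> int:
    """Translate one read's tid; out-of-range or negative tids stay unmapped."""
    if tid < 0 or tid >= len(remap):
        return UNMAPPED_TID
    return remap[tid]


# Hypothetical example: one bin file was built against a genome that includes phix.
minimal = ["chr1", "chr2", "chrX"]
with_phix = ["chr1", "phix", "chr2", "chrX"]

target = choose_target(minimal, [with_phix])
remap = build_tid_remap(with_phix, target)
```

In this example, `remap` is `[0, -1, 1, 2]`: reads on phix become unmapped, and the tids of chr2 and chrX shift down to match the target ordering.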