Open weizhu365 opened 9 months ago
Yes, thanks for pointing this out. Since mutsize is already the number of repeat units (not the number of bp) of the mutation, this filter does not work correctly. I will comment out that option and its code for now, since getting the option to work would require knowing the total bp size of the mutation.
Dear MonSTR developers,
By definition, mutsize is "The size of the mutation (number of repeat units)." However, in https://github.com/gymreklab/STRDenovoTools/blob/master/scripts/qc_denovos.py,
mutations["unit"] = mutations["mutsize"]%mutations["period"] == 0
This excludes de novo STRs whose mutation size is not a multiple of the period. The check therefore seems to treat mutsize as the length of the indel in bp rather than the number of repeat units, which conflicts with the definition of mutsize.
In the actual MonSTR output, mutsize follows its definition. Running qc_denovos.py with --filter-step-size will therefore wrongly remove many dnSTRs whose mutations are whole repeat units of the STR motif.
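To make the effect concrete, here is a minimal sketch in plain Python rather than the pandas code used in qc_denovos.py. The records, the locus labels, and both helper functions are hypothetical and exist only for illustration; the sketch assumes mutsize is an integer number of repeat units and period is the motif length in bp.

```python
# Hypothetical records: mutsize is in repeat units, period is the motif length in bp.
mutations = [
    {"locus": "A", "period": 2, "mutsize": 1},   # +1 unit of a dinucleotide
    {"locus": "B", "period": 4, "mutsize": -1},  # -1 unit of a tetranucleotide
    {"locus": "C", "period": 3, "mutsize": 3},   # +3 units of a trinucleotide
    {"locus": "D", "period": 4, "mutsize": 2},   # +2 units of a tetranucleotide
]


def passes_modulo_on_units(m):
    """Same arithmetic as the line quoted from qc_denovos.py, applied to units."""
    return m["mutsize"] % m["period"] == 0


def passes_modulo_on_bp(m):
    """Same check after converting units to bp (assumed conversion: units * period)."""
    size_bp = m["mutsize"] * m["period"]
    return size_bp % m["period"] == 0


kept_units = [m["locus"] for m in mutations if passes_modulo_on_units(m)]
kept_bp = [m["locus"] for m in mutations if passes_modulo_on_bp(m)]

print(kept_units)  # ['C']
print(kept_bp)     # ['A', 'B', 'C', 'D']
```

All four mutations are whole-unit changes, yet the modulo applied to repeat units keeps only locus C. Converting units back to bp makes the check trivially true, so a meaningful step-size filter would need the mutation size measured directly in bp.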
I think this is a bug in qc_denovos.py. Please correct me if I misunderstood something.
Thanks,
Wei Zhu