HKU-BAL / ClairS

ClairS - a deep-learning method for long-read somatic small variant calling
BSD 3-Clause "New" or "Revised" License

Question in training data label generation code - get_candidates.py #21

Status: Closed (quito418 closed this issue 6 months ago)

quito418 commented 6 months ago

Dear ClairS Team,

While reviewing the code at the following link: https://github.com/HKU-BAL/ClairS/blob/c464c98c61f594489e385b8bb67c32518dce59f8/src/get_candidates.py#L348 I noticed a potential issue with how the find_candidate_match function is used. It appears to use the tumor alt_dict where the normal alt_dict is intended. Could you please take a moment to verify this?

Thank you for your time and support!

Best regards,

zhengzhenxian commented 6 months ago

Hi, @quito418,

Many thanks for reporting this! It should be the normal alt_dict (paired_alt_dict). I expect the bug only excluded a small proportion of candidates during training. We will add those candidates back to training to verify the results.

Zhenxian
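
For readers following the thread, here is a toy sketch of why the choice of dictionary matters. It is not the ClairS code: `alt_support`, `differing_candidates`, the `AltDict` layout (position -> alt allele -> read count), and the sample data are all hypothetical, and the real `find_candidate_match` may take different arguments and apply different rules. The sketch only shows that looking up a candidate in the tumor dictionary can give a different answer than looking it up in the normal (paired) dictionary.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: position -> {alt allele: number of supporting reads}.
AltDict = Dict[int, Dict[str, int]]


def alt_support(alt_dict: AltDict, pos: int, alt_base: str) -> int:
    """Return the read count supporting alt_base at pos (0 if absent)."""
    return alt_dict.get(pos, {}).get(alt_base, 0)


def differing_candidates(
    candidates: List[Tuple[int, str]],
    tumor_alt_dict: AltDict,
    normal_alt_dict: AltDict,
) -> List[Tuple[int, str]]:
    """List candidates whose lookup result depends on which dict is used."""
    changed = []
    for pos, alt_base in candidates:
        seen_in_tumor = alt_support(tumor_alt_dict, pos, alt_base) > 0
        seen_in_normal = alt_support(normal_alt_dict, pos, alt_base) > 0
        if seen_in_tumor != seen_in_normal:
            changed.append((pos, alt_base))
    return changed


# Made-up example data, for illustration only.
tumor_alt_dict = {100: {"T": 8}, 200: {"G": 5}}
normal_alt_dict = {200: {"G": 4}}
candidates = [(100, "T"), (200, "G")]

changed = differing_candidates(candidates, tumor_alt_dict, normal_alt_dict)
print(changed)  # [(100, 'T')]
```

In this toy data, only the candidate at position 100 is affected by the mix-up, because the allele is present in the tumor reads and absent from the normal reads. Candidates seen in both samples give the same answer either way.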

quito418 commented 6 months ago

Thanks, @Zhenxian! Looking forward to the release of ClairS.