SBIMB / StellarPGx

Calling star alleles in highly polymorphic pharmacogenes (e.g. CYP450 genes) by leveraging genome graph-based variant detection.
MIT License

Copy number calculation #20

Closed TonyLupara closed 2 years ago

TonyLupara commented 2 years ago

Hi David, I am investigating your program and trying to understand its core algorithms.

Could you please explain why you have an additional "av_5pr_in1" region in "scripts/cyp2d6/b37/bin/sv_modules.py" for b37/hg19 genomes? And why do you add "0.15" to "temp_cn" in the same module? Was that value determined empirically?

twesigomwedavid commented 2 years ago

@TonyLupara,

Thanks. I think the best platform to have this discussion is via email as this is not a bug/issue per se.

Here's the shorter version:

av_5pr_in1 region in "scripts/cyp2d6/b37/bin/sv_modules.py": This appears to be a leftover from the initial testing phase of the pipeline. The goal at the time was to compare its coverage profile with the coverage in the corresponding CYP2D7 region. It is not currently used in the downstream copy number calculations, but it does not hurt to keep coverage profiles from multiple regions in case we come across complex structural variation profiles that need further investigation.

Adding 0.15 when calculating temp_cn: Yes, this value is empirical. We determined it by examining copy number profiles in multiple test samples.
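
For readers following along, here is a minimal sketch of how an empirical offset like this can shift a coverage-based copy number estimate before rounding. It is not the code from sv_modules.py: the ratio-to-control formula, the function name `estimate_copy_number`, and the parameter names are all assumptions made for illustration. Only the name temp_cn and the 0.15 value come from this thread.

```python
def estimate_copy_number(gene_cov: float, control_cov: float,
                         offset: float = 0.15) -> int:
    """Hypothetical coverage-ratio copy number estimate.

    Assumes the control region is diploid (2 copies), so the raw
    estimate is 2 * gene_cov / control_cov. This formula is an
    assumption, not the StellarPGx implementation.
    """
    if control_cov <= 0:
        raise ValueError("control_cov must be positive")

    # Raw, fractional copy number relative to a diploid control (assumed).
    temp_cn = 2 * gene_cov / control_cov

    # Empirical offset, as described in the thread: nudges borderline
    # estimates upward before rounding to an integer copy number.
    temp_cn += offset

    return round(temp_cn)


# Illustrative coverage values only, not real sample data.
print(estimate_copy_number(30.0, 30.0))              # clear 2-copy case
print(estimate_copy_number(20.5, 30.0))              # borderline, offset applied
print(estimate_copy_number(20.5, 30.0, offset=0.0))  # same input, no offset
```

In this sketch the offset only changes the result for estimates that sit just below a rounding boundary, which is where a consistent downward bias in gene coverage would otherwise cause an undercall.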

TonyLupara commented 2 years ago

@twesigomwedavid, thank you for the answer. I get the logic. Where can I find your email or social media contact so I can ask about some details in the future?

twesigomwedavid commented 2 years ago

@TonyLupara My contact details are in the Readme in the main repository