WGLab / PennCNV

Copy number variation detection from SNP arrays
http://penncnv.openbioinformatics.org

Confidence score updating when merging calls in clean_cnv.pl #114

Closed JinhanZhu1 closed 9 months ago

JinhanZhu1 commented 9 months ago

Dear Dr. Wang,

I'm reaching out about how the confidence score is updated when calls are merged. According to this line of code (https://github.com/WGLab/PennCNV/blob/master/clean_cnv.pl#L96), the script adds up the confidence scores of the regions that get merged. However, I found that when more than two consecutive call regions are merged, the code adds only the first and last calls' confidence scores to produce the updated score. For example:

chr22:19961955-19968597 | conf=0.556287
chr22:20011989-21032419 | conf=94.7675
chr22:21102973-21463545 | conf=32.6914

These three calls are merged into one region; however, the resulting call's confidence score will be 0.556287 + 32.6914 instead of 0.556287 + 94.7675 + 32.6914. I found this is due to this line of code (https://github.com/WGLab/PennCNV/blob/master/clean_cnv.pl#L98), which doesn't update $prevconf to $newconf. So even though the confidence score after merging the first two regions is updated to 0.556287 + 94.7675 and stored in the stack, $prevconf is still 0.556287. I'm not sure whether the code is designed to add only the first and last regions' confidence scores or whether this is a typo. I'd really appreciate your insight and help with this!
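To make the arithmetic concrete, here is a minimal Python sketch of the accumulation pattern described above, using the three scores listed in the example. It is not the actual Perl from clean_cnv.pl: the function name `merge_confidences` and the `update_prev` flag are hypothetical, and the sketch assumes all three calls already qualify for merging (the script's real merge criteria are left out).

```python
def merge_confidences(confs, update_prev):
    """Fold the scores of consecutive calls that all merge into one region.

    Hypothetical illustration only; prev_conf / new_conf mirror the roles
    of $prevconf / $newconf as described in this issue.
    """
    prev_conf = confs[0]
    merged = confs[0]
    for conf in confs[1:]:
        new_conf = prev_conf + conf  # score stored for the merged call
        merged = new_conf
        if update_prev:
            # The proposed fix: carry the running total into the next merge.
            prev_conf = new_conf
    return merged


confs = [0.556287, 94.7675, 32.6914]
buggy = merge_confidences(confs, update_prev=False)  # first + last only
fixed = merge_confidences(confs, update_prev=True)   # sum of all three

print(f"without update: {buggy:.6f}")
print(f"with update:    {fixed:.6f}")
```

Without the update, the middle score (94.7675) is dropped from the final result as soon as the third call is merged; with it, the result is the sum of all three scores.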

Best, Jinhan

kaichop commented 9 months ago

You are right, this should be updated from $prevconf to $newconf.

In practice, most people filter CNVs by the number of SNPs and the length. The confidence score is not very comparable between samples.


JinhanZhu1 commented 9 months ago

That makes sense; we might only do basic filtering using the confidence score. Thanks for updating the script, and I appreciate your time and help!