WGLab / PennCNV

Copy number variation detection from SNP arrays
http://penncnv.openbioinformatics.org

Is it necessary to train HMM file for exome arrays #21

Closed ruqianl closed 6 years ago

ruqianl commented 6 years ago

Hi Dr. Wang,

I noticed that the help message of 'detect_cnv.pl' says that training optimized .hmm model parameters is not recommended. But I guess it would be better to train the model when the data come from exome arrays?

I have downloaded from your website the HMM file for the Illumina HumanCoreExome_v12-A BeadChip constructed by Szatkiewicz et al., and have run CNV detection with both HMM files (i.e. hhall.hmm and exome.hmm).

With exome.hmm, more CNVs were detected.

I have noticed that the only difference between the two files lies in the B1 block, which holds the LRR mean and SD.

The data I'm analysing are from InfiniumCoreExome-24v1-1_A, so I'm not sure whether I should re-train the HMM model, or whether just using the exome.hmm shown below is good enough?

Thank you very much

Best regards, Ruqian

hhall.hmm

M=6
N=6
A:
0.936719716 0.006332139 0.048770575 0.000000001 0.008177573 0.000000001
0.000801036 0.949230924 0.048770575 0.000000001 0.001168245 0.000029225
0.000004595 0.000047431 0.999912387 0.000000001 0.000034971 0.000000621
0.000049998 0.000049998 0.000049998 0.999750015 0.000049998 0.000049998
0.000916738 0.001359036 0.048770575 0.000000001 0.948953653 0.000000002
0.000000001 0.000000001 0.027257213 0.000000001 0.000000004 0.972742785
B:
0.950000 0.000001 0.050000 0.000001 0.000001 0.000001
0.000001 0.950000 0.050000 0.000001 0.000001 0.000001
0.000001 0.000001 0.999995 0.000001 0.000001 0.000001
0.000001 0.000001 0.050000 0.950000 0.000001 0.000001
0.000001 0.000001 0.050000 0.000001 0.950000 0.000001
0.000001 0.000001 0.050000 0.000001 0.000001 0.950000
pi:
0.000001 0.000500 0.999000 0.000001 0.000500 0.000001
B1_mean:
-3.527211 -0.664184 0.000000 100.000000 0.395621 0.678345
B1_sd:
1.329152 0.284338 0.159645 0.211396 0.209089 0.191579
B1_uf:
0.010000
B2_mean:
0.000000 0.250000 0.333333 0.500000 0.500000
B2_sd:
0.016372 0.042099 0.045126 0.034982 0.304243
B2_uf:
0.010000
B3_mean:
-2.051407 -0.572210 0.000000 0.000000 0.361669 0.626711
B3_sd:
2.132843 0.382025 0.184001 0.200297 0.253551 0.353183
B3_uf:
0.010000

exome.hmm

M=6
N=6
A:
0.936719716 0.006332139 0.048770575 0.000000001 0.008177573 0.000000001
0.000801036 0.949230924 0.048770575 0.000000001 0.001168245 0.000029225
0.000004595 0.000047431 0.999912387 0.000000001 0.000034971 0.000000621
0.000049998 0.000049998 0.000049998 0.999750015 0.000049998 0.000049998
0.000916738 0.001359036 0.048770575 0.000000001 0.948953653 0.000000002
0.000000001 0.000000001 0.027257213 0.000000001 0.000000004 0.972742785
B:
0.950000 0.000001 0.050000 0.000001 0.000001 0.000001
0.000001 0.950000 0.050000 0.000001 0.000001 0.000001
0.000001 0.000001 0.999995 0.000001 0.000001 0.000001
0.000001 0.000001 0.050000 0.950000 0.000001 0.000001
0.000001 0.000001 0.050000 0.000001 0.950000 0.000001
0.000001 0.000001 0.050000 0.000001 0.000001 0.950000
pi:
0.000001 0.000500 0.999000 0.000001 0.000500 0.000001
B1_mean:
-2.051407 -0.5 0.000000 100.000000 0.32 0.62
B1_sd:
1.329152 0.17 0.159645 0.211396 0.25 0.30
B1_uf:
0.010000
B2_mean:
0.000000 0.250000 0.333333 0.500000 0.500000
B2_sd:
0.016372 0.042099 0.045126 0.034982 0.304243
B2_uf:
0.010000
B3_mean:
-2.051407 -0.572210 0.000000 0.000000 0.361669 0.626711
B3_sd:
2.132843 0.382025 0.184001 0.200297 0.253551 0.353183
B3_uf:
0.010000
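
For anyone wanting to check which parameters differ between two .hmm files, here is a minimal Python sketch. It assumes the files follow the layout pasted above (labels ending in a colon, followed by whitespace-separated numbers). The function names `parse_hmm` and `diff_blocks` are made up for this illustration and are not part of PennCNV. The demo strings contain only the B1 values copied from this post; to compare real files, pass in the full file contents.

```python
import math

def parse_hmm(text):
    """Parse PennCNV-style .hmm text into {block_name: [floats]}.

    Assumes labels such as 'A:' or 'B1_mean:' precede their numbers and that
    header entries look like 'M=6'. Works whether or not line breaks survive.
    """
    blocks = {}
    current = None
    for token in text.split():
        if token.endswith(":"):
            current = token[:-1]
            blocks[current] = []
        elif "=" in token:
            key, value = token.split("=", 1)
            blocks[key] = [float(value)]
        elif current is not None:
            blocks[current].append(float(token))
    return blocks

def diff_blocks(a, b, tol=1e-9):
    """Return {block_name: [(index, value_a, value_b), ...]} for differing entries."""
    changed = {}
    for name in sorted(set(a) | set(b)):
        va, vb = a.get(name), b.get(name)
        if va is None or vb is None or len(va) != len(vb):
            # Block missing in one file, or a different number of values.
            changed[name] = [(None, va, vb)]
            continue
        pairs = [(i, x, y) for i, (x, y) in enumerate(zip(va, vb))
                 if not math.isclose(x, y, abs_tol=tol)]
        if pairs:
            changed[name] = pairs
    return changed

# B1 values copied from the two files in this post (other blocks omitted here).
hhall_text = ("M=6 N=6 B1_mean: -3.527211 -0.664184 0.000000 100.000000 0.395621 0.678345 "
              "B1_sd: 1.329152 0.284338 0.159645 0.211396 0.209089 0.191579 B1_uf: 0.010000")
exome_text = ("M=6 N=6 B1_mean: -2.051407 -0.5 0.000000 100.000000 0.32 0.62 "
              "B1_sd: 1.329152 0.17 0.159645 0.211396 0.25 0.30 B1_uf: 0.010000")

changed = diff_blocks(parse_hmm(hhall_text), parse_hmm(exome_text))
for name, entries in changed.items():
    for index, old, new in entries:
        print(f"{name}[{index}]: {old} -> {new}")
```

To compare files on disk, something like `diff_blocks(parse_hmm(open("hhall.hmm").read()), parse_hmm(open("exome.hmm").read()))` should work, where the paths are whatever your local copies are called.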

kaichop commented 6 years ago

Hi Ruqian, there is no need to train a model. I am not familiar with the array, but I assume that exome.hmm gives better performance than the default hhall.hmm file. -Kai
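
Since a higher call count alone does not say which model is better, it may help to tally the calls from each run by copy number and total length before deciding. The sketch below is only an illustration: it assumes each call line carries `numsnp=`, `length=` and `stateN,cn=` fields in the style of PennCNV's raw CNV output, the helper name `summarize_calls` is made up, and the two sample lines are invented placeholders rather than real results.

```python
import re
from collections import Counter

# Assumed field layout of a call line: numsnp=6 length=4,001 state2,cn=1 ...
CALL_RE = re.compile(r"numsnp=(\d+)\s+length=([\d,]+)\s+state\d+,cn=(\d+)")

def summarize_calls(lines):
    """Count calls per copy number and sum their lengths (bp) for one run."""
    by_cn = Counter()
    total_length = 0
    for line in lines:
        match = CALL_RE.search(line)
        if match is None:
            continue  # skip anything that does not look like a call line
        by_cn[int(match.group(3))] += 1
        total_length += int(match.group(2).replace(",", ""))
    return by_cn, total_length

# Invented placeholder lines, only to show the expected shape of the input.
example_lines = [
    "chr1:1000-5000 numsnp=6 length=4,001 state2,cn=1 sample1.txt startsnp=rsA endsnp=rsB",
    "chr2:2000-9000 numsnp=12 length=7,001 state5,cn=3 sample1.txt startsnp=rsC endsnp=rsD",
]

by_cn, total_length = summarize_calls(example_lines)
print(dict(by_cn), total_length)
```

Running it once on the output from hhall.hmm and once on the output from exome.hmm shows whether the extra calls are mostly deletions or duplications, and how much sequence they cover.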

On Thu, Jan 4, 2018 at 1:12 AM, Rachael-rq notifications@github.com wrote:
