rgcgithub / clamms

CLAMMS is a scalable tool for detecting common and rare copy number variants from whole-exome sequencing data.

nan in created model.bed? #3

Open JMF47 opened 8 years ago

JMF47 commented 8 years ago

Is it normal to observe -nan in the model.bed file? Also, this model.bed was fitted using 123 samples. Shouldn't the last 6 columns of each row add up to 123? In this model.bed file, the last 6 columns are all 0 or -nan. Below is a snippet.

1 810013 810535 -1 0.569 0.964 1 0 0.02744 0.5306 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1 847325 847825 -1 0.556 1.000 1 0 0.008827 0.8396 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1 847825 848326 -1 0.557 1.000 1 0 0.006241 1.239 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1 848326 848826 -1 0.646 1.000 1 0 0.01897 0.7327 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1 848826 849327 -1 0.619 1.000 1 0 0.02397 0.7008 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1 849327 849827 -1 0.634 1.000 1 0 0.01618 0.7615 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1 849827 850328 -1 0.575 1.000 1 0 0.006088 1.011 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1 861302 861393 3 0.640 1.000 0 -nan -nan -nan -nan -nan -nan -nan 0.0 0.0 0.0
1 865535 865716 3 0.645 1.000 0 -nan -nan -nan -nan -nan -nan -nan 0.0 0.0 0.0
1 866419 866469 3 0.645 1.000 0 -nan -nan -nan -nan -nan -nan -nan 0.0 0.0 0.0
1 871152 871276 3 0.670 1.000 0 -nan -nan -nan -nan -nan -nan -nan 0.0 0.0 0.0
1 874420 874509 3 0.620 1.000 0 -nan -nan -nan -nan -nan -nan -nan 0.0 0.0 0.0
1 874652 874840 3 0.695 1.000 0 -nan -nan -nan -nan -nan -nan -nan 0.0 0.0 0.0

rgcgithub commented 7 years ago

NaNs should not be generated in the model.bed file; I would have to see example data to figure out what is causing that. Make sure that you're training the models on the normalized coverage files and that those inputs look normal (i.e. positive, non-zero coverage and no NaNs), particularly for the windows generating NaNs in the model file.
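A rough way to run that input check is sketched below in Python. It assumes each normalized coverage file is whitespace-delimited with the coverage value in the last column; that layout and the helper name `find_bad_windows` are assumptions made for illustration, not something confirmed in this thread.

```python
import math


def find_bad_windows(lines):
    """Return (line_number, text) pairs whose coverage looks abnormal.

    Assumption: rows are whitespace-delimited and the coverage value
    is the last column. A row is flagged when that value is NaN,
    infinite, zero, negative, or not a number at all.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        try:
            coverage = float(fields[-1])  # float() accepts "nan" and "-nan"
        except ValueError:
            bad.append((number, line.rstrip("\n")))
            continue
        if not math.isfinite(coverage) or coverage <= 0:
            bad.append((number, line.rstrip("\n")))
    return bad
```

Passing an open file handle for each sample's normalized coverage file to `find_bad_windows` should list the windows worth comparing against the rows that come out as -nan in model.bed.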

If computed properly, the last 6 columns are expressed as percentages, so they would add up to 100, not 123. Note that the first 7 rows are blacklisted (filtered) exon windows, denoted by -1 in column 4, which is why their last six columns are all zeroes.
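To apply those two rules to a whole model.bed file, something like the following Python sketch could be used. It relies only on what is stated above (column 4 is -1 for blacklisted windows, and the last six columns are percentages totalling about 100); the function name `check_model_rows` and the `tolerance` parameter are hypothetical choices for this example.

```python
import math


def check_model_rows(lines, tolerance=1.0):
    """Label each model.bed row as 'blacklisted', 'nan', 'ok', or 'bad_sum'.

    Assumptions: rows are whitespace-delimited, column 4 is -1 for
    blacklisted windows, and the last six columns are percentages.
    """
    results = []
    for line in lines:
        fields = line.split()
        if len(fields) < 10:
            continue  # too short to be a model row; skip it
        window = (fields[0], int(fields[1]), int(fields[2]))
        if fields[3] == "-1":
            results.append((window, "blacklisted"))
            continue
        values = [float(f) for f in fields[4:]]  # float() accepts "-nan"
        if any(math.isnan(v) for v in values):
            results.append((window, "nan"))
            continue
        total = sum(values[-6:])
        status = "ok" if abs(total - 100.0) <= tolerance else "bad_sum"
        results.append((window, status))
    return results
```

On the snippet posted above, this would label the -1 rows as blacklisted and the remaining rows as nan, which narrows down the windows to inspect in the input coverage files.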

Evan