DEploid-dev / dEploidPaper


compute the expected error rate from panel and the data #6

Closed shajoezhu closed 7 years ago

shajoezhu commented 7 years ago

PG0390-C

length(which(panel$dd2gt.from.regression == 1 & coverage$altCount==0))/length(panel$dd2gt.from.regression)
[1] 0.007108239
length(which(panel$dd2gt.from.regression == 0 & coverage$altCount>0))/length(panel$dd2gt.from.regression)
[1] 0.01540118

shajoezhu commented 7 years ago

PG0389-C

Dd2 is definitely wrong at sites where the panel has 3D7/Dd2 as 0/1 but there are no alt reads in the data:

length(which(panel$dd2gt.from.regression == 1 & coverage$altCount==0))/length(panel$dd2gt.from.regression)
[1] 0.09504577

Dd2 or 3D7 is wrong at sites where the panel has 3D7/Dd2 as 0/0 but there are alt reads in the data:

length(which(panel$dd2gt.from.regression == 0 & coverage$altCount>0))/length(panel$dd2gt.from.regression)
[1] 0.01357027
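The two single-strain checks above can be wrapped in a small helper. This is only a minimal sketch: the function name `panel_mismatch_rates` and the toy vectors are hypothetical, and it assumes the genotype vector is coded 0/1 (like `panel$dd2gt.from.regression`), is aligned site by site with the alt read counts (like `coverage$altCount`), and contains no missing values.

```r
# Hypothetical helper: fraction of sites where a reference-panel genotype
# (0 = ref, 1 = alt) disagrees with the observed alt read counts.
# Assumes both vectors are aligned by site and contain no NA values.
panel_mismatch_rates <- function(panel_gt, alt_count) {
  stopifnot(length(panel_gt) == length(alt_count))
  c(
    # panel says alt, but the sample has no alt reads at all
    alt_in_panel_no_alt_reads = mean(panel_gt == 1 & alt_count == 0),
    # panel says ref, but the sample has at least one alt read
    ref_in_panel_alt_reads = mean(panel_gt == 0 & alt_count > 0)
  )
}

# Toy data, made up for illustration only
toy_gt  <- c(1, 0, 0, 1, 0)
toy_alt <- c(0, 3, 0, 12, 0)
rates <- panel_mismatch_rates(toy_gt, toy_alt)
print(rates)
```

With no missing values, `mean()` of a logical vector gives the same fraction as `length(which(...))/length(...)`.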

shajoezhu commented 7 years ago

PG0402-C: the HB3/7G8 panel is definitely wrong at sites where both strains are 0/0 but there are alt reads in the data:

length(which(panel$HB3gt.from.regression == 0 & panel$sevenG8gt.from.regression == 0 & coverage$altCount>0))/length(panel$dd2gt.from.regression)
[1] 0.01631664

The HB3/7G8 panel is definitely wrong at sites where both strains are 1/1 but there are no alt reads in the data:

length(which(panel$HB3gt.from.regression == 1 & panel$sevenG8gt.from.regression == 1 & coverage$altCount==0))/length(panel$dd2gt.from.regression)
[1] 0.0001077006

HB3 is definitely wrong at sites where the panel has HB3/7G8 as 1/0 but there are no alt reads in the data:

length(which(panel$HB3gt.from.regression == 1 & panel$sevenG8gt.from.regression == 0 & coverage$altCount==0))/length(panel$dd2gt.from.regression)
[1] 0.0008616047

7G8 is definitely wrong at sites where the panel has HB3/7G8 as 0/1 but there are no alt reads in the data:

length(which(panel$HB3gt.from.regression == 0 & panel$sevenG8gt.from.regression == 1 & coverage$altCount==0))/length(panel$dd2gt.from.regression)
[1] 0.02261712

7G8 is the minor strain. Sites where the panel has HB3/7G8 as 0/1 and the data strongly suggest alt (more than 30 alt reads):

length(which(panel$HB3gt.from.regression == 0 & panel$sevenG8gt.from.regression == 1 & coverage$altCount>30))/length(panel$dd2gt.from.regression)
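All of the two-strain combinations checked in this comment can be tabulated in one pass. The following is only a sketch under assumptions: the function name `two_strain_table`, the `min_alt` threshold argument, and the toy vectors are hypothetical and not part of the original analysis. It assumes the two genotype vectors are coded 0/1 and aligned site by site with the alt read counts.

```r
# Hypothetical helper: for a two-strain mixture, give the fraction of all
# sites in each panel genotype combination (e.g. "0/1"), split by whether
# the sample has more than `min_alt` alt reads.
two_strain_table <- function(gt_a, gt_b, alt_count, min_alt = 0) {
  stopifnot(length(gt_a) == length(gt_b),
            length(gt_a) == length(alt_count))
  combo <- paste(gt_a, gt_b, sep = "/")
  # Fix the levels so both columns appear even if one is empty
  has_alt <- factor(alt_count > min_alt, levels = c(FALSE, TRUE))
  table(panel = combo, has_alt = has_alt) / length(alt_count)
}

# Toy data, made up for illustration only
toy_hb3 <- c(0, 0, 1, 1, 0, 0)
toy_7g8 <- c(0, 1, 0, 1, 1, 0)
toy_alt <- c(5, 0, 0, 0, 40, 0)

tab <- two_strain_table(toy_hb3, toy_7g8, toy_alt)                 # any alt reads
tab30 <- two_strain_table(toy_hb3, toy_7g8, toy_alt, min_alt = 30) # > 30 alt reads
print(tab)
print(tab30)
```

In this layout, the cell `tab["0/0", "TRUE"]` corresponds to the "wrong at 0/0" check, `tab["1/1", "FALSE"]` to the "wrong at 1/1" check, and `tab30["0/1", "TRUE"]` to the sites where the data strongly suggest alt.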