Closed pavel-shliaha closed 7 years ago
I am having a problem applying the rescaleForTop3() function. Below is a figure of three datasets: 1) not requantified 2) requantified using the SUM method 3) requantified using the SUM method with the intensities repredicted by rescaleForTop3()
I had a look at a peptide that I think is requantified incorrectly:
As you can see, the sum method has decreased the ratio from 0.2883538 to 0.207168, as expected, but the top3 requantitation has put it back up to 0.3473387. This probably should not happen. The code is in: ...\synapter2paper\kuharev2015\bugs_investigation\for_bug_investigation_requant_top3
It seems that it is not a bug but a feature. `rescaleForTop3` has an argument `onlyForSaturatedRuns` (default `TRUE`) that controls whether unsaturated runs should be rescaled or not. I modified your plotting function and marked all unsaturated peptides with black dots, and you can see that they cause the weird pattern after rescaling (because they are not requantified or rescaled at all):
(the bottom right panel was generated with `rescaleForTop3(..., saturationThreshold=3e4, onlyForSaturatedRuns=FALSE)`)
You could find the modified code in the same directory.
Another figure for the same fact (please note that the colours have different meanings now: blue: no saturation; red: saturation in one sample (A or B; that is the important fraction that causes the "strange" behaviour in the bottom left panel); green: saturation in both samples):
library (synapter)
library (MSnbase)
combMSNset <- readRDS ("refCombMSNSet.RDS")
satThreshold <- 3e4
## samples
a <- 1:5
b <- 6:10
## fetch peptide of interest
poi <- combMSNset[grep("^AGMVAGVIVNR$", featureNames(combMSNset)),]
## grep isotopic distribution
iso <- fData(poi)[, grep("isotopicDistr", fvarLabels(combMSNset))]
isom <- synapter:::.isotopicDistr2matrix(iso)
isom
## 1_0 1_1 1_2 1_3 2_0 2_1 2_2 2_3 2_4
## isotopicDistr.S130423_05 1378 608 273 NA 25649 15016 5707 1150 NA
## isotopicDistr.S130423_07 1246 728 255 NA 25150 11517 4869 1230 NA
## isotopicDistr.S130423_09 857 357 180 NA 19434 9894 3891 589 NA
## isotopicDistr.S130423_11 930 365 NA NA 19231 10580 4375 NA NA
## isotopicDistr.S130423_13 759 400 147 NA 14761 8035 3321 813 NA
## isotopicDistr.S130423_06 10708 4943 1390 341 88884 63955 27550 6491 NA
## isotopicDistr.S130423_08 7132 3480 760 184 61927 45300 21154 5227 938
## isotopicDistr.S130423_10 5054 2233 644 139 55808 40075 17012 4064 678
## isotopicDistr.S130423_12 4741 2000 746 155 50171 35098 14078 3730 670
## isotopicDistr.S130423_14 3451 1723 500 154 39062 25263 11036 3078 NA
## all samples "a" are unsaturated but all samples "b" are saturated
synapter:::.runsUnsaturated(t(isom), saturationThreshold=satThreshold)
## isotopicDistr.S130423_05 isotopicDistr.S130423_07 isotopicDistr.S130423_09
## TRUE TRUE TRUE
## isotopicDistr.S130423_11 isotopicDistr.S130423_13 isotopicDistr.S130423_06
## TRUE TRUE FALSE
## isotopicDistr.S130423_08 isotopicDistr.S130423_10 isotopicDistr.S130423_12
## FALSE FALSE FALSE
## isotopicDistr.S130423_14
## FALSE
## run requantification and rescaling
req <- requantify(poi, method="sum", saturationThreshold=satThreshold,
onlyCommonIsotopes=FALSE)
top3 <- rescaleForTop3(poi, req, satThreshold, onlyForSaturatedRuns=TRUE)
top3all <- rescaleForTop3(poi, req, satThreshold, onlyForSaturatedRuns=FALSE)
tab <- rbind(exprs(poi), exprs(req), exprs(top3), exprs(top3all))
rownames(tab) <- c("initial", "sum", "top3", "top3all")
knitr::kable(tab)
method | S130423_05 | S130423_07 | S130423_09 | S130423_11 | S130423_13 | S130423_06 | S130423_08 | S130423_10 | S130423_12 | S130423_14 |
---|---|---|---|---|---|---|---|---|---|---|
initial | 49781.00 | 44995.00 | 35202.00 | 35481.00 | 28236.00 | 204262.0 | 146102.00 | 125707.00 | 111389.00 | 84267.00 |
sum | 9116.00 | 8328.00 | 5874.00 | 5670.00 | 5440.00 | 51423.0 | 38875.00 | 29824.00 | 26120.00 | 19942.00 |
top3 | 49781.00 | 44995.00 | 35202.00 | 35481.00 | 28236.00 | 130797.1 | 98880.57 | 75858.89 | 66437.57 | 50723.51 |
top3all | 44431.54 | 40590.82 | 28629.98 | 27635.68 | 26514.66 | 250636.6 | 189477.44 | 145362.70 | 127309.34 | 97197.66 |
## plot intensities
plot(NA, xlim=c(14, 18), ylim=c(-3, 3),
xlab="intensity sample B", ylab="A/B")
abline(h=c(1, 0, -2), col="#808080")
col <- c("#e41a1c", "#377eb8", "#4daf4a", "#984ea3")
points(log2(tab[, b]), log2(tab[, a]/tab[, b]), col=col, pch=20)
legend("topright", legend=rownames(tab), col=col, pch=20)
ratios <- rbind(exprs(poi)[, a]/exprs(poi)[, b],
exprs(req)[, a]/exprs(req)[, b],
exprs(top3)[, a]/exprs(top3)[, b],
exprs(top3all)[, a]/exprs(top3all)[, b])
rownames(ratios) <- c("initial", "sum", "top3", "top3all")
knitr::kable(ratios)
method | S130423_05 | S130423_07 | S130423_09 | S130423_11 | S130423_13 |
---|---|---|---|---|---|
initial | 0.2437115 | 0.3079698 | 0.2800321 | 0.3185324 | 0.3350778 |
sum | 0.1772748 | 0.2142251 | 0.1969555 | 0.2170750 | 0.2727911 |
top3 | 0.3805972 | 0.4550439 | 0.4640458 | 0.5340502 | 0.5566650 |
top3all | 0.1772748 | 0.2142251 | 0.1969555 | 0.2170750 | 0.2727911 |
## exprs before requantification
eBefore <- exprs(poi)
eBefore
## S130423_05 S130423_07 S130423_09 S130423_11 S130423_13
## AGMVAGVIVNR 49781 44995 35202 35481 28236
## S130423_06 S130423_08 S130423_10 S130423_12 S130423_14
## AGMVAGVIVNR 204262 146102 125707 111389 84267
## exprs after requantification
eAfter <- exprs(req)
eAfter
## S130423_05 S130423_07 S130423_09 S130423_11 S130423_13
## AGMVAGVIVNR 9116 8328 5874 5670 5440
## S130423_06 S130423_08 S130423_10 S130423_12 S130423_14
## AGMVAGVIVNR 51423 38875 29824 26120 19942
## grep isotopic information
isotop <- as.matrix(iso)
isotop
## isotopicDistr.S130423_05
## AGMVAGVIVNR "1_0:1378;1_1:608;1_2:273;2_0:25649;2_1:15016;2_2:5707;2_3:1150"
## isotopicDistr.S130423_07
## AGMVAGVIVNR "1_0:1246;1_1:728;1_2:255;2_0:25150;2_1:11517;2_2:4869;2_3:1230"
## isotopicDistr.S130423_09
## AGMVAGVIVNR "1_0:857;1_1:357;1_2:180;2_0:19434;2_1:9894;2_2:3891;2_3:589"
## isotopicDistr.S130423_11
## AGMVAGVIVNR "1_0:930;1_1:365;2_0:19231;2_1:10580;2_2:4375"
## isotopicDistr.S130423_13
## AGMVAGVIVNR "1_0:759;1_1:400;1_2:147;2_0:14761;2_1:8035;2_2:3321;2_3:813"
## isotopicDistr.S130423_06
## AGMVAGVIVNR "1_0:10708;1_1:4943;1_2:1390;1_3:341;2_0:88884;2_1:63955;2_2:27550;2_3:6491"
## isotopicDistr.S130423_08
## AGMVAGVIVNR "1_0:7132;1_1:3480;1_2:760;1_3:184;2_0:61927;2_1:45300;2_2:21154;2_3:5227;2_4:938"
## isotopicDistr.S130423_10
## AGMVAGVIVNR "1_0:5054;1_1:2233;1_2:644;1_3:139;2_0:55808;2_1:40075;2_2:17012;2_3:4064;2_4:678"
## isotopicDistr.S130423_12
## AGMVAGVIVNR "1_0:4741;1_1:2000;1_2:746;1_3:155;2_0:50171;2_1:35098;2_2:14078;2_3:3730;2_4:670"
## isotopicDistr.S130423_14
## AGMVAGVIVNR "1_0:3451;1_1:1723;1_2:500;1_3:154;2_0:39062;2_1:25263;2_2:11036;2_3:3078"
## if we want to handle only saturated runs we have to know which ones are
## unsaturated (this code block is skipped for onlyForSaturatedRuns=FALSE)
unsat <- t(apply(isotop, 1, function(x)synapter:::.runsUnsaturated(t(synapter:::.isotopicDistr2matrix(x)), saturationThreshold=satThreshold)))
unsat
## isotopicDistr.S130423_05 isotopicDistr.S130423_07
## AGMVAGVIVNR TRUE TRUE
## isotopicDistr.S130423_09 isotopicDistr.S130423_11
## AGMVAGVIVNR TRUE TRUE
## isotopicDistr.S130423_13 isotopicDistr.S130423_06
## AGMVAGVIVNR TRUE FALSE
## isotopicDistr.S130423_08 isotopicDistr.S130423_10
## AGMVAGVIVNR FALSE FALSE
## isotopicDistr.S130423_12 isotopicDistr.S130423_14
## AGMVAGVIVNR FALSE FALSE
## replace requantified values of unsaturated runs with their original
## intensities
eAfter[unsat] <- eBefore[unsat]
eAfter
## S130423_05 S130423_07 S130423_09 S130423_11 S130423_13
## AGMVAGVIVNR 49781 44995 35202 35481 28236
## S130423_06 S130423_08 S130423_10 S130423_12 S130423_14
## AGMVAGVIVNR 51423 38875 29824 26120 19942
:warning: Maybe it is wrong to include the unsaturated runs for proportion calculation here?!
## calculate proportions
prop <- eAfter/rowSums(eAfter, na.rm=TRUE)
prop
## S130423_05 S130423_07 S130423_09 S130423_11 S130423_13
## AGMVAGVIVNR 0.138327 0.1250281 0.09781621 0.09859147 0.0784597
## S130423_06 S130423_08 S130423_10 S130423_12 S130423_14
## AGMVAGVIVNR 0.1428897 0.1080224 0.0828723 0.07257995 0.05541307
## calculate correction factor
cf <- eBefore/prop
cf
## S130423_05 S130423_07 S130423_09 S130423_11 S130423_13
## AGMVAGVIVNR 359879 359879 359879 359879 359879
## S130423_06 S130423_08 S130423_10 S130423_12 S130423_14
## AGMVAGVIVNR 1429508 1352516 1516876 1534708 1520706
cfm <- rowMeans(cf, na.rm=TRUE)
cfm
## AGMVAGVIVNR
## 915370.9
## calculate new intensities
eNew <- cfm * prop
eNew
## S130423_05 S130423_07 S130423_09 S130423_11 S130423_13
## AGMVAGVIVNR 126620.5 114447.1 89538.11 90247.76 71819.73
## S130423_06 S130423_08 S130423_10 S130423_12 S130423_14
## AGMVAGVIVNR 130797.1 98880.57 75858.89 66437.57 50723.51
## replace unsaturated runs with original values
## (this code block is skipped for onlyForSaturatedRuns=FALSE)
eNew[unsat] <- eBefore[unsat]
eNew
## S130423_05 S130423_07 S130423_09 S130423_11 S130423_13
## AGMVAGVIVNR 49781 44995 35202 35481 28236
## S130423_06 S130423_08 S130423_10 S130423_12 S130423_14
## AGMVAGVIVNR 130797.1 98880.57 75858.89 66437.57 50723.51
## exprs before requantification
eBefore <- exprs(poi)
eBefore
## S130423_05 S130423_07 S130423_09 S130423_11 S130423_13
## AGMVAGVIVNR 49781 44995 35202 35481 28236
## S130423_06 S130423_08 S130423_10 S130423_12 S130423_14
## AGMVAGVIVNR 204262 146102 125707 111389 84267
## exprs after requantification
eAfter <- exprs(req)
eAfter
## S130423_05 S130423_07 S130423_09 S130423_11 S130423_13
## AGMVAGVIVNR 9116 8328 5874 5670 5440
## S130423_06 S130423_08 S130423_10 S130423_12 S130423_14
## AGMVAGVIVNR 51423 38875 29824 26120 19942
## calculate proportions
prop <- eAfter/rowSums(eAfter, na.rm=TRUE)
prop
## S130423_05 S130423_07 S130423_09 S130423_11 S130423_13
## AGMVAGVIVNR 0.04544095 0.04151297 0.0292804 0.02826351 0.02711702
## S130423_06 S130423_08 S130423_10 S130423_12 S130423_14
## AGMVAGVIVNR 0.2563306 0.193782 0.1486651 0.1302016 0.09940582
## calculate correction factor
cf <- eBefore/prop
cf
## S130423_05 S130423_07 S130423_09 S130423_11 S130423_13
## AGMVAGVIVNR 1095510 1083878 1202238 1255364 1041265
## S130423_06 S130423_08 S130423_10 S130423_12 S130423_14
## AGMVAGVIVNR 796869.3 753950.2 845571.8 855511.9 847706.9
cfm <- rowMeans(cf, na.rm=TRUE)
cfm
## AGMVAGVIVNR
## 977786.4
## calculate new intensities
eNew <- cfm * prop
eNew
## S130423_05 S130423_07 S130423_09 S130423_11 S130423_13
## AGMVAGVIVNR 44431.54 40590.82 28629.98 27635.68 26514.66
## S130423_06 S130423_08 S130423_10 S130423_12 S130423_14
## AGMVAGVIVNR 250636.6 189477.4 145362.7 127309.3 97197.66
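The two walkthroughs above differ only in the two steps guarded by `onlyForSaturatedRuns`. A minimal base R sketch that condenses them into one function, using the AGMVAGVIVNR intensities from this thread. `rescaleSketch` and its arguments are hypothetical names; it mirrors the manual steps above for a single peptide and is not the synapter source:

```r
## original (eBefore) and sum-requantified (eAfter) intensities of AGMVAGVIVNR
runs <- c("S130423_05", "S130423_07", "S130423_09", "S130423_11", "S130423_13",
          "S130423_06", "S130423_08", "S130423_10", "S130423_12", "S130423_14")
eBefore <- setNames(c(49781, 44995, 35202, 35481, 28236,
                      204262, 146102, 125707, 111389, 84267), runs)
eAfter <- setNames(c(9116, 8328, 5874, 5670, 5440,
                     51423, 38875, 29824, 26120, 19942), runs)
## samples "a" (first five runs) are unsaturated, samples "b" are saturated
unsat <- rep(c(TRUE, FALSE), each=5)

## hypothetical helper: the manual steps above for one peptide (plain vectors)
rescaleSketch <- function(eBefore, eAfter, unsat, onlyForSaturatedRuns=TRUE) {
  if (onlyForSaturatedRuns) {
    ## unsaturated runs keep their original intensities
    eAfter[unsat] <- eBefore[unsat]
  }
  prop <- eAfter / sum(eAfter, na.rm=TRUE)  # proportions
  cfm <- mean(eBefore / prop, na.rm=TRUE)   # mean correction factor
  eNew <- cfm * prop                        # new intensities
  if (onlyForSaturatedRuns) {
    eNew[unsat] <- eBefore[unsat]
  }
  eNew
}

top3 <- rescaleSketch(eBefore, eAfter, unsat, onlyForSaturatedRuns=TRUE)
top3all <- rescaleSketch(eBefore, eAfter, unsat, onlyForSaturatedRuns=FALSE)
## A/B ratios for both variants
round(top3[1:5] / top3[6:10], 4)
round(top3all[1:5] / top3all[6:10], 4)
```

With `onlyForSaturatedRuns=FALSE` every run is multiplied by the same factor, so the A/B ratios equal those of the sum method; with `onlyForSaturatedRuns=TRUE` the two sample groups end up on different scales.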
Hey Sebastian, I think you pinpointed the problem yourself here. It is indeed incorrect to include saturated runs in the calculation of the correction factor.
I believe you got confused: onlyForSaturatedRuns refers to whether you have to requantify all peptides or only those above the saturation threshold; the correction factor should always be computed on unsaturated peptides only. This is analogous to the isotopic correction we perform in the theoretical methods.
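One way to read this suggestion, sketched in base R on the AGMVAGVIVNR values from above: the proportions still come from the sum-requantified intensities of all runs, but only the unsaturated runs contribute to the mean correction factor. `rescaleUnsatSketch` is a hypothetical name, and this is an interpretation of the comment, not the synapter implementation:

```r
## original and sum-requantified intensities (samples "a" first, then "b")
eBefore <- c(49781, 44995, 35202, 35481, 28236,
             204262, 146102, 125707, 111389, 84267)
eAfter <- c(9116, 8328, 5874, 5670, 5440,
            51423, 38875, 29824, 26120, 19942)
unsat <- rep(c(TRUE, FALSE), each=5)

## hypothetical variant: correction factor from unsaturated runs only
rescaleUnsatSketch <- function(eBefore, eAfter, unsat) {
  prop <- eAfter / sum(eAfter, na.rm=TRUE)
  cf <- eBefore / prop
  cfm <- mean(cf[unsat], na.rm=TRUE)  # saturated runs are excluded here
  cfm * prop
}

eNew <- rescaleUnsatSketch(eBefore, eAfter, unsat)
## A/B ratios after rescaling
eNew[1:5] / eNew[6:10]
```

In this variant the saturated runs are scaled up relative to their recorded intensities, while the A/B ratios stay at the sum-method values.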
Hey Sebastian,
please execute this code and tell me what you see
library(synapter)
combMSNSet <- readRDS ("Y://RAW/pvs22//_QTOF_DATA_data3//synapter2paper//kuharev2015//synapter2//output//UDMSE//refcombMSNSet.RDS")
satThresholdIon <- 3e4
satCorrected <- sapply (c ("sat", "sum", "reference"),
function (x) NULL)
satCorrected[[1]] <- combMSNSet
satCorrected[[2]] <- requantify (combMSNSet, method = "sum",
saturationThreshold= satThresholdIon,
onlyCommonIsotopes=FALSE)
satCorrected[[3]] <- requantify (combMSNSet, method = "reference",
saturationThreshold= satThresholdIon)
# function to find requantified row
findDifferentquant <- function (MSnSet1, MSnSet2){
selA <- c ()
for (i in 1:nrow (MSnSet1)) if (any (exprs(MSnSet1)[i, ] != exprs(MSnSet2)[i, ], na.rm = TRUE)) selA <- c (selA, i)
return (selA)
}
diffSum <- findDifferentquant(satCorrected[[1]], satCorrected[[2]])
diffRef <- findDifferentquant(satCorrected[[1]], satCorrected[[3]])
length (diffSum)
length (diffRef)
I see
length (diffSum) [1] 3069
and
length (diffSum) [1] 3069
length (diffRef) [1] 3067
@pavel-shliaha I got the same output. Now I finally understand what you mean by `"FFQELR"` and `"ESNEITIIINPYRETVCFSVEPVK"` being requantified by `"sum"` but not by `"reference"`. You mean that the intensities with and without reference requantification are identical.
I looked into `"FFQELR"`. Just as a reminder, the isotopic distribution:
## fetch peptide of interest
poi <- combMSNset[grep("^FFQELR$", featureNames(combMSNset)),]
## grep isotopic distribution
iso <- fData(poi)[, grep("isotopicDistr", fvarLabels(combMSNset))]
m <- synapter:::.isotopicDistr2matrix(iso)
m
# 1_0 1_1 1_2 1_3 1_4 2_0 2_1 2_2 2_3
# isotopicDistr.S130423_05 632 307 NA NA NA 6099 2445 696 NA
# isotopicDistr.S130423_07 NA NA NA NA NA NA NA NA NA
# isotopicDistr.S130423_09 37909 16706 2711 501 174 8502 3463 677 125
# isotopicDistr.S130423_11 NA NA NA NA NA NA NA NA NA
# isotopicDistr.S130423_13 NA NA NA NA NA NA NA NA NA
# isotopicDistr.S130423_06 NA NA NA NA NA NA NA NA NA
# isotopicDistr.S130423_08 NA NA NA NA NA 5538 1675 NA NA
# isotopicDistr.S130423_10 NA NA NA NA NA 4842 1439 484 NA
# isotopicDistr.S130423_12 NA NA NA NA NA 4128 1625 371 NA
# isotopicDistr.S130423_14 NA NA NA NA NA 3386 1290 349 NA
It is a special case: all but one run (S130423_09) are unsaturated. The algorithm looks for the best reference run, which is the one with the most unsaturated isotopes: S130423_09 (8 unsaturated and 1 saturated). Subsequently the new intensity values are calculated for run S130423_09 (all other runs are unsaturated and not touched at all). The correction factor between the unsaturated isotopes of S130423_09 and the reference (S130423_09, too) is `1`. That's why the saturated value is replaced by the identical value (the requantification works as `saturated value * reference correction factor`, with `reference correction factor = mean(unsaturated isotopes / unsaturated reference isotopes)`). So in this special case it is expected that the requantified intensities are identical to the original ones.
`requantify (poi, method = "sum", saturationThreshold= 3e4, onlyCommonIsotopes=FALSE)` just removes isotope S130423_09:1_0 (that's why it differs slightly).
For `ESNEITIIINPYRETVCFSVEPVK` the only saturated run is also the reference run (same situation as for `FFQELR` above).
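The special case can be checked by hand. A small base R sketch, using the S130423_09 isotopes of FFQELR from the matrix above and a simplified reading of the formula quoted in this comment (the variable names are illustrative, not synapter internals):

```r
satThreshold <- 3e4
## isotopes of FFQELR in run S130423_09 (the only saturated run)
run <- c("1_0"=37909, "1_1"=16706, "1_2"=2711, "1_3"=501, "1_4"=174,
         "2_0"=8502, "2_1"=3463, "2_2"=677, "2_3"=125)
## the run with the most unsaturated isotopes is chosen as reference,
## which here is the run itself
ref <- run
## isotopes unsaturated in both the run and the reference
unsat <- run < satThreshold & ref < satThreshold
## reference correction factor
cf <- mean(run[unsat] / ref[unsat])
## assumed reading: scale the reference isotopes by the correction factor
requant <- sum(ref * cf)
c(cf=cf, original=sum(run), requantified=requant)
```

Because every ratio `run/ref` is exactly 1, the correction factor is 1 and the requantified intensity cannot differ from the original one.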
I hope the onlyForSaturatedRuns problem is fixed now:
there is a problem with theoretical method correction for some peptides, e.g. ILFDYSK. I have run the code and the correction is in the table presented below
library (synapter)
library (MSnbase)
combMSNSet <- readRDS ("Y://RAW/pvs22//_QTOF_DATA_data3//synapter2paper//kuharev2015//synapter2//output//UDMSE//refcombMSNSet.RDS")
satThresholdIon <- 3e4
satCorrected <- sapply (c ("sat", "th.mean", "th.median", "th.weighted.mean"), function (x) NULL)
satCorrected[[1]] <- combMSNSet
satCorrected[[2]] <- requantify (combMSNSet, method = "th.mean", saturationThreshold= satThresholdIon, requantifyAll=FALSE)
satCorrected[[3]] <- requantify (combMSNSet, method = "th.median", saturationThreshold= satThresholdIon, requantifyAll=FALSE)
satCorrected[[4]] <- requantify (combMSNSet, method = "th.weighted.mean", saturationThreshold= satThresholdIon, requantifyAll=FALSE)
xx <- rbind (exprs (satCorrected[[1]])[featureNames (satCorrected[[2]]) == "ILFDYSK", ], exprs (satCorrected[[2]])[featureNames (satCorrected[[2]]) == "ILFDYSK", ], exprs (satCorrected[[3]])[featureNames (satCorrected[[2]]) == "ILFDYSK", ], exprs (satCorrected[[4]])[featureNames (satCorrected[[2]]) == "ILFDYSK", ])
row.names(xx) <- names (satCorrected)
could you please produce for "ILFDYSK" a detailed procedure of what is happening here, like you did for previous peptides when we had a problem?
sorry yet another issue
library (synapter)
library (MSnbase)
combMSNSet <- readRDS ("Y://RAW/pvs22//_QTOF_DATA_data3//synapter2paper//kuharev2015//synapter2//output//UDMSE//refcombMSNSet.RDS")
satThresholdIon <- 3e4
## same list initialisation as in the snippets above
satCorrected <- sapply (c ("sat", "sum"), function (x) NULL)
satCorrected[[1]] <- combMSNSet
satCorrected[[2]] <- requantify (combMSNSet, method = "sum", saturationThreshold= satThresholdIon, onlyCommonIsotopes=TRUE)
exprs (satCorrected[[2]])[5, ]
poi <- combMSNSet[5,]
iso <- fData(poi)[, grep("isotopicDistr", fvarLabels(combMSNSet))]
m <- synapter:::.isotopicDistr2matrix(iso)
## drop runs without any recorded isotope
m <- m[apply (m, 1, function (x) any (!is.na (x))), ]
## keep isotopes that are present in all remaining runs
m <- m[, apply (m, 2, function (x) all (!is.na (x)))]
## keep isotopes that are below the saturation threshold in all runs
m <- m[, apply (m, 2, function (x) all (x < satThresholdIon))]
m
**The `th.` problem:**
combMSNset <- readRDS("refCombMSNSet.RDS")
## fetch peptide of interest
poi <- combMSNset[grep("^ILFDYSK$", featureNames(combMSNset)),]
## grep isotopic distribution
iso <- fData(poi)[, grep("isotopicDistr", fvarLabels(combMSNset))]
x <- synapter:::.isotopicDistr2matrix(iso)
saturationThreshold <- 3e4
unsat <- synapter:::.isUnsaturatedIsotope(x, saturationThreshold=saturationThreshold)
unsat
# 1_0 1_1 1_2 2_0 2_1
#isotopicDistr.S130423_05 NA NA NA FALSE FALSE
#isotopicDistr.S130423_07 NA NA NA FALSE FALSE
#isotopicDistr.S130423_09 NA NA NA FALSE FALSE
#isotopicDistr.S130423_11 NA NA NA FALSE FALSE
#isotopicDistr.S130423_13 NA NA NA FALSE FALSE
#isotopicDistr.S130423_06 TRUE TRUE TRUE FALSE FALSE
#isotopicDistr.S130423_08 TRUE TRUE NA FALSE FALSE
#isotopicDistr.S130423_10 TRUE TRUE NA FALSE TRUE
#isotopicDistr.S130423_12 NA NA NA FALSE TRUE
#isotopicDistr.S130423_14 NA NA NA FALSE TRUE
Above we see the first problem: there are no isotopes below the saturation threshold for the first 5 runs. So we can't predict anything here (this explains the `NA`/`0.0` in the first five columns of your table).
x
# 1_0 1_1 1_2 2_0 2_1
#isotopicDistr.S130423_05 NA NA NA 70214 40605
#isotopicDistr.S130423_07 NA NA NA 64751 41339
#isotopicDistr.S130423_09 NA NA NA 66143 34858
#isotopicDistr.S130423_11 NA NA NA 54542 30211
#isotopicDistr.S130423_13 NA NA NA 47453 30213
#isotopicDistr.S130423_06 12833 5303 955 63133 33950
#isotopicDistr.S130423_08 9116 4281 NA 53849 31806
#isotopicDistr.S130423_10 9077 3418 NA 50046 24914
#isotopicDistr.S130423_12 NA NA NA 40965 20147
#isotopicDistr.S130423_14 NA NA NA 31636 19936
x <- x * unsat
# 1_0 1_1 1_2 2_0 2_1
#isotopicDistr.S130423_05 NA NA NA 0 0
#isotopicDistr.S130423_07 NA NA NA 0 0
#isotopicDistr.S130423_09 NA NA NA 0 0
#isotopicDistr.S130423_11 NA NA NA 0 0
#isotopicDistr.S130423_13 NA NA NA 0 0
#isotopicDistr.S130423_06 12833 5303 955 0 0
#isotopicDistr.S130423_08 9116 4281 NA 0 0
#isotopicDistr.S130423_10 9077 3418 NA 0 24914
#isotopicDistr.S130423_12 NA NA NA 0 20147
#isotopicDistr.S130423_14 NA NA NA 0 19936
And here we see the second problem. Almost all doubly charged ions are saturated (and not used for the prediction); that's why our predicted intensities for `S130423_06` and `S130423_08` are very low. The runs with unsaturated doubly charged ions, `S130423_10` and `S130423_14`, yield higher intensities.
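Both problems can be read off by counting the usable isotopes per run. A base R sketch on the ILFDYSK matrix shown above; the comparison `x < saturationThreshold` is assumed to mirror what `.isUnsaturatedIsotope` returns:

```r
saturationThreshold <- 3e4
runs <- c("S130423_05", "S130423_07", "S130423_09", "S130423_11", "S130423_13",
          "S130423_06", "S130423_08", "S130423_10", "S130423_12", "S130423_14")
## isotopic matrix of ILFDYSK (rows: runs, columns: charge_isotope)
x <- matrix(c(NA,    NA,   NA,  70214, 40605,
              NA,    NA,   NA,  64751, 41339,
              NA,    NA,   NA,  66143, 34858,
              NA,    NA,   NA,  54542, 30211,
              NA,    NA,   NA,  47453, 30213,
              12833, 5303, 955, 63133, 33950,
              9116,  4281, NA,  53849, 31806,
              9077,  3418, NA,  50046, 24914,
              NA,    NA,   NA,  40965, 20147,
              NA,    NA,   NA,  31636, 19936),
            ncol=5, byrow=TRUE,
            dimnames=list(runs, c("1_0", "1_1", "1_2", "2_0", "2_1")))
## assumed equivalent of .isUnsaturatedIsotope
unsat <- x < saturationThreshold
## number of isotopes per run that can be used for the prediction
usable <- rowSums(unsat, na.rm=TRUE)
## the same count restricted to the doubly charged ions
usable2 <- rowSums(unsat[, c("2_0", "2_1")], na.rm=TRUE)
cbind(usable, usable2)
```

Runs with `usable == 0` cannot be predicted at all, and runs with `usable2 == 0` rely on the low-intensity singly charged ions only.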
The `"sum"` problem:
The isotopic matrix of the fifth peptide (`LAQANGWGVMVSHR`) is:
2_0 2_1 2_2 2_3 2_4 3_0 3_1 3_2 3_3 3_4
isotopicDistr.S130423_05 24557 19414 8161 3387 1424 66080 59557 33660 21962 9124
isotopicDistr.S130423_07 26498 21050 9577 3939 1922 68076 58367 42972 24453 13588
isotopicDistr.S130423_09 NA NA NA NA NA NA NA NA NA NA
isotopicDistr.S130423_11 16756 16158 5719 2899 1341 56830 53333 30881 16859 7490
isotopicDistr.S130423_13 17117 12276 5233 2072 NA 48460 41008 30369 13261 8023
isotopicDistr.S130423_06 24851 17418 9016 NA NA 65222 56007 37009 NA NA
isotopicDistr.S130423_08 19034 13436 6568 NA NA 52526 44629 32459 NA NA
isotopicDistr.S130423_10 14319 11793 5010 NA NA 55761 47528 25765 NA NA
isotopicDistr.S130423_12 15707 13055 5352 NA NA 49521 45595 26457 NA NA
isotopicDistr.S130423_14 17540 12890 6435 NA NA 44465 43195 29950 NA NA
As you can see, the third run `isotopicDistr.S130423_09` is completely missing. In the current definition of `onlyCommonIsotopes=TRUE` that means there is no isotope present in all runs (because `S130423_09` has no isotopes at all).
For the sum method, can you please not consider runs for which we don't have the peptide identity, i.e. all the isotopes are NA? Simply ignore the line or convert to 0. This is the final fix I need to finish the paper. For the theoretical methods we need to think about what to do in instances like the one above, where requantification is not possible.
Simply ignore the line or convert to 0
Converting to 0 is not advisable. If a line is ignored, this would need to be recorded somewhere, or reported to the user at the very least. The best is to keep all features, but set those that don't return any value to `NA`. It is then just a matter of calling `filterNA` to remove them afterwards.
sorry guys did not mean to tell you how to code
sorry guys did not mean to tell you how to code
No worries - I just wanted to make sure we stay away from wild 0-imputation.
Ok, now missing runs (runs without any recorded intensity value) are ignored for `requantify(..., method="sum", onlyCommonIsotopes=TRUE)`:
combMSNset <- readRDS("refCombMSNSet.RDS")
## fetch peptide of interest
poi <- combMSNset[5,]
## grep isotopic distribution
iso <- fData(poi)[, grep("isotopicDistr", fvarLabels(combMSNset))]
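# NONE: original intensities
exprs(poi)
# FALSE: requantified with onlyCommonIsotopes=FALSE
exprs(requantify(poi, method="sum", saturationThreshold=3e4, onlyCommonIsotopes=FALSE))
# TRUE: requantified with onlyCommonIsotopes=TRUE
exprs(requantify(poi, method="sum", saturationThreshold=3e4, onlyCommonIsotopes=TRUE))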
onlyCommonIsotopes | S130423_05 | S130423_07 | S130423_09 | S130423_11 | S130423_13 | S130423_06 | S130423_08 | S130423_10 | S130423_12 | S130423_14 |
---|---|---|---|---|---|---|---|---|---|---|
NONE | 247326 | 270442 | NA | 208266 | 177819 | 209523 | 168652 | 160176 | 155687 | 154475 |
FALSE | 88029 | 101027 | NA | 67222 | 57982 | 51285 | 39038 | 31122 | 34114 | 36865 |
TRUE | 52132 | 57125 | NA | 38633 | 34626 | 51285 | 39038 | 31122 | 34114 | 36865 |
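The `TRUE` row can be reproduced by hand. A base R sketch on a subset of the LAQANGWGVMVSHR runs from the matrix above: runs without any isotope are skipped when the common isotopes are determined and stay `NA` in the result. `sumCommonSketch` is a hypothetical helper, not the synapter code:

```r
isoNames <- c("2_0", "2_1", "2_2", "2_3", "2_4", "3_0", "3_1", "3_2", "3_3", "3_4")
## subset of the isotopic matrix of LAQANGWGVMVSHR (S130423_09 is missing)
m <- rbind(
  S130423_05=c(24557, 19414, 8161, 3387, 1424, 66080, 59557, 33660, 21962, 9124),
  S130423_09=rep(NA_real_, 10),
  S130423_13=c(17117, 12276, 5233, 2072, NA, 48460, 41008, 30369, 13261, 8023),
  S130423_06=c(24851, 17418, 9016, NA, NA, 65222, 56007, 37009, NA, NA),
  S130423_10=c(14319, 11793, 5010, NA, NA, 55761, 47528, 25765, NA, NA))
colnames(m) <- isoNames

## hypothetical helper: sum of the common unsaturated isotopes per run
sumCommonSketch <- function(m, saturationThreshold) {
  res <- setNames(rep(NA_real_, nrow(m)), rownames(m))
  ## ignore runs without any recorded isotope
  present <- rowSums(!is.na(m)) > 0
  sub <- m[present, , drop=FALSE]
  ## isotopes seen in all remaining runs and unsaturated in all of them
  common <- colSums(is.na(sub)) == 0 &
    apply(sub, 2, function(col) all(col < saturationThreshold, na.rm=TRUE))
  res[present] <- rowSums(sub[, common, drop=FALSE])
  res
}

res <- sumCommonSketch(m, saturationThreshold=3e4)
res
```

Keeping the missing run as `NA` instead of dropping it or converting it to 0 leaves the feature in place, so it can still be removed later with `filterNA` if needed.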
Hey Sebastian. Problems again. Please have a look.
satThresholdIon <- 3e4
combMSNSet2 <- readRDS ("Y://RAW/pvs22//_QTOF_DATA_data3//synapter2paper//kuharev2015//synapter2//output//UDMSE//combMSNSet.RDS")
combMSNSet2 <- requantify (combMSNSet2, method = "sum",
saturationThreshold= satThresholdIon,
onlyCommonIsotopes=FALSE)
getting the following message:
Error in seq.default(from = 1L, to = nall, by = 2L) : wrong sign in 'by' argument
already tried restarting R session and reinstalling synapter. Can you reproduce that?
I have never seen this kind of error before. It seems that the peptide `YATALAK` (row `3412`) has only `NA` values (except `precursor.mhp.S130423_05`: `737.4174`). I don't know why this happens.
Nevertheless it was a bug that the functions could not handle entries without any non-`NA` value. That is fixed now.
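For context, this error message is what base R's `seq` raises when `to` is smaller than `from` while `by` is positive, which is what happens when the length is 0. A minimal sketch of the failure and of one possible guard; `oddIndices` is a hypothetical helper, `nall` is just the name taken from the error message, and the actual fix in synapter may look different:

```r
## reproduces the message: 'to' is smaller than 'from' while 'by' is positive
err <- tryCatch(seq(from=1L, to=0L, by=2L), error=function(e) conditionMessage(e))
err

## hypothetical guard: return no indices for an empty entry
oddIndices <- function(nall) {
  if (nall < 1L) {
    return(integer(0))
  }
  seq.int(from=1L, to=nall, by=2L)
}

oddIndices(0L)
oddIndices(5L)
```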
@pavel-shliaha: Why didn't we notice this before? Where does `YATALAK` come from?
lets keep this issue open for now
requantify (readRDS("Y://RAW/pvs22//_QTOF_DATA_data3//synapter2paper//kuharev2015//synapter2_intensity//output//UDMSE//refcombMSNSetNS.RDS"),
method = "th.mean",
saturationThreshold= 3e4)
returns
Error in Mod(z$values) : non-numeric argument to function
could you please have a look
I currently have no access to prot-filesrv1 (the networkmanager-strongswan plugin seems to not accept/send the password). Can you send me the `refcombMSNSetNS.RDS` via e-mail/dropbox/google drive?
shared the file with you via google drive
Sorry, but I can't reproduce the error. It just works for me. Could you try again and call `traceback()` directly after the error? The output of `sessionInfo()` could also be helpful.
@sgibb started having trouble with the theoretical method as soon as I updated synapter. I have shared the file with you through google drive.
testMSNSet <- readRDS ("MSnbaseProblemRequant.RDS")
satThresholdIon <- 3e4
requantify (testMSNSet, method = "th.mean", saturationThreshold= satThresholdIon)
error:
Error in Mod(z$values) : non-numeric argument to function
Sorry guys now it works after restarting R session (but I also restarted before posting). I am not sure but perhaps this intermittent error has smth to do with the BRAIN package.
The prototype of suggested function:
requantify <- function (synapterObj , satThreshold, minIsotopes)
synapterObj - list of synapter objects
satThreshold - intensity above which accurate ion recording is not possible due to saturation
minIsotopes - minimum number of isotopes accepted for requantitation
For every peptide the function will find the common unsaturated ions from the supplied synapter objects and requantify the EMRT using these ions only. satThreshold is used to determine which ions saturate. minIsotopes is the minimum number of isotopes that are sufficient to requantify (i.e. sometimes only a few isotopes will be seen in all samples below saturation and above the LOD).
Important: the information on peptide isotopes is not present in current synapter objects! The Pep3D file that is loaded into synapter is filtered so that only one isotope is preserved for each EMRT. This means that the synapter object will have to be modified to include this information. The current consensus is that a list should be created, each element of which represents a single EMRT and contains the ion intensities of that EMRT.
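A rough base R sketch of this prototype for a single peptide, assuming the isotope information is available as a list with one named intensity vector per run (the list layout and the name `requantifySketch` are assumptions made for illustration). The example values are the AGMVAGVIVNR isotopes of two runs from earlier in this thread:

```r
## hypothetical sketch: isotopes is a list (one element per run) of named
## numeric vectors holding the ion intensities of one EMRT
requantifySketch <- function(isotopes, satThreshold, minIsotopes) {
  ## names of the unsaturated ions of each run
  unsat <- lapply(isotopes, function(x) names(x)[!is.na(x) & x < satThreshold])
  ## ions that are unsaturated in all runs
  common <- Reduce(intersect, unsat)
  if (length(common) < minIsotopes) {
    ## too few common ions: requantification is not possible
    return(setNames(rep(NA_real_, length(isotopes)), names(isotopes)))
  }
  vapply(isotopes, function(x) sum(x[common]), numeric(1))
}

iso <- list(
  S130423_05=c("1_0"=1378, "1_1"=608, "1_2"=273,
               "2_0"=25649, "2_1"=15016, "2_2"=5707, "2_3"=1150),
  S130423_06=c("1_0"=10708, "1_1"=4943, "1_2"=1390, "1_3"=341,
               "2_0"=88884, "2_1"=63955, "2_2"=27550, "2_3"=6491))

res <- requantifySketch(iso, satThreshold=3e4, minIsotopes=3)
res
## demanding more common isotopes than are available yields NA
resStrict <- requantifySketch(iso, satThreshold=3e4, minIsotopes=6)
```

Returning `NA` when fewer than `minIsotopes` common ions remain keeps the peptide in the result, so the decision to drop it stays with the caller.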