Closed sneumann closed 8 years ago
Hi, I found the issue: when downloading MTBLS213 and running xcmsSet() on the files,
the sample classes are missing, but I am unsure which classes the geoRge script
expects. Just setting sampclass(xset) <- c(rep("CELL_Glc12", 6), rep("CELL_Glc13", 6))
is not enough, since I still get an error. Can you provide the code to set the right classes?
Thanks, Steffen
Hi Steffen, thanks for reporting your problem,
I think I know where this issue comes from.
Could you paste here the result of running xset3@phenoData$class?
Hi Jordi, thanks for the quick response. Here is the phenoData:
> xset3@phenoData$class
[1] CELL_Glc12 CELL_Glc12 CELL_Glc12 CELL_Glc12 CELL_Glc12 CELL_Glc12 CELL_Glc13 CELL_Glc13 CELL_Glc13
[10] CELL_Glc13 CELL_Glc13 CELL_Glc13
Levels: CELL_Glc12 CELL_Glc13
The whole xset3 is:
> xset3
An "xcmsSet" object with 12 samples
Time range: 2.5-1259.8 seconds (0-21 minutes)
Mass range: 100.0112-1499.59 m/z
Peaks: 125327 (about 10444 per sample)
Peak Groups: 10381
Sample classes: CELL_Glc12, CELL_Glc13
Peak picking was performed on MS1.
Profile settings: method = bin
step = 0.1
Memory usage: 13 MB
> phenoData(xset3)
class
CELL_Glc12_05mM_Normo_04 CELL_Glc12
CELL_Glc12_05mM_Normo_05 CELL_Glc12
CELL_Glc12_05mM_Normo_06 CELL_Glc12
CELL_Glc12_25mM_Normo_16 CELL_Glc12
CELL_Glc12_25mM_Normo_17 CELL_Glc12
CELL_Glc12_25mM_Normo_18 CELL_Glc12
CELL_Glc13_05mM_Normo_01 CELL_Glc13
CELL_Glc13_05mM_Normo_02 CELL_Glc13
CELL_Glc13_05mM_Normo_03 CELL_Glc13
CELL_Glc13_25mM_Normo_13 CELL_Glc13
CELL_Glc13_25mM_Normo_14 CELL_Glc13
CELL_Glc13_25mM_Normo_15 CELL_Glc13
Dear Steffen,
I believe you did not use folders to organise the .mzXML files when preprocessing with XCMS. We did not expect that, honestly, because we (our lab) always use folders. Since geoRge depends on xset3@phenoData$class to define the different experimental conditions, this can be solved by setting the classes manually: xset3@phenoData$class <- your_values_vector.
As an example, phenoData should look like this:
> phenoData(xset3)
class
CELL_Glc12_05mM_Normo_04 CELL_Glc12_05mM_Normo
CELL_Glc12_05mM_Normo_05 CELL_Glc12_05mM_Normo
CELL_Glc12_05mM_Normo_06 CELL_Glc12_05mM_Normo
CELL_Glc12_25mM_Normo_16 CELL_Glc12_25mM_Normo
CELL_Glc12_25mM_Normo_17 CELL_Glc12_25mM_Normo
CELL_Glc12_25mM_Normo_18 CELL_Glc12_25mM_Normo
CELL_Glc13_05mM_Normo_01 CELL_Glc13_05mM_Normo
CELL_Glc13_05mM_Normo_02 CELL_Glc13_05mM_Normo
CELL_Glc13_05mM_Normo_03 CELL_Glc13_05mM_Normo
CELL_Glc13_25mM_Normo_13 CELL_Glc13_25mM_Normo
CELL_Glc13_25mM_Normo_14 CELL_Glc13_25mM_Normo
CELL_Glc13_25mM_Normo_15 CELL_Glc13_25mM_Normo
This can be done by removing the number at the end of the sample names. Or you could even create a new phenoData(xset3)$class value on your own, as long as it has the same structure as in the example given above.
geoRge will evaluate the structure of the string by checking the position of the separator using the sep.pos value, and then save the values for the PuInc analysis.
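For readers following along, here is a minimal sketch of deriving such class labels from the sample names by stripping the trailing number. It uses only base R and the sample names from the listing above; the names sample_names and derived_classes are made up for illustration, and the final assignment is shown as a comment because it needs a real xcmsSet object.

```r
# Sample names copied from the phenoData listing in this thread.
sample_names <- c(
  "CELL_Glc12_05mM_Normo_04", "CELL_Glc12_05mM_Normo_05", "CELL_Glc12_05mM_Normo_06",
  "CELL_Glc12_25mM_Normo_16", "CELL_Glc12_25mM_Normo_17", "CELL_Glc12_25mM_Normo_18",
  "CELL_Glc13_05mM_Normo_01", "CELL_Glc13_05mM_Normo_02", "CELL_Glc13_05mM_Normo_03",
  "CELL_Glc13_25mM_Normo_13", "CELL_Glc13_25mM_Normo_14", "CELL_Glc13_25mM_Normo_15"
)

# Remove the trailing "_<digits>" suffix so each sample maps to its condition.
derived_classes <- sub("_[0-9]+$", "", sample_names)

print(unique(derived_classes))

# With a real xcmsSet object, the vector would then be assigned as suggested above
# (not run here, since it needs the xcms package and the data):
# xset3@phenoData$class <- derived_classes
```

The regular expression assumes every sample name ends in an underscore followed by digits; names that follow a different pattern would need their own rule.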
Ok, I am one step further:
> phenoData(xset3)
class
CELL_Glc12_05mM_Normo_04 CELL_Glc12_05mM_Normo
CELL_Glc12_05mM_Normo_05 CELL_Glc12_05mM_Normo
...
CELL_Glc13_25mM_Normo_15 CELL_Glc13_25mM_Normo
> s1 <- PuInc_seeker(XCMSet=xset3,ULtag="CELL_Glc12",Ltag="CELL_Glc13",sep.pos="f")
Error in `[.data.frame`(D1, , filtsampsint) : undefined columns selected
I think that is because filtsampsint contains NA values. I changed
the subselection now to filtsampsint <- na.omit(filtsamps[apply(meanintensities, 2, function(x) all(x < PuInc.int.lim))])
and successfully got the s1 object. You'll get a pull request with that change soon.
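To illustrate the mechanism, here is a small base R sketch with toy data. The values are hypothetical stand-ins, not the real MTBLS213 intensities; only the object names D1, filtsamps, meanintensities, PuInc.int.lim and filtsampsint are taken from the thread.

```r
# Toy stand-ins (hypothetical values) for the objects named in the thread.
D1 <- data.frame(s1 = c(10, 20), s2 = c(30, 40), s3 = c(50, 60))
filtsamps <- names(D1)
PuInc.int.lim <- 100

# One column of mean intensities contains an NA.
meanintensities <- cbind(s1 = c(5, 7), s2 = c(NA, 8), s3 = c(500, 9))

# all() returns NA when no value is FALSE but at least one is NA,
# so keep is TRUE, NA, FALSE here.
keep <- apply(meanintensities, 2, function(x) all(x < PuInc.int.lim))

# Indexing with an NA yields an NA column name, which data.frame rejects.
with_na <- filtsamps[keep]
failed <- inherits(try(D1[, with_na], silent = TRUE), "try-error")

# Dropping the NA entries first makes the column selection valid.
filtsampsint <- na.omit(filtsamps[keep])
subset_ok <- D1[, filtsampsint, drop = FALSE]
print(names(subset_ok))
```

Note that na.omit() silently discards the samples whose comparison was undecidable; whether that is the desired behaviour depends on the analysis.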
The next issue I get is
> s2 <- basepeak_finder(PuIncR=s1,XCMSet=xset3,ULtag="CELL_Glc12",Ltag="CELL_Glc13",
+ sep.pos="f",UL.atomM=12.0,L.atomM=13.003355,
+ ppm.s=6.5,Basepeak.minInt=2000)
Error in if ((mi12 > Basepeak.minInt)) { :
missing value where TRUE/FALSE needed
The s1 object is
> str(s1)
List of 4
$ PuInc : num [1:40, 1:4] 112 112 129 130 141 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:40] "1" "2" "3" "4" ...
.. ..$ : chr [1:4] "mzmed" "rtmed" "rtmin" "rtmax"
$ PuInc_conditions: Named chr [1:40] "25mM_Normo" "25mM_Normo" "05mM_Normo" "05mM_Normo" ...
..- attr(*, "names")= chr [1:40] "105" "106" "429" "447" ...
$ pvalue :'data.frame': 1297 obs. of 2 variables:
..$ 05mM_Normo: num [1:1297] 0.82 0.274 0.591 0.22 0.596 ...
..$ 25mM_Normo: num [1:1297] 0.936 0.662 0.453 0.305 0.847 ...
$ foldchange :'data.frame': 1297 obs. of 2 variables:
..$ 05mM_Normo: num [1:1297] 1.07 1.16 1.29 1.22 -1.03 ...
..$ 25mM_Normo: num [1:1297] -1.01 -1.06 1.04 -1.28 1.03 ...
With the following traceback:
8 FUN(X[[i]], ...)
7 lapply(X = X, FUN = FUN, ...)
6 sapply(cond, USE.NAMES = F, simplify = T, function(x) {
mi12 <- mi[grep(ULtag, names(mi))]
mi12 <- mi12[grep(x, names(mi12))]
if ((mi12 > Basepeak.minInt)) { ...
5 sapply(cond, USE.NAMES = F, simplify = T, function(x) {
mi12 <- mi[grep(ULtag, names(mi))]
mi12 <- mi12[grep(x, names(mi12))]
if ((mi12 > Basepeak.minInt)) { ... at george.R#194
4 FUN(X[[i]], ...)
3 lapply(rownames(res_inc), function(y) {
isot <- sapply(1:max_atoms(res_inc[y, "mzmed"], L.atomM),
function(x) {
res_inc[y, "mzmed"] - (x * mass_diff) ...
2 lapply(rownames(res_inc), function(y) {
isot <- sapply(1:max_atoms(res_inc[y, "mzmed"], L.atomM),
function(x) {
res_inc[y, "mzmed"] - (x * mass_diff) ... at george.R#149
1 basepeak_finder(PuIncR = s1, XCMSet = xset3, ULtag = "CELL_Glc12",
Ltag = "CELL_Glc13", sep.pos = "f", UL.atomM = 12, L.atomM = 13.003355,
ppm.s = 6.5, Basepeak.minInt = 2000)
Any ideas ? Yours, Steffen
Debugging a bit further I find that mi12 is NULL, and mi is (NA, NA, 1267, 936). I am running R-3.2.3, do you have a version with different NA handling behaviour ?
Dear Steffen,
Thank you for that. I think that might be it: when the mi object (mean intensity) is calculated, mean() is not set up to deal with NA values, so we might as well use na.rm=T for that.
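A minimal base R sketch of both points, assuming nothing about the geoRge internals beyond the names mentioned in this thread: mean() propagates NA unless na.rm = TRUE is set, and an if() condition needs a guard when the value may be NA or empty. The helper passes_threshold is hypothetical and only for illustration.

```r
Basepeak.minInt <- 2000

# The four values reported for mi in this thread, reused here as plain input.
mi_values <- c(NA, NA, 1267, 936)

raw_mean <- mean(mi_values)                  # NA, because NA propagates
clean_mean <- mean(mi_values, na.rm = TRUE)  # mean of the non-missing values

# Hypothetical guard: TRUE only for a single, non-missing value above the limit.
# The && operators short-circuit, so NULL and NA never reach the comparison.
passes_threshold <- function(x, limit) {
  length(x) == 1 && !is.na(x) && x > limit
}

print(passes_threshold(NULL, Basepeak.minInt))
print(passes_threshold(NA, Basepeak.minInt))
print(passes_threshold(clean_mean, Basepeak.minInt))
```

Even with na.rm = TRUE, a group in which every value is missing yields NaN, so a guard of this kind is still useful before the if() test.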
I also use R version 3 (R-3.1.0), so we should not run into version compatibility issues (hopefully). Edit: I assume you are not using fillPeaks() when running XCMS, that is the most probable source of NA values.
Still debugging here. Just to be sure, you don't fillPeaks() the xsets ?
On the contrary, we do use fillPeaks() when running XCMS.
Edit:
I just checked "Example.R" and saw that the xset3 <- fillPeaks(xset3) call was missing. I just added it.
Sorry for the inconvenience.
Ok, that was it: fillPeaks() was missing from Example.R. Thanks for your patience, yours, Steffen
Hi, I am trying to run the example script with the MTBLS213 data, and get the following error in PuInc_seeker(). Very often this is due to a missing drop=FALSE in some access, so that a matrix/data frame gets "helpfully" changed from 2D to just a vector by R.
Yours, Steffen
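As a generic illustration of the drop=FALSE pitfall (toy data in base R, not the geoRge code; all object names here are made up):

```r
# A small matrix and the equivalent data frame with two sample columns.
intensities <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("s1", "s2")))
intensities_df <- as.data.frame(intensities)

# Selecting a single column drops the result to a plain vector by default.
dropped <- intensities[, "s1"]
dropped_df <- intensities_df[, "s1"]

# drop = FALSE keeps the two-dimensional structure.
kept <- intensities[, "s1", drop = FALSE]
kept_df <- intensities_df[, "s1", drop = FALSE]

print(dim(dropped))  # NULL: no longer a matrix
print(dim(kept))     # still 3 rows by 1 column
```

Code that later calls apply(), colnames() or column indexing on the result only works in the second form, which is why a selection that happens to match a single column can fail unexpectedly.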