lgatto / MSnbase

Base Classes and Functions for Mass Spectrometry and Proteomics
http://lgatto.github.io/MSnbase/
126 stars 46 forks source link

Randomly failing unit tests #138

Closed jorainer closed 8 years ago

jorainer commented 8 years ago

@lgatto are you also getting failing unit tests e.g. in _testequalMSnExp.R? What really puzzles me is that I don't get any errors when I run this file selectively with test_file("test_equalMSnExps.R") or if I run all tests using test_dir("."), but I get errors when I use source("testthat.R"):

1. Error: Equality function (@test_equalMSnExps.R#12) --------------------------
no function to return from, jumping to top level
1: expect_true(all.equal(inmem1, ondisk1)) at testthat/test_equalMSnExps.R:12
2: expect(identical(as.vector(object), TRUE), sprintf("%s isn't true.", lab), info = info)
3: as.expectation(exp, ..., srcref = srcref)
4: identical(as.vector(object), TRUE)
5: as.vector(object)
6: all.equal(inmem1, ondisk1)
7: all.equal(inmem1, ondisk1)
8: equalMSnExps(target, current, ...)
9: spectra(ondisk)
10: spectra(ondisk)
11: .local(object, ...)
12: spectrapply(object, BPPARAM = BPPARAM)
13: bplapply(fDataPerFile, FUN = .applyFun2SpectraOfFileMulti, filenames = fNames, queue = theQ, 
       APPLYFUN = FUN, BPPARAM = BPPARAM, ...)
14: bplapply(fDataPerFile, FUN = .applyFun2SpectraOfFileMulti, filenames = fNames, queue = theQ, 
       APPLYFUN = FUN, BPPARAM = BPPARAM, ...)
15: lapply(X, FUN, ...)
16: FUN(X[[i]], ...)
17: withCallingHandlers({
       tryCatch({
           FUN(...)
       }, error = handle_error)
   }, warning = handle_warning)
18: tryCatch({
       FUN(...)
   }, error = handle_error)
19: tryCatchList(expr, classes, parentenv, handlers)
20: tryCatchOne(expr, names, parentenv, handlers[[1L]])
21: doTryCatch(return(expr), name, parentenv, handler)
22: FUN(...)
23: lengths(allSpect)
24: (function (classes, fdef, mtable) 
   {
       methods <- .findInheritedMethods(classes, fdef, mtable)
       if (length(methods) == 1L) 
           return(methods[[1L]])
       else if (length(methods) == 0L) {
           cnames <- paste0("\"", vapply(classes, as.character, ""), "\"", collapse = ", ")
           stop(gettextf("unable to find an inherited method for function %s for signature %s", 
               sQuote(fdef@generic), sQuote(cnames)), domain = NA)
       }
       else stop("Internal error in finding inherited methods; didn't return a unique method", 
           domain = NA)
   })(list("list"), structure(function (x, use.names = TRUE) 
   standardGeneric("lengths"), generic = structure("lengths", package = "BiocGenerics"), package = "BiocGenerics", group = list(), valueClass = character(0), signature = "x", default = structure(function (x, 
       use.names = TRUE) 
   .Internal(lengths(x, use.names)), target = structure("ANY", class = structure("signature", package = "methods"), .Names = "x", package = "methods"), defined = structure("ANY", class = structure("signature", package = "methods"), .Names = "x", package = "methods"), generic = structure("lengths", package = "base"), class = structure("derivedDefaultMethod", package = "methods")), skeleton = (structure(function (x, 
       use.names = TRUE) 
   .Internal(lengths(x, use.names)), target = structure("ANY", class = structure("signature", package = "methods"), .Names = "x", package = "methods"), defined = structure("ANY", class = structure("signature", package = "methods"), .Names = "x", package = "methods"), generic = structure("lengths", package = "base"), class = structure("derivedDefaultMethod", package = "methods")))(x, 
       use.names), class = structure("standardGeneric", package = "methods")), <environment>)
25: .findInheritedMethods(classes, fdef, mtable)
26: .newSignature(classes, fdef@signature)
27: packageSlot(classes)
28: (function (x) 
   x$.self$finalize())(<environment>)

I'll investigate further, but eventually you have already a solution.

jorainer commented 8 years ago

Got rid of many errors, but this one, occurring randomly, drives me nuts:

attempt to apply non-function
1: readMSData(f, verbose = FALSE, centroided = FALSE) at testthat/test_OnDiskMSnExp2.R:26
2: new("Spectrum2", scanIndex = as.integer(hd$seqNum), merged = as.numeric(hd$mergedScan),.
       polarity = hd$polarity, precScanNum = as.integer(scanNums[i]), precursorMz = hd$precursorMZ,.
       precursorIntensity = hd$precursorIntensity, precursorCharge = hd$precursorCharge,.
       collisionEnergy = hd$collisionEnergy, rt = hd$retentionTime, acquisitionNum = hd$acquisitionNum,.
       mz = .p[, 1], tic = hd$totIonCurrent, intensity = .p[, 2], fromFile = filen,.
       centroided = centroided)
3: initialize(value, ...)
4: initialize(value, ...)
5: .local(.Object, ...)
6: callNextMethod(.Object, ..., intensity = intensity, mz = mz, peaksCount = length(mz))
7: eval(call, callEnv)
8: eval(expr, envir, enclos)
9: .nextMethod(.Object, ..., intensity = intensity, mz = mz, peaksCount = length(mz))
10: .local(.Object, ...)
11: callNextMethod(.Object, ...)
12: addNextMethod(method, f, envir = methodEnv)
13: addNextMethod(method, f, envir = methodEnv)
14: .findNextFromTable(method, f, optional, envir)
15: .findInheritedMethods(defined, fdef, mtable = NULL, excluded = excluded)
16: is(m, "MethodDefinition")
17: (function (x).
   x$.self$finalize())(<environment>)
jorainer commented 8 years ago

Could fix most of the errors (https://github.com/lgatto/MSnbase/commit/0675639e9d4b5887e0ec9788df19047816a2535b). Specifically, the first described above seems to have gone after explicitly calling base::lengths( in .applyFun2SpectraOfFileMulti. The second from above is still not fixed.

jorainer commented 8 years ago

Seems to be fixed now. I'm guessing that the callNextMethod with ... and additional arguments caused the problem.

lgatto commented 8 years ago

I'm also getting crazy with similar issues now...

lgatto commented 8 years ago

Closing now, as I get it to check fine. Fingers' crossed!

lgatto commented 8 years ago

In 29309c8d10d52d8d95656c548d7148b6cf1e5574, I renames arguments centroided, msLevel and smoothed were renamed with a . at the end (as in some methods) in a desperate attempt to fix some the these random attempt to apply non-function failing tests.

First test after that commit and BANG,

Failed -------------------------------------------------------------------------
1. Error: Coercion to MSnExp (@test_OnDiskMSnExp.R#48) -------------------------
attempt to apply non-function
1: readMSData(f, msLevel = 2, centroided = NA, verbose = FALSE) at /home/lg390/R/x86_64-pc-linux-gnu-library/3.3/MSnbase/tests/testthat/test_OnDiskMSnExp.R:48
2: new("Spectrum2", scanIndex = as.integer(hd$seqNum), merged = as.numeric(hd$mergedScan), 
       polarity = hd$polarity, precScanNum = as.integer(scanNums[i]), precursorMz = hd$precursorMZ, 
       precursorIntensity = hd$precursorIntensity, precursorCharge = hd$precursorCharge, 
       collisionEnergy = hd$collisionEnergy, rt = hd$retentionTime, acquisitionNum = hd$acquisitionNum, 
       mz = .p[, 1], tic = hd$totIonCurrent, intensity = .p[, 2], fromFile = filen, 
       centroided = centroided.)
3: initialize(value, ...)
4: initialize(value, ...)
5: .local(.Object, ...)
6: callNextMethod(.Object, ...)
7: eval(call, callEnv)
8: eval(expr, envir, enclos)
9: .nextMethod(.Object, ...)
10: .local(.Object, ...)
11: callNextMethod(.Object, ...)
12: addNextMethod(method, f, envir = methodEnv)
13: addNextMethod(method, f, envir = methodEnv)
14: .findNextFromTable(method, f, optional, envir)
15: .findInheritedMethods(defined, fdef, mtable = NULL, excluded = excluded)
16: .newSignature(classes, fdef@signature)
17: (function (x) 
   x$.self$finalize())(<environment>)

2. Error: Coercion to MSnExp (@test_OnDiskMSnExp.R#48) -------------------------
attempt to apply non-function
1: readMSData(f, msLevel = 2, centroided = NA, verbose = FALSE) at /home/lg390/R/x86_64-pc-linux-gnu-library/3.3/MSnbase/tests/testthat/test_OnDiskMSnExp.R:48
2: new("Spectrum2", scanIndex = as.integer(hd$seqNum), merged = as.numeric(hd$mergedScan), 
       polarity = hd$polarity, precScanNum = as.integer(scanNums[i]), precursorMz = hd$precursorMZ, 
       precursorIntensity = hd$precursorIntensity, precursorCharge = hd$precursorCharge, 
       collisionEnergy = hd$collisionEnergy, rt = hd$retentionTime, acquisitionNum = hd$acquisitionNum, 
       mz = .p[, 1], tic = hd$totIonCurrent, intensity = .p[, 2], fromFile = filen, 
       centroided = centroided.)
3: initialize(value, ...)
4: initialize(value, ...)
5: .local(.Object, ...)
6: callNextMethod(.Object, ...)
7: eval(call, callEnv)
8: eval(expr, envir, enclos)
9: .nextMethod(.Object, ...)
10: .local(.Object, ...)
11: callNextMethod(.Object, ...)
12: addNextMethod(method, f, envir = methodEnv)
13: addNextMethod(method, f, envir = methodEnv)
14: .findNextFromTable(method, f, optional, envir)
15: .findInheritedMethods(defined, fdef, mtable = NULL, excluded = excluded)
16: .newSignature(classes, fdef@signature)
17: (function (x) 
   x$.self$finalize())(<environment>)

so that's clearly not the issue.

lgatto commented 8 years ago

It looks like these failure happen at construction of spectra. I will start by reviewing R-level construction. @jotsetung, if you have time, could you look at the C-level constructors.

You changed a lengths to base::lengths and I also update a split to base::split (7351b4e126eaa9edb741e4b4b4e940030b403180 and e0bc85f9b974bc3f2a24253b3171b21401edc736), sort/base::sort (dc3b72bbd6dabfb47a38c60d0b13151ed4862992), match/base::match (bbb95810208da8191b9271c053b4243562b80023) and order/base::order (3f69b4ee42a514a52aab0c5a2a87802a97f33ff5). We should look out for other such cases.

lgatto commented 8 years ago

By the way, here's my session information

> sessionInfo()
R version 3.3.1 Patched (2016-07-14 r70916)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.4 LTS

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] testthat_1.0.2      MSnbase_1.21.8      ProtGenerics_1.5.1 
[4] BiocParallel_1.7.4  mzR_2.7.3           Rcpp_0.12.6        
[7] Biobase_2.33.0      BiocGenerics_0.19.2

loaded via a namespace (and not attached):
 [1] msdata_0.12.0         compiler_3.3.1        BiocInstaller_1.23.6 
 [4] plyr_1.8.4            iterators_1.0.8       tools_3.3.1          
 [7] zlibbioc_1.19.0       MALDIquant_1.15       digest_0.6.9         
[10] preprocessCore_1.35.0 gtable_0.2.0          lattice_0.20-33      
[13] foreach_1.4.3         stringr_1.0.0         S4Vectors_0.11.9     
[16] IRanges_2.7.11        stats4_3.3.1          grid_3.3.1           
[19] impute_1.47.0         R6_2.1.2              XML_3.98-1.4         
[22] limma_3.29.14         ggplot2_2.1.0         reshape2_1.4.1       
[25] magrittr_1.5          scales_0.4.0          pcaMethods_1.65.0    
[28] codetools_0.2-14      mzID_1.11.2           colorspace_1.2-6     
[31] stringi_1.1.1         affy_1.51.0           munsell_0.4.3        
[34] doParallel_1.0.10     crayon_1.3.2          vsn_3.41.0           
[37] affyio_1.43.0        
jorainer commented 8 years ago

I always get something in readMSData but not in readMSData2 and also not with the C-level constructors.

3. Error: Coercion to MSnExp (@test_OnDiskMSnExp.R#29) -------------------------
attempt to apply non-function
1: readMSData(f, msLevel = 2, centroided = NA, verbose = FALSE) at testthat/test_OnDiskMSnExp.R:29
2: new("Spectrum2", scanIndex = as.integer(hd$seqNum), merged = as.numeric(hd$mergedScan),.
       polarity = hd$polarity, precScanNum = as.integer(scanNums[i]), precursorMz = hd$precursorMZ,.
       precursorIntensity = hd$precursorIntensity, precursorCharge = hd$precursorCharge,.
       collisionEnergy = hd$collisionEnergy, rt = hd$retentionTime, acquisitionNum = hd$acquisitionNum,.
       mz = .p[, 1], tic = hd$totIonCurrent, intensity = .p[, 2], fromFile = filen,.
       centroided = centroided.)
3: initialize(value, ...)
4: initialize(value, ...)
5: .local(.Object, ...)
6: callNextMethod(.Object, ...)
7: eval(call, callEnv)
8: eval(expr, envir, enclos)
9: .nextMethod(.Object, ...)
10: .local(.Object, ...)
11: callNextMethod(.Object, ...)
12: addNextMethod(method, f, envir = methodEnv)
13: addNextMethod(method, f, envir = methodEnv)
14: .findNextFromTable(method, f, optional, envir)
15: .findInheritedMethods(defined, fdef, mtable = NULL, excluded = excluded)
16: .inheritedArgsExpression(m@target, m@defined, body(m))
17: is(ei, "SClassExtension")
18: possibleExtends(cl, class2, class1Def, class2Def)
19: (function (x).
   x$.self$finalize())(<environment>)

And my sessionInfo:

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin15.6.0/x86_64 (64-bit)
Running under: OS X 10.11.6 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] pRolocdata_1.11.0   MSnbase_1.21.8      ProtGenerics_1.5.1 
[4] BiocParallel_1.7.5  mzR_2.7.3           Rcpp_0.12.6        
[7] Biobase_2.33.0      BiocGenerics_0.19.2 testthat_1.0.2.9000

loaded via a namespace (and not attached):
 [1] BiocInstaller_1.23.6  plyr_1.8.4            iterators_1.0.8      
 [4] tools_3.3.1           zlibbioc_1.19.0       digest_0.6.9         
 [7] MALDIquant_1.15       preprocessCore_1.35.0 gtable_0.2.0         
[10] lattice_0.20-33       foreach_1.4.3         stringr_1.0.0        
[13] S4Vectors_0.11.10     IRanges_2.7.12        stats4_3.3.1         
[16] grid_3.3.1            impute_1.47.0         R6_2.1.2             
[19] XML_3.98-1.4          limma_3.29.17         ggplot2_2.1.0        
[22] reshape2_1.4.1        magrittr_1.5          scales_0.4.0         
[25] pcaMethods_1.65.0     codetools_0.2-14      mzID_1.11.2          
[28] colorspace_1.2-6      stringi_1.1.1         affy_1.51.0          
[31] munsell_0.4.3         doParallel_1.0.10     vsn_3.41.0           
[34] crayon_1.3.2          affyio_1.43.0        
lgatto commented 8 years ago

I always get something in readMSData but not in readMSData2 and also not with the C-level constructors.

I have seen issues in readMSData2, which is very confusing. No way to pin it down.

I am starting to wonder if it's not related to running test_check/test_package; I have never had these issues when executing the test code directly.

lgatto commented 8 years ago

I have added tests/testthat/fails.md file in the MSnbase2 branch, to document more systematically the errors and try to identify a pattern.

lgatto commented 8 years ago

Actually, I have been able to reproduce some random failings interactively. The code below is from test_MSnExpFilters.R, run interactively:

f <- msdata::proteomics(full.names = TRUE, pattern = "TMT_Erwinia_1")
inmem2 <- readMSData(f, centroided = NA, verbose = FALSE)  ## That's the MS 2 data.
inmem1 <- readMSData(f, centroided = NA, verbose = FALSE, msLevel = 1)  ## MS 1 data.
ondisk <- readMSData2(f, verbose = FALSE)
ondisk1 <- readMSData2(f, msLevel = 1, verbose = FALSE)
ondisk2 <- readMSData2(f, msLevel = 2, verbose = FALSE)

test_that("filterMsLevel", {
    expect_true(all.equal(inmem2, filterMsLevel(inmem2, msLevel. = 2)))
    expect_true(all.equal(ondisk2, filterMsLevel(ondisk2, msLevel. = 2)))
    ## Compare in-mem MS2 with filterMsLevel of on-disk MS 1 and 2.
    expect_true(all.equal(inmem2, filterMsLevel(ondisk, msLevel. = 2)))
    ## Compare in-mem MS1 with filterMsLevel of on-disk MS 1 and 2
    expect_true(all.equal(inmem1, filterMsLevel(ondisk, msLevel. = 1)))
    ## in-mem MS 2 with on-disk MS 2
    expect_true(all.equal(inmem2, ondisk2))
    expect_true(all.equal(filterMsLevel(inmem2, msLevel. = 2),
                          filterMsLevel(ondisk2, msLevel. = 2)))
    expect_identical(length(filterMsLevel(inmem2, 1)), 0L)
    expect_identical(length(filterMsLevel(ondisk2, 1)), 0L)
    expect_true(all.equal(filterMsLevel(ondisk, 2), ondisk2))
    expect_identical(length(filterMsLevel(ondisk, 1)),
                     sum(fData(ondisk)$msLevel == 1))
})

This is from the console - first line is after C-c C-c the filterMsLevel test. After, it's C-c C-n individual lines:

+ . + Error: Test failed: 'filterMsLevel'
* all.equal(ondisk2, filterMsLevel(ondisk2, msLevel. = 2)) isn't true.
>     expect_true(all.equal(inmem2, filterMsLevel(inmem2, msLevel. = 2)))
>     expect_true(all.equal(ondisk2, filterMsLevel(ondisk2, msLevel. = 2)))
>     expect_true(all.equal(inmem2, filterMsLevel(ondisk, msLevel. = 2)))
>     expect_true(all.equal(inmem1, filterMsLevel(ondisk, msLevel. = 1)))
>     expect_true(all.equal(inmem2, ondisk2))
>     expect_true(all.equal(filterMsLevel(inmem2, msLevel. = 2),
+                           filterMsLevel(ondisk2, msLevel. = 2)))
>     expect_identical(length(filterMsLevel(inmem2, 1)), 0L)
>     expect_identical(length(filterMsLevel(ondisk2, 1)), 0L)
>     expect_true(all.equal(filterMsLevel(ondisk, 2), ondisk2))
>     expect_identical(length(filterMsLevel(ondisk, 1)),
+                      sum(fData(ondisk)$msLevel == 1))
> 
+ . + > 
+ . + >     expect_true(all.equal(ondisk2, filterMsLevel(ondisk2, msLevel. = 2)))
Error: all.equal(ondisk2, filterMsLevel(ondisk2, msLevel. = 2)) isn't true.
>     expect_true(all.equal(ondisk2, filterMsLevel(ondisk2, msLevel. = 2)))
>     expect_true(all.equal(ondisk2, filterMsLevel(ondisk2, msLevel. = 2)))
Error: all.equal(ondisk2, filterMsLevel(ondisk2, msLevel. = 2)) isn't true.
>     expect_true(all.equal(ondisk2, filterMsLevel(ondisk2, msLevel. = 2)))
>     expect_true(all.equal(ondisk2, filterMsLevel(ondisk2, msLevel. = 2)))
>     expect_true(all.equal(ondisk2, filterMsLevel(ondisk2, msLevel. = 2)))
>     expect_true(all.equal(ondisk2, filterMsLevel(ondisk2, msLevel. = 2)))
>     expect_true(all.equal(ondisk2, filterMsLevel(ondisk2, msLevel. = 2)))
>     expect_true(all.equal(ondisk2, filterMsLevel(ondisk2, msLevel. = 2)))
Error: all.equal(ondisk2, filterMsLevel(ondisk2, msLevel. = 2)) isn't true.

and

> table(replicate(20, isTRUE(all.equal(ondisk2, filterMsLevel(ondisk2, msLevel. = 2)))))

FALSE  TRUE 
    2    18 

FALSE equates to "Spectra at position 173 are different." (with different positions). This might relate to issues in Spectrum2 construction.

lgatto commented 8 years ago

To further investigate this, I have modified equalMSnExps for it to run browser() when 2 identical spectra are different, and I got this:

> tmp = replicate(10, all.equal(ondisk2, filterMsLevel(ondisk2, msLevel. = 2)))
Spectra at position 382 are different.
Called from: equalMSnExps(target, current, ...)
Browse[1]> 
debug at /tmp/methods-all.equal.R!2746GHI#10: if (!isTRUE(speq <- all.equal(ondisk[[i]], inmem[[i]], ...))) {
    cat(paste("Spectra at position", i, "are different.\n"))
    browser()
}
Browse[2]> i
[1] 383
Browse[2]> all.equal(inmem[[i]], ondisk[[i]])
[1] TRUE
Browse[2]> all.equal(inmem[[i-1]], ondisk[[i-1]])
 [1] "Attributes: < Component “collisionEnergy”: Mean relative difference: 1 >"      
 [2] "Attributes: < Component “fromFile”: Modes: numeric, logical >"                 
 [3] "Attributes: < Component “fromFile”: target is numeric, current is logical >"   
 [4] "Attributes: < Component “polarity”: Mean relative difference: 1 >"             
 [5] "Attributes: < Component “precScanNum”: Modes: numeric, logical >"              
 [6] "Attributes: < Component “precScanNum”: target is numeric, current is logical >"
 [7] "Attributes: < Component “precursorCharge”: Mean relative difference: 1 >"      
 [8] "Attributes: < Component “precursorIntensity”: Mean relative difference: 1 >"   
 [9] "Attributes: < Component “precursorMz”: Modes: numeric, logical >"              
[10] "Attributes: < Component “precursorMz”: target is numeric, current is logical >"
[11] "Attributes: < Component “tic”: Modes: numeric, logical >"                      
[12] "Attributes: < Component “tic”: target is numeric, current is logical >"        
Browse[2]> all.equal(inmem[[i-2]], ondisk[[i-2]])
[1] TRUE
Browse[2]> inmem[[i-1]]
Object of class "Spectrum2"
 Precursor: 488.8127 
 Retention time: 21:35 
 Charge: 2 
 MSn level: 2 
 Peaks count: 199 
 Total ion count: 87771420 
Browse[2]> ondisk[[i-1]]
Object of class "Spectrum2"
 Precursor: FALSE 
 Retention time: 21:35 
 Charge: 0 
 MSn level: 2 
 Peaks count: 199 
 Total ion count: 87771420 
Browse[2]> ondisk[[i-1]]@fromFile
[1] FALSE
Browse[2]> inmem[[i-1]]@fromFile
[1] 1
Browse[2]> inmem[[i-1]]@tic
[1] 87829984
Browse[2]> ondisk[[i-1]]@tic
[1] FALSE

(it is the i-1th spectra that are different, rather thani) It looks like the ondisk spectrum is actually not even valid

Browse[2]> validObject(ondisk[[i-1]])
Error in validObject(ondisk[[i - 1]]) (from methods-all.equal.R!2746GHI#10) : 
  invalid class “Spectrum2” object: 1: invalid object for slot "precScanNum" in class "Spectrum2": got class "logical", should be or extend class "integer"
invalid class “Spectrum2” object: 2: invalid object for slot "precursorMz" in class "Spectrum2": got class "logical", should be or extend class "numeric"
invalid class “Spectrum2” object: 3: invalid object for slot "tic" in class "Spectrum2": got class "logical", should be or extend class "numeric"
invalid class “Spectrum2” object: 4: invalid object for slot "fromFile" in class "Spectrum2": got class "logical", should be or extend class "integer"
Browse[2]> validObject(inmem[[i-1]])
[1] TRUE

I am not sure if this is the reason for the random fails and the attempt to apply non-function error, but is surely one reason why some test fail. @jotsetung, could you have a look?

jorainer commented 8 years ago

Man that's strange. I can not replicate the above errors. On my system all runs smoothly.

The problem to spectrum validity could be related to some strange casting problem in the C-code. I'll investigate.

lgatto commented 8 years ago

It is not systematic; I had to run it as

replicate(10, all.equal(ondisk2, filterMsLevel(ondisk2, msLevel. = 2)))

with


## The function below work for in memory and on disk MSnExp instances
## and for spectra lists.
equalMSnExps <- function(inmem, ondisk, ...) {
    if ((l <- length(inmem)) != length(ondisk))
        return("Objects have different lengths.")
    if (inherits(ondisk, "OnDiskMSnExp"))
        ondisk <- spectra(ondisk)
    if (inherits(inmem, "OnDiskMSnExp"))
        inmem <- spectra(inmem)
    ## otherwise, expected to be a list of spectra
    for (i in seq_len(l)) {
        if (!isTRUE(speq <- all.equal(ondisk[[i]], inmem[[i]], ...))) {
            ## returnf(paste("Spectra at position", i, "are different."))
            cat(paste("Spectra at position", i, "are different.\n"))
            browser()
        }
    }
    return(TRUE)
}

to get a chance to actually get these results. It happens for different spectra and with different errors.

jorainer commented 8 years ago

I run the test now several hundred times for MS1 spectra without any problems, while it occurs randomly for Spectrum2. I think we can nail it down to the Spectrum constructor. I'll check single and multi-spectrum C-constructors. One potential suspect that I have is the garbage collector that might mess up with the objects (although why only for Spectrum2 and not for Spectrum1?).

lgatto commented 8 years ago

Also, are the PROTECT and UNPROTECTED calls correct? That would probably rather end up in a segfault, but still, as likely culprit.

jorainer commented 8 years ago

Actually i am PROTECTing all SEXP objects I'm allocating, so that should be OK. I assume the plain C-variables don't have to be protected.

The C-constructor creating a single spectrum doesn't cause any errors (neither for Spectrum1 nor for Spectrum2). The C-constructor creating a list of Spectrum1 objects in one call doesn't throw any errors either. The problem is the C-constructor creating a list of Spectrum2 objects, I got 4 times an error in 100 runs. I'll have a closer look at the corresponding C-function, but I don't understand here why for Spectrum1 all works smoothly.

lgatto commented 8 years ago

Could it be something with the recently added smooted slot?


commit 8659768f98ec7e9929f08aae988d248d5e0cc2b9
Author: jotsetung <johannes.rainer@gmail.com>
Date:   Wed Jul 27 06:46:17 2016 +0200

    Add M/Z sorting in C-level Spectrum1 constructor

    o New Spectrum1_mz_sorted and Spectra1_mz_sorted functions that use
      C-level constructors that enforce ordering of M/Z-intensity pairs by
      M/Z.
    o Unit test for new constructors (in test_Spectrum_Constructor.R).
    o Cleanup of C-code.
    o Benchmarks for comparison of M/Z sorting in R and in the
      C-constructors (see issue #136 for results).
    o Add smoothed argument to Spectrum1 constructor.

commit 755db37c8f86e206ff122e7fb6f800de4f436ae8
Author: Laurent <lg390@cam.ac.uk>
Date:   Sat Jul 16 11:02:13 2016 +0100

    fix bug in Spectra2 that messed up centroided and smoothed

commit f51de012085e606a70fedb20c39573cc229b2bec
Author: Laurent <lg390@cam.ac.uk>
Date:   Wed Jul 13 15:51:59 2016 +0100

    added smoothed slot to C-level - closes issue #128

commit 6b88da803cddba0cf292c87a395783b5befc8bb4
Author: Laurent <lg390@cam.ac.uk>
Date:   Fri Jul 8 15:07:11 2016 +0100

    New smoothed slot/accessor/replacement methods
jorainer commented 8 years ago

Shouldn't be the case; this slot was also added to the Spectrum1 constructor. I will compare both C-functions closer; eventually something escaped me.

jorainer commented 8 years ago

Got it, finally. Wasn't aware that ScalarLogical, ScalarInteger etc calls in C create an R-object (now I am, thanks to http://adv-r.had.co.nz/C-interface.html). With lots of more PROTECT calls the Spectrum2 objects survive also eventual garbage collections.

lgatto commented 8 years ago

In addition:

> options(error = recover)
> tmp <- replicate(100, readMSData(f))
Error in (function (x)  : attempt to apply non-function

Enter a frame number, or 0 to exit   

 1: replicate(100, readMSData(f))
 2: sapply(integer(n), eval.parent(substitute(function(...) expr)), simplify = 
 3: lapply(X = X, FUN = FUN, ...)
 4: FUN(X[[i]], ...)
 5: readMSData(f)
 6: new("Spectrum2", scanIndex = as.integer(hd$seqNum), merged = as.numeric(hd$
 7: initialize(value, ...)
 8: initialize(value, ...)
 9: .local(.Object, ...)
10: callNextMethod(.Object, ...)
11: eval(call, callEnv)
12: eval(expr, envir, enclos)
13: .nextMethod(.Object, ...)
14: .local(.Object, ...)
15: callNextMethod(.Object, ...)
16: addNextMethod(method, f, envir = methodEnv)
17: addNextMethod(method, f, envir = methodEnv)
18: .findNextFromTable(method, f, optional, envir)
19: .findInheritedMethods(defined, fdef, mtable = NULL, excluded = excluded)
20: mget(labels[found], table)
21: (function (x) 
x$.self$finalize())(<environment>)

Selection: 

I am looking into this.

lgatto commented 8 years ago

Unfortunately, I still seem to see problems running a similar test as before. I use a modified equalMSnExps that calls browser() when spectra are different

> MSnbase:::equalMSnExps
function(inmem, ondisk, ...) {
    if ((l <- length(inmem)) != length(ondisk))
        return("Objects have different lengths.")
    if (inherits(ondisk, "OnDiskMSnExp"))
        ondisk <- spectra(ondisk)
    if (inherits(inmem, "OnDiskMSnExp"))
        inmem <- spectra(inmem)
    ## otherwise, expected to be a list of spectra
    for (i in seq_len(l)) {
        if (!isTRUE(speq <- all.equal(ondisk[[i]], inmem[[i]], ...))){
            browser()
            return(paste("Spectra at position", i, "are different."))
        }
    }
    return(TRUE)
}
<environment: namespace:MSnbase>

And then I run the unit test that failed:

> test_dir("~/dev/00_github/MSnbase/tests/testthat/", filter = "MSnExpF")
MSnExp filter functions: ..........................................Called from: equalMSnExps(sp1, sp2, ...)
Browse[1]> debug at /tmp/methods-all.equal.R!14871nwX#12: return(paste("Spectra at position", i, "are different."))
Browse[2]> ls()
[1] "i"      "inmem"  "l"      "ondisk" "speq"  
Browse[2]> ondisk[[i]]
Object of class "Spectrum1"
 Retention time: 18:28 
 MSn level: 1 
 Total ion count: 4764 
 Polarity: -1 
Browse[2]> inmem[[i]]
Object of class "Spectrum1"
 Retention time: 18:28 
 MSn level: 1 
 Total ion count: 43527 
 Polarity: -1 
Browse[2]> all.equal(inmem[[i]], ondisk[[i]])
[1] "Attributes: < Component “intensity”: Numeric: lengths (43527, 4764) differ >"
[2] "Attributes: < Component “mz”: Numeric: lengths (43527, 4764) differ >"       
[3] "Attributes: < Component “peaksCount”: Mean relative difference: 0.8905507 >" 
Browse[2]> all.equal(inmem[[i-1]], ondisk[[i-1]])
Error in inmem[[i - 1]] (from methods-all.equal.R!14871nwX#12) : 
  attempt to select less than one element in get1index <real>
Browse[2]> i
[1] 1
Browse[2]> all.equal(inmem[[i+1]], ondisk[[i+1]])
[1] "Attributes: < Component “intensity”: Numeric: lengths (2300, 104) differ >" 
[2] "Attributes: < Component “mz”: Numeric: lengths (2300, 104) differ >"        
[3] "Attributes: < Component “peaksCount”: Mean relative difference: 0.9547826 >"
Browse[2]> validObject(inmem[[i]])
[1] TRUE
Browse[2]> validObject(ondisk[[i]])
[1] TRUE
Browse[2]> str(ondisk[[i]])
Formal class 'Spectrum1' [package "MSnbase"] with 13 slots
  ..@ msLevel          : int 1
  ..@ peaksCount       : int 4764
  ..@ rt               : num 1108
  ..@ acquisitionNum   : int 1003
  ..@ scanIndex        : int 1
  ..@ tic              : num 1.38e+08
  ..@ mz               : num [1:4764] 500 500 500 500 500 ...
  ..@ intensity        : num [1:4764] 0 0 0 0 1433 ...
  ..@ fromFile         : int 1
  ..@ centroided       : logi NA
  ..@ smoothed         : logi NA
  ..@ polarity         : int -1
  ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slot
  .. .. ..@ .Data:List of 2
  .. .. .. ..$ : int [1:3] 0 4 0
  .. .. .. ..$ : int [1:3] 0 2 0
Browse[2]> str(inmem[[i]])
Formal class 'Spectrum1' [package "MSnbase"] with 13 slots
  ..@ msLevel          : int 1
  ..@ peaksCount       : int 43527
  ..@ rt               : num 1108
  ..@ acquisitionNum   : int 1003
  ..@ scanIndex        : int 1
  ..@ tic              : num 1.38e+08
  ..@ mz               : num [1:43527] 400 400 400 400 400 ...
  ..@ intensity        : num [1:43527] 0 0 0 0 0 ...
  ..@ fromFile         : int 1
  ..@ centroided       : logi NA
  ..@ smoothed         : logi NA
  ..@ polarity         : int -1
  ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slot
  .. .. ..@ .Data:List of 2
  .. .. .. ..$ : int [1:3] 0 4 0
  .. .. .. ..$ : int [1:3] 0 2 0

To check which one is correct, I open the file again, and check the first spectrum

Browse[2]> tmp = readMSData2(msdata::proteomics(full = T), msLevel. = 1L)
Browse[2]> tmp[[1]]
Object of class "Spectrum1"
 Retention time: 18:28 
 MSn level: 1 
 Total ion count: 43527 
 Polarity: -1 
Browse[2]> ondisk[[1]]
Object of class "Spectrum1"
 Retention time: 18:28 
 MSn level: 1 
 Total ion count: 4764 
 Polarity: -1 
Browse[2]> inmem[[1]]
Object of class "Spectrum1"
 Retention time: 18:28 
 MSn level: 1 
 Total ion count: 43527 
 Polarity: -1 
Browse[2]> 

@jotsetung - could it be that you need to do the same modifications to Spectrum1?

jorainer commented 8 years ago

I did also fix the Spectrum1 constructors. Also, I did 2000 iterations of spectra calls on OnDiskMSnExps containing either only MS1 or MS2 spectra and checked the validity of all spectra. Since I didn't get any more errors after the addition of multiple PROTECT calls I was pretty confident that the OnDiskMSnExp is OK now.

I'll repeat that with the comparison for MSnExp and OnDiskMSnExp; let's see where the bug is.

lgatto commented 8 years ago

Actually, this looks like an actual wrong unit test. Looking into it now.

lgatto commented 8 years ago

Actually, this looks like an actual wrong unit test. Looking into it now.

I confirm, the error above was a wrong test - will push now.

lgatto commented 8 years ago

Crossing my fingers now that the package checks cleanly!

lgatto commented 8 years ago

So, I have been checking the Spectrum initialisation method, fixed if with respect to #145, done additional fixes in readMSData and reverted some bits code as it was originally. I am now running thousands of iterations of readMSData (with an older version, to confirm that the problems did not exist before and the current b35f23f4f65c1d256ab45f739f38395c03e73499).

I have still observed random failures. Here's one

> tmp <- replicate(50, readMSData2(f, msLevel. = 2, verbose = FALSE))
Error in (function (x)  : attempt to apply non-function

Enter a frame number, or 0 to exit   

 1: replicate(50, readMSData2(f, msLevel. = 2, verbose = FALSE))
 2: sapply(integer(n), eval.parent(substitute(function(...) expr)), simplify = 
 3: lapply(X = X, FUN = FUN, ...)
 4: FUN(X[[i]], ...)
 5: readMSData2(f, msLevel. = 2, verbose = FALSE)
 6: filterMsLevel(res, msLevel.)
 7: filterMsLevel(res, msLevel.)
 8: .local(object, ...)
 9: object[msLevel(object) %in% msLevel.]
10: object[msLevel(object) %in% msLevel.]
11: .local(x, i, j, ..., drop)
12: `pData<-`(`*tmp*`, value = list(sampleNames = 1))
13: `pData<-`(`*tmp*`, value = list(sampleNames = 1))
14: initialize(object, data = value, varMetadata = varMetadata)
15: initialize(object, data = value, varMetadata = varMetadata)
16: .local(.Object, ...)
17: callNextMethod(.Object, data = data, varMetadata = varMetadata, ...)
18: eval(call, callEnv)
19: eval(expr, envir, enclos)
20: .nextMethod(.Object, data = data, varMetadata = varMetadata, ...)
21: .local(.Object, ...)
22: callNextMethod(.Object, ...)
23: addNextMethod(method, f, envir = methodEnv)
24: addNextMethod(method, f, envir = methodEnv)
25: .findNextFromTable(method, f, optional, envir)
26: .findInheritedMethods(defined, fdef, mtable = NULL, excluded = excluded)
27: .newSignature(classes, fdef@signature)
28: .asS4(structure(as.character(classes)[i], class = .signatureClassName, name
29: asS4(object, TRUE, 0)
30: structure(as.character(classes)[i], class = .signatureClassName, names = as
31: as.character(classes)
32: as.character.default(classes)
33: as.vector(x, "character")
34: (function (x) 
x$.self$finalize())(<environment>)

I am running more tests as above with readMSData. I haven't had the equivalent as above (just a matter of patience, I'm sure), but got (among others)

[...]
19: test_equalMSnExps.R#5: readMSData(f, msLevel = 1, verbose = FALSE, centroid
20: new("Spectrum1", rt = hd$retentionTime, acquisitionNum = as.integer(hd$acqu
21: initialize(value, ...)
22: initialize(value, ...)
23: .local(.Object, ...)
24: callNextMethod(.Object, ..., mz = mz[o], intensity = intensity[o], peaksCou
25: eval(call, callEnv)
26: eval(expr, envir, enclos)
27: .nextMethod(.Object, ..., mz = mz[o], intensity = intensity[o], peaksCount 
28: .local(.Object, ...)
29: callNextMethod(.Object, ...)
30: addNextMethod(method, f, envir = methodEnv)
31: addNextMethod(method, f, envir = methodEnv)
32: .findNextFromTable(method, f, optional, envir)
33: .findInheritedMethods(defined, fdef, mtable = NULL, excluded = excluded)
34: .newSignature(classes, fdef@signature)
35: .asS4(structure(as.character(classes)[i], class = .signatureClassName, name
36: asS4(object, TRUE, 0)
37: structure(as.character(classes)[i], class = .signatureClassName, names = as
38: as.character(classes)
39: as.character.default(classes)
40: as.vector(x, "character")
41: (function (x) 
x$.self$finalize())(<environment>)

when running source("testthat.R").

There is definitely progress (less failures), but it's not fixed yet and I am not sure where the actual problem is. I will continue and investigate readMSData and initialize,Spectrum.

@jotsetung - could you run additional tests with the latest commit.

jorainer commented 8 years ago

OK, I'll torture the functions too.

jorainer commented 8 years ago

Intermediate update: I'm using different files in my tests, the TMT_Erwinia, the two MS1-only files MM14.mzML and MM8.mzML from msdata and one of my (rather large) MS1-only files. So, in each iteration I call 3 times readMSData to read each of these test files/sets and call validObject on each. So far I got 1 attempt to apply non-function in 500 iterations. I'll try to do some other modifications in the constructor hoping to fix that too.

lgatto commented 8 years ago

I think there was a bug in readMSData2. There was

msdata <- mzR::openMSfile(f)
on.exit(mzR::close(msdata))

at the beginning of the files loop. There should be call to open at the beginning and close at the end of the loop, rather than close on exit of the function.

I don't know if this is really the reason for these issues, or if it was a problem at all.

lgatto commented 8 years ago

I'm using different files in my tests

Could it be related to a file? Could you try it for the different file sets one after the other?

lgatto commented 8 years ago

Also, it seems that repeating readMSData/readMSData2 is not necessarily the best way to get the error. Somehow, it pops up much more often when running source("testthat.R"), as if it was a more complex set of calls that triggered the problem.

jorainer commented 8 years ago

I wanted to run it on different files to see if it was file related; apparently it is not, as I got the error for each of them.

jorainer commented 8 years ago

I will dig again into the depths of the C-constructors; now I'm getting unexpected stuff there too; although the readMSData problems should be independent of that.

lgatto commented 8 years ago

now I'm getting unexpected stuff there too; although the readMSData problems should be independent of that.

That's what I was thinking too, but now, I start doubting everything :-(

jorainer commented 8 years ago

I can't believe it! Presently I get only one error:

1. Failure: bin_Spectrum - bug fix #ecaaa324505b17ee8c4855806f7e37f14f1b27b8 (@test_Spectrum.R#188) 

Will run some more check runs, and if I don't get any more of these other nasty errors I push my changes! I see light at the end of the tunnel!

jorainer commented 8 years ago

looks good (commit https://github.com/lgatto/MSnbase/commit/5439d5d0d2f78fdde71ea723cfafa4cce9788d8a, https://github.com/lgatto/MSnbase/commit/4627a6f263180011ab14e9e1bf70d54ae4c5c8b0) @lgatto can you please check?

jorainer commented 8 years ago

I did run some more tests, so far no errors!!! What I believe fixed the bugs:

lgatto commented 8 years ago

Thanks @jotsetung - could it boil down to a bug in R in the initialize method dispatch? I might do one more test next week, but let's proceed with your implementation for now. I've bumped the version in master to 1.99.0 to celebrate.

jorainer commented 8 years ago

@lgatto : yes, I think it has to do with the initialize method, I believe with the .Object, ..., mz, intensity. In the past I did from time to time get some strange errors with failing dispatch when the generic had ... and me implementing a method with ... and additional arguments. Could be a current problem in R (I hope); so in future I hope we can drop the C-level constructors in the initialize.

lgatto commented 8 years ago

Oh dear, @jotsetung

MSnExp filter functions: ..........................1Error in signalCondition(e) (from test_MSnExpFilters.R#88) : 
  no function to return from, jumping to top level
2Error in signalCondition(e) (from test_MSnExpFilters.R#88) : 
  no function to return from, jumping to top level
3Error in signalCondition(e) (from test_MSnExpFilters.R#88) : 
  no function to return from, jumping to top level
4Error in signalCondition(e) (from test_MSnExpFilters.R#88) : 
  no function to return from, jumping to top level
5Error in signalCondition(e) (from test_MSnExpFilters.R#88) : 
  no function to return from, jumping to top level

with 3d7fd478392fcfda9dafced11dff6976fb881f26

lgatto commented 8 years ago

Hopefully a false alarm.