HenrikBengtsson / affxparser

🔬 R package: This is the Bioconductor devel version of the affxparser package.
http://bioconductor.org/packages/devel/bioc/html/affxparser.html

convertCel() gives "Error in sprintf("GridCorner%s=%d %d\n" ... invalid format '%d' ...) #11

HenrikBengtsson closed 9 years ago

HenrikBengtsson commented 9 years ago

Reproducible error reported by user:

> convertCel("GSM1060628_1B_ML_Cindy_9-8-10.CEL", "foo.cel")
Error in sprintf("GridCorner%s=%d %d\n", ff, aParams[[xkey]][1], aParams[[ykey]][1]) :
  invalid format '%d'; use format %f, %e, %g or %a for numeric objects

> traceback()
7: sprintf("GridCorner%s=%d %d\n", ff, aParams[[xkey]][1], aParams[[ykey]][1])
6: .getCelHeaderV3(header)
5: .getCelHeaderV4(header)
4: createCel(outFilename, header = hdr, overwrite = FALSE, verbose = verbose2)
3: withCallingHandlers(expr, warning = function(w) invokeRestart("muffleWarning"))
2: suppressWarnings({
       pathname <- createCel(outFilename, header = hdr, overwrite = FALSE,
           verbose = verbose2)
   })
1: convertCel("GSM1060628_1B_ML_Cindy_9-8-10.CEL", "foo.cel")

This CEL file can be downloaded as:

library("R.utils")
path <- "ftp://ftp.ncbi.nlm.nih.gov/geo/samples/GSM1060nnn/GSM1060628/suppl"
url <- file.path(path, "GSM1060628_1B_ML_Cindy_9-8-10.CEL.gz")
pathname <- gunzip(downloadFile(url))

HenrikBengtsson commented 9 years ago

Minimal reproducible example:

> library("affxparser")
> hdr <- readCcgHeader("GSM1060628_1B_ML_Cindy_9-8-10.CEL")
Warning message:
In readBin(con, what = integer(), size = 4, signed = FALSE, endian = "big",  :
  'signed = FALSE' is only valid for integers of sizes 1 and 2
> hdr3 <- affxparser:::.getCelHeaderV3(hdr)
Error in sprintf("GridCorner%s=%d %d\n", ff, aParams[[xkey]][1], aParams[[ykey]][1]) :
  invalid format '%d'; use format %f, %e, %g or %a for numeric objects

HenrikBengtsson commented 9 years ago

Turns out that the Calvin CEL header stores the "problematic" parameters as floats (not as integers, which is what affxparser expects), e.g.

> value <- hdr$dataHeader$parameters[["affymetrix-algorithm-param-GridULX"]][1]
> str(value)
 num 460

which gives:

> sprintf("value=%d", value)
Error in sprintf("value=%d", value) :
  invalid format '%d'; use format %f, %e, %g or %a for numeric objects

However, sprintf() does not give this error when the value is exactly an integer, e.g. sprintf("value=%d", 460.0). This is documented in help("sprintf", package="base") as:

Numeric variables with exactly integer values will be coerced to integer.
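
To make the documented behavior concrete, here is a minimal sketch in plain base R (no affxparser needed). The value `460 + 3e-05` is only an assumed stand-in for a double that is close to, but not exactly, a whole number.

```r
# A double holding an exact whole number is coerced, so '%d' works
ok <- sprintf("value=%d", 460.0)
print(ok)

# A double that is slightly off a whole number is rejected by '%d'
# (460 + 3e-05 is an assumed stand-in, not the value read from the CEL file)
almost <- 460 + 3e-05
msg <- tryCatch(
  sprintf("value=%d", almost),
  error = function(e) conditionMessage(e)
)
print(msg)
```

The first call returns a formatted string, while the second one ends up in the error handler with the same "invalid format '%d'" message as in the report.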

Looking at our value above:

> str(value)
 num 460
> value == 460
[1] FALSE
> value - 460
[1] 3.051758e-05

So, our parsed value is not an exact integer, probably because it is read as a 4-byte float, which is then coerced to a double in R.
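
A quick sanity check of the 4-byte float explanation, as a sketch using only base R: the spacing between adjacent single-precision values near 460 is 2^-15, which matches the difference printed above. This assumes standard IEEE 754 single precision (23 fraction bits); it does not show how the file itself is parsed.

```r
# Spacing between adjacent single-precision floats in [256, 512):
# 23 fraction bits and exponent floor(log2(460)) = 8
ulp <- 2^(floor(log2(460)) - 23)
print(ulp)  # 3.051758e-05

# 460 plus one such step survives a round trip through 4 bytes unchanged,
# i.e. it is exactly representable as a 4-byte float
x <- 460 + ulp
x4 <- readBin(writeBin(x, raw(), size = 4), what = "numeric", size = 4)
print(identical(x, x4))
print(x4 == 460)
```

So the observed value looks like the single-precision neighbor just above 460, which a double can hold exactly but which is not a whole number.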

HenrikBengtsson commented 9 years ago

Fixed and verified that:

> affxparser::convertCel("GSM1060628_1B_ML_Cindy_9-8-10.CEL", "foo.cel")

doesn't give an error and outputs a valid v4/XDA CEL file.

Package passed R CMD check.
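
For readers hitting the same `sprintf()` error elsewhere, one way to guard against it is sketched below. This is not necessarily the change that was made in affxparser; `formatGridCorner()` is a hypothetical helper, and the corner label "UL" and the y value 46 are assumptions for illustration.

```r
# Hypothetical helper, not part of affxparser
formatGridCorner <- function(ff, x, y) {
  # Round to the nearest whole number first, then coerce to integer,
  # so that '%d' is always given a valid integer
  xi <- as.integer(round(x))
  yi <- as.integer(round(y))
  sprintf("GridCorner%s=%d %d\n", ff, xi, yi)
}

# A value that is one single-precision step above 460 now formats fine
line <- formatGridCorner("UL", 460 + 2^-15, 46)
cat(line)
```

Rounding before `as.integer()` matters because `as.integer()` on its own truncates toward zero, so a value just below a whole number would otherwise be off by one.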