Closed arubio2 closed 8 years ago
Before worrying about conversion, do readCdfHeader() and readCdf() work on this file?
readCdfHeader() -> No problem. readCdf() -> Segmentation fault.
I can reproduce this on Windows as well. For the record, here are the details on this file:
> p <- "hta20_Hs_ENSG.cdf"
> pathname <- "hta20_Hs_ENSG.cdf"
> str(as.list(file.info(pathname)))
List of 7
$ size : num 3.34e+08
$ isdir: logi FALSE
$ mode :Class 'octmode' int 438
$ mtime: POSIXct[1:1], format: "2015-11-09 11:45:52"
$ ctime: POSIXct[1:1], format: "2016-04-01 23:22:17"
$ atime: POSIXct[1:1], format: "2016-04-01 23:22:17"
$ exe : chr "no"
> digest::digest(p, file=TRUE)
[1] "54da0300ae48837bc45e7927bed45dec"
It's a text-based CDF with header:
> str(affxparser::readCdfHeader(pathname))
List of 12
$ ncols : int 2680
$ nrows : int 2572
$ nunits : int 35321
$ nqcunits : int 0
$ refseq : chr ""
$ chiptype : chr "hta20_Hs_ENSG"
$ filename : chr "./hta20_Hs_ENSG.cdf"
$ rows : int 2572
$ cols : int 2680
$ probesets : int 35321
$ qcprobesets: int 0
$ reference : chr ""
It core dumps with:
> data <- readCdfUnits(pathname, units=1)
[core dump]
It also core dumps using the affyio package, e.g.
> data <- affyio::read.cdffile.list(pathname)
[core dump]
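Because both calls take down the whole R session, it can help to run each one in a throwaway child process and look only at its exit status. The following is a minimal sketch, assuming Rscript is installed under R.home("bin"); run_isolated() is a hypothetical helper name and is not part of affxparser or affyio.

```r
# Hypothetical helper (not part of affxparser/affyio): evaluate R code,
# given as a string, in a separate Rscript process so that a crash there
# cannot kill the current session.
run_isolated <- function(code) {
  rscript <- file.path(R.home("bin"), "Rscript")
  # With stdout/stderr discarded, system2() returns the child's exit status
  # (0 on success, non-zero on an error or a crash).
  status <- system2(rscript, c("--vanilla", "-e", shQuote(code)),
                    stdout = FALSE, stderr = FALSE)
  status
}

# Example usage (assumes affxparser is installed and the CDF is in the
# working directory):
# status <- run_isolated('affxparser::readCdfUnits("hta20_Hs_ENSG.cdf", units = 1)')
# if (status != 0) message("child R process failed with status ", status)
```

A non-zero status only says that the child failed; it does not distinguish a segfault from an ordinary R error, so the child's output would still need to be inspected to tell them apart.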
I suspect this CDF file has an invalid format or is corrupt in some sense, because neither affxparser nor affyio can read it, and they are completely different code bases.
We have seen similar problems before with CDFs of this chip type, cf. https://github.com/HenrikBengtsson/affxparser/issues/18.
I'll consider this a buggy CDF unless proven otherwise.
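One way to look for evidence either way is to compare what the header declares with what the file actually contains. The sketch below uses only base R and assumes the usual text CDF layout: INI-style sections named [Unit1], [Unit2], ... and [QC1], ..., with NumberOfUnits= and NumQCUnits= keys in the [Chip] section. check_text_cdf() is a hypothetical helper, not an affxparser function, and it reads the whole file into memory, which is slow for a file of this size.

```r
# Hypothetical helper: compare the counts declared in the [Chip] section of a
# text (ASCII) CDF with the [UnitN] and [QCn] sections actually present.
# Assumes the INI-style text CDF layout described above.
check_text_cdf <- function(pathname) {
  # trimws() also drops a trailing "\r" from Windows line endings
  lines <- trimws(readLines(pathname, warn = FALSE))

  # Return the integer value of the first "key=value" line, or NA if absent
  get_int <- function(key) {
    hit <- grep(paste0("^", key, "="), lines, value = TRUE)
    if (length(hit) == 0L) return(NA_integer_)
    as.integer(sub("^[^=]*=", "", hit[1L]))
  }

  c(
    declared_units = get_int("NumberOfUnits"),
    found_units    = sum(grepl("^\\[Unit[0-9]+\\]$", lines)),
    declared_qc    = get_int("NumQCUnits"),
    found_qc       = sum(grepl("^\\[QC[0-9]+\\]$", lines))
  )
}

# Example usage:
# check_text_cdf("hta20_Hs_ENSG.cdf")
```

Matching counts would not prove the file is valid, but a mismatch between declared and found sections would be a concrete sign of a malformed CDF.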
I tried to convert a Brainarray CDF to binary form and it segfaults. This happens on both Linux and macOS.
library(affxparser)
convertCdf("HTA_ASv3_hta20_Hs_ENSG.cdf", "HTA_ASv3_hta20_Hs_ENSG_bin.cdf")
And immediately:
 *** caught segfault ***
address (nil), cause 'memory not mapped'
Traceback:
 1: .Call("R_affx_get_cdf_file_qc", filename, as.integer(units), as.integer(verbose), returnIndices, returnXY, returnLength, returnPMInfo, returnBackgroundInfo, returnType, returnQcNumbers)
 2: readCdfQc(filename)
 3: convertCdf("HTA_ASv3_hta20_Hs_ENSG.cdf", "HTA_ASv3_hta20_Hs_ENSG_bin.cdf")
The CDF was downloaded from http://mbni.org/customcdf/20.0.0/ensg.download/hta20_Hs_ENSG_20.0.0.zip
The sessionInfo is:
R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS release 6.7 (Final)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] affxparser_1.42.0
Best regards,
Angel