HenrikBengtsson / affxparser

🔬 R package: This is the Bioconductor devel version of the affxparser package.
http://bioconductor.org/packages/devel/bioc/html/affxparser.html

CORE DUMP: readCel and readCelHeader fail on 2-channel Axiom CEL file #15

Closed kmclough closed 9 years ago

kmclough commented 9 years ago

I'm running affxparser 1.36.0 on Mac OS X 10.9 (Mavericks). The following call crashes my R interpreter: cel.data = readCel('Axiom_2_channel_test.CEL'). The stack trace from the main thread is shown below. What's unusual about this CEL file is that it contains 2-channel data (from an Axiom array). I've placed the CEL file at: ftp://ftp.llnl.gov/outgoing/mcloughlin2/Axiom_2_channel_test.CEL (note that it will go away in a couple of days, so you should copy it somewhere safe).

It may be relevant that, while apt-cel-extract works fine with this file (generating intensities for all probes on both channels), apt-cel-convert does not; it fails with the same exception (affymetrix_calvin_exceptions::DataGroupNotFoundException) as the affxparser code. So it may be an issue with the Fusion SDK.

Here's the stack trace:

Process: R [12944]
Path: /Applications/R.app/Contents/MacOS/R
Identifier: org.R-project.R
Version: R 3.1.0 GUI 1.64 Mavericks build (6734)
Code Type: X86-64 (Native)
Parent Process: launchd [515]
Responsible: R [12944]
User ID: 55683

Date/Time: 2015-05-05 15:22:11.755 -0700
OS Version: Mac OS X 10.9.5 (13F1077)
Report Version: 11
Anonymous UUID: B1024871-AE79-623C-05C9-F3E982E6A520

Sleep/Wake UUID: 209F9332-3354-4101-A9AC-B5A221B0E40C

Crashed Thread: 0 Dispatch queue: com.apple.main-thread

Exception Type: EXC_CRASH (SIGABRT)
Exception Codes: 0x0000000000000000, 0x0000000000000000

Application Specific Information:
abort() called
terminating with uncaught exception of type affymetrix_calvin_exceptions::DataGroupNotFoundException

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0  libsystem_kernel.dylib  0x00007fff86f7f866 __pthread_kill + 10
1  libsystem_pthread.dylib  0x00007fff8d9d435c pthread_kill + 92
2  libsystem_c.dylib  0x00007fff89aaab1a abort + 125
3  libc++abi.dylib  0x00007fff93091f31 abort_message + 257
4  libc++abi.dylib  0x00007fff930b7952 default_terminate_handler() + 264
5  libobjc.A.dylib  0x00007fff8ec46322 _objc_terminate() + 124
6  libc++abi.dylib  0x00007fff930b51d1 std::terminate(void (*)()) + 8
7  libc++abi.dylib  0x00007fff930b4c5b __cxa_throw + 124
8  affxparser.so  0x000000010af7c080 affymetrix_calvin_io::GenericData::DataSet(std::__1::basic_string<wchar_t, std::__1::char_traits<wchar_t>, std::__1::allocator<wchar_t> > const&, std::__1::basic_string<wchar_t, std::__1::char_traits<wchar_t>, std::__1::allocator<wchar_t> > const&) + 2864
9  affxparser.so  0x000000010af11a45 affymetrix_calvin_io::CelFileData::PrepareOutlierPlane() + 277
10 affxparser.so  0x000000010af133ef affymetrix_calvin_io::CelFileData::GetOutlierCoords(std::__1::vector<affymetrix_calvin_utilities::XYCoord, std::__1::allocator<affymetrix_calvin_utilities::XYCoord> >&) + 47
11 affxparser.so  0x000000010af93128 affymetrix_fusion_io::CalvinCELDataAdapter::GetNumOutliers() + 424
12 affxparser.so  0x000000010afa9c3c affymetrix_fusion_io::FusionCELData::GetNumOutliers() + 44
13 affxparser.so  0x000000010b11baa5 R_affx_extract_cel_file_meta + 8341
14 affxparser.so  0x000000010b11bebf R_affx_get_cel_file_header + 207
15 libR.dylib  0x00000001059e59f0 do_dotcall + 368 (dotcode.c:581)
16 libR.dylib  0x0000000105a16b6b Rf_eval + 1355 (eval.c:656)
17 libR.dylib  0x0000000105a24de5 do_set + 245 (eval.c:2029)
18 libR.dylib  0x0000000105a16c05 Rf_eval + 1509 (eval.c:628)
19 libR.dylib  0x0000000105a248f2 do_begin + 514 (Rinlinedfuns.h:95)
20 libR.dylib  0x0000000105a16c05 Rf_eval + 1509 (eval.c:628)
21 libR.dylib  0x0000000105a22290 Rf_applyClosure + 1600 (eval.c:1037)
22 libR.dylib  0x0000000105a16bbd Rf_eval + 1437 (eval.c:675)
23 libR.dylib  0x0000000105a24de5 do_set + 245 (eval.c:2029)
24 libR.dylib  0x0000000105a16c05 Rf_eval + 1509 (eval.c:628)
25 libR.dylib  0x0000000105a248f2 do_begin + 514 (Rinlinedfuns.h:95)
26 libR.dylib  0x0000000105a16c05 Rf_eval + 1509 (eval.c:628)
27 libR.dylib  0x0000000105a16c05 Rf_eval + 1509 (eval.c:628)
28 libR.dylib  0x0000000105a248f2 do_begin + 514 (Rinlinedfuns.h:95)
29 libR.dylib  0x0000000105a16c05 Rf_eval + 1509 (eval.c:628)
30 libR.dylib  0x0000000105a22290 Rf_applyClosure + 1600 (eval.c:1037)
31 libR.dylib  0x0000000105a16bbd Rf_eval + 1437 (eval.c:675)
32 libR.dylib  0x0000000105a24de5 do_set + 245 (eval.c:2029)
33 libR.dylib  0x0000000105a16c05 Rf_eval + 1509 (eval.c:628)
34 libR.dylib  0x0000000105a510c6 R_ReplDLLdo1 + 406 (main.c:362)
35 org.R-project.R  0x0000000105881b57 run_REngineRmainloop + 295
36 org.R-project.R  0x0000000105875f2a -[REngine runREPL] + 138
37 org.R-project.R  0x0000000105864edf main + 815
38 libdyld.dylib  0x00007fff8f1805fd start + 1

HenrikBengtsson commented 9 years ago

Thanks for the report. I can reproduce the core dump:

> library("affxparser")
> pathname <- "rawData/affxparser,problematic/Axiom/Axiom_2_channel_test.CEL"
> file.info(pathname)$size
[1] 29457060
> digest::digest(file=pathname)
[1] "4217a8522b3e3eccb6cb155dbbbafaff"
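
Since the FTP copy is temporary, anyone working from a saved copy may want to confirm it is the same file. A minimal sketch, assuming the checksum above is an MD5 (the default algorithm of digest::digest()); verify_file() is a hypothetical helper, not part of affxparser:

```r
# Hypothetical helper (not part of affxparser): check that a local file
# matches a reported size (in bytes) and MD5 checksum.
verify_file <- function(pathname, expected_size, expected_md5) {
  stopifnot(file.exists(pathname))
  actual_size <- file.info(pathname)$size
  # tools::md5sum() ships with R and returns a named character vector
  actual_md5 <- unname(tools::md5sum(pathname))
  c(size = isTRUE(actual_size == expected_size),
    md5  = identical(tolower(actual_md5), tolower(expected_md5)))
}

# Usage sketch with the values reported above:
# ok <- verify_file(pathname, 29457060, "4217a8522b3e3eccb6cb155dbbbafaff")
# all(ok)
```

Both elements of the result should be TRUE for an identical copy; a mismatch in either suggests a truncated or different file.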

readCel() uses readCelHeader() internally and already the latter core dumps:

> hdr <- readCelHeader(pathname)
terminate called after throwing an instance of
'affymetrix_calvin_exceptions::DataGroupNotFoundException'

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

ACTION: R / affxparser should never core dump, so the internal Fusion SDK exception (native C++ code) should at least bubble up to the R level as an error message. Never a core dump.
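
Until such a fix is in place, one possible way to protect an interactive session is to try the call in a separate R process first, so that an abort() in native code only kills the child. This is only a sketch, assuming a Unix-like system where Rscript is available under R.home("bin"); runs_cleanly_in_child() is a hypothetical helper, not an affxparser function:

```r
# Hypothetical helper: evaluate an R expression (given as a string) in a
# separate Rscript process and report whether that process exited cleanly.
# A native abort() then only terminates the child, not the calling session.
runs_cleanly_in_child <- function(expr) {
  rscript <- file.path(R.home("bin"), "Rscript")
  status <- system2(rscript, c("--vanilla", "-e", shQuote(expr)),
                    stdout = FALSE, stderr = FALSE)
  # system2() returns the exit status of the child; 0 means success
  identical(as.integer(status), 0L)
}

# Usage sketch (pathname as above):
# code <- sprintf('affxparser::readCelHeader("%s")', pathname)
# if (!runs_cleanly_in_child(code)) message("header could not be read safely")
```

This costs an extra R startup per probe, so it is a stopgap for affected versions rather than something to leave in place.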

HenrikBengtsson commented 9 years ago

Ok, now to your actual objective: affxparser has not been prepared to read Affymetrix multi-channel CEL and CDF files such as what the Axiom technology provides. I am not even sure (== I have no idea) whether the Affymetrix Fusion SDK library supports parsing/reading such files.

I doubt that any of us will have time to add affxparser support for such files. It'll require support for reading not only multi-channel CEL files, but also multi-channel CDF files.

However, what seems to work is using the readCcg() parsers. I wrote those mostly to learn about the Calvin (CCG) file format and use them as internal reference functions. But they might be enough for what you're trying to do:

> library("affxparser")
> pathname <- "rawData/affxparser,problematic/Axiom/Axiom_2_channel_test.CEL"
> hdr <- readCcgHeader(pathname)
> names(hdr)
[1] "filename"   "fileHeader" "dataHeader"

and

> data <- readCcg(pathname)
> names(data)
[1] "fileHeader"        "genericDataHeader" "dataGroups"

ACTION: We should document that affxparser does not support multi-channel Affymetrix CEL and CDF files.
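
For anyone taking the readCcg() route, the returned object is a nested list, and the layout below "dataGroups" is not shown here. A small sketch for exploring it without assuming that layout; list_paths() is a hypothetical helper, and base R's str(data, max.level = 2) gives a similar overview:

```r
# Hypothetical helper: list the element names of a nested list as
# "/"-separated paths, descending at most max_depth levels.
list_paths <- function(x, max_depth = 3L, prefix = "") {
  if (!is.list(x) || length(x) == 0L || max_depth < 1L) return(character(0))
  labels <- names(x)
  if (is.null(labels)) labels <- rep("", length(x))
  # Fall back to positional labels for unnamed elements
  unnamed <- is.na(labels) | !nzchar(labels)
  labels[unnamed] <- sprintf("[[%d]]", which(unnamed))
  paths <- character(0)
  for (i in seq_along(x)) {
    path <- if (nzchar(prefix)) paste(prefix, labels[i], sep = "/") else labels[i]
    paths <- c(paths, path, list_paths(x[[i]], max_depth - 1L, path))
  }
  paths
}

# Usage sketch:
# data <- readCcg(pathname)
# list_paths(data$dataGroups, max_depth = 3)
```

The paths it prints should make it easier to see which data groups and data sets hold the per-channel values in a given file.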

kmclough commented 9 years ago

Hi Henrik,

Thanks for looking into this. I’m working on a collaboration with Affymetrix involving the Axiom platform, so I think they would be pretty motivated to address this problem with the Fusion SDK if it doesn’t already support 2-channel files. Let me see if I can get what I need through the readCcg interface as well.

Thanks, Kevin

HenrikBengtsson commented 9 years ago

FYI, affxparser 1.41.2, available on Bioc devel, no longer core dumps; affxparser now catches Fusion SDK C++ exceptions and reports them as standard R errors instead, e.g.

> library("affxparser")
> pathname <- "rawData/affxparser,problematic/Axiom/Axiom_2_channel_test.CEL"

> hdr <- readCelHeader(pathname)
Error in readCelHeader(pathname) :
  [affxparser Fusion SDK exception] Failed to parse header of CEL file: rawData/
affxparser,problematic/Axiom/Axiom_2_channel_test.CEL

> hdr <- readCel(pathname)
Error in readCelHeader(filename) :
  [affxparser Fusion SDK exception] Failed to parse header of CEL file: rawData/
affxparser,problematic/Axiom/Axiom_2_channel_test.CEL
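
Because these are now ordinary R errors, they can be handled with tryCatch(). A minimal sketch of falling back to a second reader when the first one fails; read_with_fallback() is a hypothetical wrapper (not part of affxparser) that takes the two readers as function arguments:

```r
# Hypothetical wrapper: try a primary reader and, if it signals an R error,
# fall back to a second reader. Both readers are passed in as functions.
read_with_fallback <- function(pathname, primary, fallback) {
  tryCatch(
    list(value = primary(pathname), source = "primary", error = NULL),
    error = function(e) {
      # Keep the original error message so the failure is not hidden
      list(value = fallback(pathname), source = "fallback",
           error = conditionMessage(e))
    }
  )
}

# Usage sketch with affxparser >= 1.41.2:
# res <- read_with_fallback(pathname, affxparser::readCelHeader,
#                           affxparser::readCcgHeader)
# res$source
```

Note that readCelHeader() and readCcgHeader() return differently structured lists, so downstream code would need to check `source` before using `value`.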

HenrikBengtsson commented 9 years ago

@kmclough, I'm closing this issue; if you hear something about Fusion SDK supporting multi-channel CEL and CDF files, please open a new feature-request issue. However, unless it's very straightforward, I doubt we'll have time to add such support any time soon.

BTW, @kmclough, it's only by chance I realized who's "hiding" behind that GitHub username; may I suggest adding your full name to your GitHub profile? It's helpful on the receiving end. Cheers.

kasperdanielhansen commented 9 years ago

Great work.

Best, Kasper (Sent from my phone.)

kmclough commented 9 years ago

Hi Henrik and Kasper,

Well, not core-dumping is something of an improvement. FYI, the Fusion SDK has supported multi-channel CEL and CDF files since version 1.1 (October 2009). I haven’t looked at the affxparser source code, so I don’t know how complicated it would be to add this support.

However, the workaround Henrik suggested (using readCcg()) does work, so I’m going to use that in the short term. I suspect it doesn’t perform as well as a solution based on the Fusion SDK would, but it’s entirely good enough for my purposes. Thanks a lot for your help!

Kevin
