HenrikBengtsson / affxparser

🔬 R package: This is the Bioconductor devel version of the affxparser package.
http://bioconductor.org/packages/devel/bioc/html/affxparser.html

WISH: Read Axiom CEL files #20

Open HenrikBengtsson opened 8 years ago

HenrikBengtsson commented 8 years ago

The Affymetrix Axiom technology differs substantially from classical Affymetrix microarrays. Because of this, the affxparser package cannot read CEL files produced by this technology. At this point it is not clear to me (HB) whether the underlying Affymetrix Fusion SDK, which affxparser uses, can even read these Axiom CEL files.

Affymetrix has made Exome319 CEL files available for 90 Yoruba individuals. Trying to read one of these files using affxparser currently gives a parse error:

> library("affxparser")
> hdr <- readCelHeader("NA18500_YRI_Axiom_Exome319.CEL")
Error in readCelHeader("NA18500_YRI_Axiom_Exome319.CEL") :
  [affxparser Fusion SDK exception] Failed to parse header of CEL file: NA18500_YRI_Axiom_Exome319.CEL
HenrikBengtsson commented 8 years ago

BTW, affxparser can indeed read Axiom CDF files, e.g.

> pathname <- "annotationData/chipTypes/Axiom_Exome319/Axiom_Exome319,r1.cdf"
> hdr <- readCdfHeader(pathname)
> str(hdr)
List of 12
 $ ncols      : int 992
 $ nrows      : int 992
 $ nunits     : int 404099
 $ nqcunits   : int 4
 $ refseq     : chr ""
 $ chiptype   : chr "Axiom_Exome319"
 $ filename   : chr "annotationData/chipTypes/Axiom_Exome319/Axiom_Exome319,r1.cdf"
 $ rows       : int 992
 $ cols       : int 992
 $ probesets  : int 404099
 $ qcprobesets: int 4
 $ reference  : chr ""

> names <- readCdfUnitNames(pathname)
> str(names)
 chr [1:404099] "AFFX-KIT-000001" "AFFX-KIT-000002" "AFFX-KIT-000003" ...

> data <- readCdfUnits(pathname, units=c(1, 1024, 432100))
> data <- readCdfUnits(pathname, units=200100)
> str(data)
List of 1
 $ AX-82962479:List of 3
  ..$ type     : int 9
  ..$ direction: int 1
  ..$ groups   :List of 2
  .. ..$ NONE:List of 6
  .. .. ..$ x        : int [1:2] 262 986
  .. .. ..$ y        : int [1:2] 64 102
  .. .. ..$ pbase    : chr [1:2] "c" "c"
  .. .. ..$ tbase    : chr [1:2] "g" "g"
  .. .. ..$ expos    : int [1:2] 16 16
  .. .. ..$ direction: int 1
  .. ..$ NONE:List of 6
  .. .. ..$ x        : int [1:2] 262 986
  .. .. ..$ y        : int [1:2] 64 102
  .. .. ..$ pbase    : chr [1:2] "c" "c"
  .. .. ..$ tbase    : chr [1:2] "g" "g"
  .. .. ..$ expos    : int [1:2] 16 16
  .. .. ..$ direction: int 1
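
As a minimal sketch of how such a unit structure can be inspected with base R alone: the helper below counts the probes in each group of each unit, assuming the nested list layout shown in the `str(data)` output above. The name `count_probes_per_group` is hypothetical and not part of affxparser, and `mock_units` is a hand-built stand-in that mirrors the output above so the sketch runs without the CDF file.

```r
# Hypothetical helper (not part of affxparser): count the probes in each
# group of each unit, assuming the list layout shown by str(data) above.
count_probes_per_group <- function(units) {
  lapply(units, function(unit) {
    # Each group stores one x coordinate per probe
    vapply(unit$groups, function(group) length(group$x), integer(1))
  })
}

# Hand-built stand-in mirroring the str() output above (assumption:
# the real object returned by readCdfUnits() has this same shape)
group <- list(
  x = c(262L, 986L), y = c(64L, 102L),
  pbase = c("c", "c"), tbase = c("g", "g"),
  expos = c(16L, 16L), direction = 1L
)
mock_units <- list(
  "AX-82962479" = list(type = 9L, direction = 1L,
                       groups = list(NONE = group, NONE = group))
)

counts <- count_probes_per_group(mock_units)
print(counts)
```

With the real annotation file, the same helper could presumably be applied directly to the result of `readCdfUnits(pathname, units=200100)`.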
kasperdanielhansen commented 8 years ago

We're using the latest release of Fusion, but it is from 2011.

Best, Kasper

On Tue, Dec 15, 2015 at 10:21 AM, Henrik Bengtsson <notifications@github.com> wrote:

BTW, affxparser can indeed read Axiom CDF files, e.g. [...]

HenrikBengtsson commented 8 years ago

I forgot about the whole discussion about Axiom CEL files over in Issue #15.

For instance, Axiom CEL files can be parsed using the non-Fusion SDK function readCcg():

> library("affxparser")
> data <- readCel("NA18500_YRI_Axiom_Exome319.CEL")
Error in readCelHeader(filename) :
  [affxparser Fusion SDK exception] Failed to parse header of CEL file: ./NA18500_YRI_Axiom_Exome319.CEL
> data <- readCcg("NA18500_YRI_Axiom_Exome319.CEL")
> names(data)
[1] "fileHeader"        "genericDataHeader" "dataGroups"

Also, from one of the comments: "Fusion SDK has supported multi-channel CEL and CDF files since version 1.1 (October 2009)".
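
The workaround above (try `readCel()`, then fall back to `readCcg()`) could be wrapped roughly as follows. This is only a sketch: `read_cel_with_fallback` is a hypothetical name, not an affxparser function, and the two readers return differently structured objects, so the caller still has to handle each case. The demo uses stand-in reader functions and a placeholder file name so it runs without affxparser or an actual CEL file.

```r
# Hypothetical wrapper (not part of affxparser): try the Fusion SDK-based
# reader first and fall back to the generic CCG parser if it throws.
# The defaults are evaluated lazily, so affxparser is only needed when
# no replacement readers are supplied.
read_cel_with_fallback <- function(pathname,
                                   primary = affxparser::readCel,
                                   fallback = affxparser::readCcg) {
  tryCatch(
    list(reader = "primary", data = primary(pathname)),
    error = function(e) {
      message("Primary reader failed: ", conditionMessage(e))
      list(reader = "fallback", data = fallback(pathname))
    }
  )
}

# Demo with stand-in readers (assumptions for illustration only)
failing_reader <- function(pathname) {
  stop("Failed to parse header of CEL file: ", pathname)
}
generic_reader <- function(pathname) {
  list(fileHeader = list(), genericDataHeader = list(), dataGroups = list())
}

res <- read_cel_with_fallback("example.CEL",
                              primary = failing_reader,
                              fallback = generic_reader)
print(res$reader)
print(names(res$data))
```

Recording which reader succeeded (`res$reader`) makes it explicit downstream whether the result has the `readCel()` layout or the `readCcg()` layout.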

ghost commented 8 years ago

Hi, I'm working with an Axiom Human Origins Array, and I'm having serious problems dealing with the data. I need to normalize the data and remove a putative batch effect, but I can't. I'm trying to read the CEL files, but this is what I get:

> data <- readCcg(pathname)
Error in table[[1]] : subscript out of bounds

Do you know how I can solve this?

HenrikBengtsson commented 8 years ago

Leaving aside normalization etc., can you make the problematic Axiom CEL file available for download so I can look into the readCcg() parse issue?

ghost commented 8 years ago

I've temporarily uploaded a file here: http://dropcanvas.com/cwmo4

Thank you very much

HenrikBengtsson commented 8 years ago

Link doesn't work.