HenrikBengtsson / affxparser

🔬 R package: This is the Bioconductor devel version of the affxparser package.
http://bioconductor.org/packages/devel/bioc/html/affxparser.html

readPgf(): Validate argument 'indices' against header$probesets, iff available #5

HenrikBengtsson closed this issue 9 years ago

HenrikBengtsson commented 9 years ago

Some, but not all, PGF files have a header field header$probesets that specifies the number of probesets. If this field is available and holds a valid integer (cf. Issue #4), we should validate that no element of indices is out of range relative to it. Otherwise, we should validate after all requested probesets have been read, and give a warning if fewer probesets were read/available than requested.

> data <- readPgf("DroGene-1_0-st.pgf")
> str(data$header)
List of 15
 $ chip_type         : chr "DroGene-1_0-st"
 $ lib_set_name      : chr "DroGene-1_0-st"
 $ lib_set_version   : chr "r4"
 $ create_date       : chr "Mon Oct 15 16:34:36 PDT 2012"
 $ guid              : chr "28ac67b4-62f6-4028-dde0-5596fa61cd33"
 $ pgf_format_version: chr "1.0"
 $ num-cols          : chr "1190"
 $ num-rows          : chr "1190"
 $ probesets         : chr "176275"
...
> data <- readPgf("DroGene-1_0-st.pgf", indices=1:176276)
Warning message:
Argument 'indices' of readPgf() contained indices out of range [1,176275] which were ignored.
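
The header-based check proposed above could be sketched roughly as follows. This is only an illustration, not the affxparser implementation: parseProbesetCount() and filterIndices() are hypothetical helpers, and the sketch assumes the header is a plain list whose probesets field, when present, is a character string as in the str() output above.

```r
# Hypothetical helper (not part of affxparser): coerce header$probesets
# to a single non-negative integer, or return NA if missing or invalid.
parseProbesetCount <- function(header) {
  value <- header$probesets
  if (is.null(value) || length(value) != 1L) return(NA_integer_)
  # Accept digit-only strings; anything else counts as invalid
  if (!grepl("^[0-9]+$", value)) return(NA_integer_)
  # Values too large for an integer become NA (warning suppressed)
  suppressWarnings(as.integer(value))
}

# Hypothetical helper: drop out-of-range indices with a warning,
# but only when the header provides a usable probeset count.
filterIndices <- function(indices, header) {
  n <- parseProbesetCount(header)
  if (is.na(n)) return(indices)  # cannot validate before reading the body
  ok <- !is.na(indices) & indices >= 1L & indices <= n
  if (!all(ok)) {
    warning(sprintf("Argument 'indices' contained indices out of range [1,%d] which were ignored.", n))
    indices <- indices[ok]
  }
  indices
}
```

With header <- list(probesets = "176275"), calling filterIndices(1:176276, header) would drop the last index and signal the warning, whereas a header without the field leaves indices untouched for later validation.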

HenrikBengtsson commented 9 years ago

Done in branch feature/readPgf-header-coercion (commit 632f16c):

> data <- readPgf("DroGene-1_0-st.pgf", indices=176276);
Error in readPgfEnv(file, readBody = TRUE, indices = indices) :
  Argument 'indices' is out of range [1,176275]

Here indices is validated immediately after the PGF file header is read, because the header contains the field probesets. If the header lacks this field, the same validation is done only after the whole file has been parsed.
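
The control flow described above, failing fast when the header gives the count and validating late otherwise, might look roughly like the sketch below. It is not the code from commit 632f16c: validateIndices() and readWithValidation() are hypothetical names, and parseBody stands in for whatever function actually parses the probesets.

```r
# Hypothetical helper: throw an error if any index is NA or outside [1, max].
validateIndices <- function(indices, max) {
  if (any(is.na(indices)) || any(indices < 1L | indices > max)) {
    stop(sprintf("Argument 'indices' is out of range [1,%d]", max))
  }
  invisible(indices)
}

# Hypothetical sketch of the two-stage validation.
# 'header' is a list; 'parseBody' is a function returning all probesets.
readWithValidation <- function(header, parseBody, indices) {
  n <- suppressWarnings(as.integer(header$probesets))
  early <- (length(n) == 1L && !is.na(n))
  # Stage 1: the header gives the count, so fail before the costly parse
  if (early) validateIndices(indices, n)
  probesets <- parseBody()
  # Stage 2: no usable count in the header, so validate after parsing
  if (!early) validateIndices(indices, length(probesets))
  probesets[indices]
}
```

The benefit of the early check is that an out-of-range request is rejected without parsing the body at all; the late check gives the same error message, only after the full parse.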