HenrikBengtsson / illuminaio

🔬 R package: This is the Bioconductor devel version of the illuminaio package.
http://bioconductor.org/packages/devel/bioc/html/illuminaio.html

readIDAT(): Rename some of the 'Unknown.N' fields that are now known #22

Open HenrikBengtsson opened 2 years ago

HenrikBengtsson commented 2 years ago

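I found [2]; it seems that they've identified/decided on what some of the "unknown" fields are. Specifically, we could rename the fields as annotated with `##` comments in the examples below, e.g. `MostlyNull` to Manifest, `MostlyA` to Label (or Stripe), and `Unknown.2` to SampleID.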

Examples

Package IlluminaDataTestFiles

> library(illuminaio)
> file <- system.file("extdata", "idat", "4019585376_B_Red.idat", package = "IlluminaDataTestFiles")
> idat <- illuminaio::readIDAT(file)
> str(idat$Unknowns)
List of 7
 $ MostlyNull: chr "HumanCNV370-Duo_v1-0_11246591_C.bpm"  ## Manifest
 $ MostlyA   : chr "B"  ## Label or Stripe
 $ Unknown.1 : chr ""
 $ Unknown.2 : chr ""
 $ Unknown.3 : chr ""
 $ Unknown.4 : chr ""
 $ Unknown.5 : chr ""
> library(illuminaio)
> file <- system.file("extdata", "idat", "5723646052_R02C02_Grn.idat", package = "IlluminaDataTestFiles")
> idat <- illuminaio::readIDAT(file)
> str(idat$Unknowns)
List of 9
 $ MostlyNull: chr ""
 $ MostlyA   : chr "R02C02"  ## Label or Stripe
 $ Unknown.1 : chr ""
 $ Unknown.6 : int [1:2] 1 0
 $ Unknown.2 : chr ""
 $ Unknown.3 : chr ""
 $ Unknown.4 : chr ""
 $ Unknown.5 : chr ""
 $ Unknown.7 : chr ""

The other IDAT files in IlluminaDataTestFiles have zero Unknowns, e.g.

> file <- system.file("extdata", "idat", "4343238080_A_Grn.idat", package = "IlluminaDataTestFiles")
> idat <- illuminaio::readIDAT(file)
> str(idat$Unknowns)
 NULL

Package minfiDataEPIC

> library(illuminaio)
> files <- dir(system.file("extdata/200144450019", package = "minfiDataEPIC"), full.names = TRUE)
> idat <- readIDAT(files[1])
> str(idat$Unknowns)
List of 9
 $ MostlyNull: chr ""  ## Manifest
 $ MostlyA   : chr "R07C01"  ## Label or Stripe
 $ Unknown.1 : chr ""  ## OPA
 $ Unknown.6 : int [1:2] 1 0
 $ Unknown.2 : chr "WG0006776-BCDG03_NA12878"  ## SampleID
 $ Unknown.3 : chr ""  ## Description
 $ Unknown.4 : chr "WG0006776-BCD"  ## Plate
 $ Unknown.5 : chr "G03"  ## Well
 $ Unknown.7 : chr ""
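
The `##` annotations in the listings above imply a mapping from the current names to the proposed ones. A minimal sketch of such a rename, applied to a plain list shaped like `idat$Unknowns`, could look as follows. The helper `rename_unknowns()`, the `field_map` vector, and the choice of `Label` for the field annotated "Label or Stripe" are assumptions made for illustration; none of them exist in illuminaio.

```r
## Hypothetical mapping (old name -> new name), taken from the annotations
## in this issue. Not part of illuminaio.
field_map <- c(
  MostlyNull = "Manifest",
  MostlyA    = "Label",        # annotated "Label or Stripe"; "Label" is an assumption
  Unknown.1  = "OPA",
  Unknown.2  = "SampleID",
  Unknown.3  = "Description",
  Unknown.4  = "Plate",
  Unknown.5  = "Well"
)

## Rename the fields listed in `map`; leave all other fields untouched.
rename_unknowns <- function(unknowns, map = field_map) {
  if (is.null(unknowns)) return(NULL)   # some IDAT files have no Unknowns
  old <- names(unknowns)
  hit <- old %in% names(map)
  names(unknowns)[hit] <- unname(map[old[hit]])
  unknowns
}

## Stand-in for idat$Unknowns from the minfiDataEPIC example above
unknowns <- list(
  MostlyNull = "", MostlyA = "R07C01", Unknown.1 = "",
  Unknown.6 = c(1L, 0L), Unknown.2 = "WG0006776-BCDG03_NA12878",
  Unknown.3 = "", Unknown.4 = "WG0006776-BCD", Unknown.5 = "G03",
  Unknown.7 = ""
)
renamed <- rename_unknowns(unknowns)
str(renamed)
```

Fields without a known meaning (`Unknown.6`, `Unknown.7`) keep their current names and positions in this sketch.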

References

  1. https://code.google.com/p/glu-genetics/source/browse/glu/lib/illumina.py#86 (original reference, no longer available)
  2. https://github.com/bioinformed/glu-genetics/blob/dcbbbf67a308d35e157b20a9c76373530510379a/glu/lib/illumina.py#L44-L61

HenrikBengtsson commented 2 years ago

For backward compatibility reasons, we probably want to keep the current Unknown.N fields as copies of the new renamed fields.
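
One way to keep the old names as copies is to append them after the renamed fields. The sketch below works on a plain list standing in for an already-renamed `idat$Unknowns`. The helper `add_legacy_copies()`, the `legacy_map` vector, and the new field names used here are hypothetical and only illustrate the idea.

```r
## Hypothetical mapping (new name -> legacy name). Not part of illuminaio.
legacy_map <- c(
  Manifest    = "MostlyNull",
  Label       = "MostlyA",     # "Label" is an assumed name for "Label or Stripe"
  OPA         = "Unknown.1",
  SampleID    = "Unknown.2",
  Description = "Unknown.3",
  Plate       = "Unknown.4",
  Well        = "Unknown.5"
)

## Append a copy of each renamed field under its legacy name.
add_legacy_copies <- function(fields, map = legacy_map) {
  present <- intersect(names(map), names(fields))  # renamed fields actually present
  copies <- fields[present]
  names(copies) <- unname(map[present])
  c(fields, copies)
}

## Stand-in for a renamed idat$Unknowns (values from the minfiDataEPIC example)
fields <- list(
  Manifest = "", Label = "R07C01",
  SampleID = "WG0006776-BCDG03_NA12878",
  Plate = "WG0006776-BCD", Well = "G03"
)
both <- add_legacy_copies(fields)
names(both)
```

Code that reads `both[["Unknown.2"]]` keeps working, while new code can use `both[["SampleID"]]`. The cost is that every renamed value is stored twice in the list.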

kasperdanielhansen commented 2 years ago

I would drop backwards compatibility. This is better, and I doubt anyone has been using it. By "breaking" it, we silently alert potential upstream users that we have improvements.

-- Best, Kasper