Closed dankelley closed 1 year ago
Matrices will also have to be handled specially (on reconstitution) because YAML has no way to represent matrices (or at least yaml::as.yaml() has no way to do that, and my web searching suggests that YAML has no way, either).
The workaround is to construct a matrix at the reconstitution phase. This will be necessary only in certain files (e.g. adp files can have a rotation matrix), so the author of reconstitution code will need to be aware of this. But those users ought to have some skill (we are not talking about using Excel here) and I think it will be sufficient to (1) document the special-case items and (2) insert a class-specific explanation as a global attribute in the ncdf file.
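As a rough sketch of that workaround (not the package's actual code), a matrix can be stored as a flat vector plus a companion dimension item, and rebuilt at reconstitution. The helper names flattenMatrix() and rebuildMatrix(), the "Dimension" suffix, and the matrix values are all hypothetical, and a plain list stands in for whatever a YAML parser would return.

```r
# Hypothetical helper: replace a matrix by a flat vector plus a dimension item.
flattenMatrix <- function(metadata, name) {
    m <- metadata[[name]]
    metadata[[paste0(name, "Dimension")]] <- dim(m)
    metadata[[name]] <- as.vector(m) # column-major order, dim dropped
    metadata
}

# Hypothetical helper: rebuild the matrix and drop the dimension item.
rebuildMatrix <- function(metadata, name) {
    dimName <- paste0(name, "Dimension")
    d <- metadata[[dimName]]
    metadata[[name]] <- matrix(metadata[[name]], nrow = d[1], ncol = d[2])
    metadata[[dimName]] <- NULL # do not leak the helper item into the result
    metadata
}

# A made-up 3x3 matrix, standing in for a real rotation matrix.
original <- list(transformationMatrix = matrix(1:9 / 10, nrow = 3))
stored <- flattenMatrix(original, "transformationMatrix")
restored <- rebuildMatrix(stored, "transformationMatrix")
```

The cost of this approach is visible in rebuildMatrix(): the reader has to know which items are matrices and which items exist only to carry dimensions.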
Another possibility is to use JSON. I just did some checking and, like YAML, it cannot handle the expression type, so a tweak will be needed on units. However, it can handle matrices. Well, with an exception: it doesn't seem to handle matrices of raw items:
> library(oce)
> library(jsonlite) # assumed; the session does not show where toJSON() comes from
> data(adp)
> madp <- adp@metadata
> madp$codes
[,1] [,2]
[1,] 7f 7f
[2,] 00 00
[3,] 80 00
[4,] 00 01
[5,] 00 02
[6,] 00 03
[7,] 00 04
> toJSON(madp$codes)
Error in dim(m) <- dim(x) :
dims [product 14] do not match the length of object [1]
> toJSON(as.integer(madp$codes))
[127,0,128,0,0,0,0,127,0,0,1,2,3,4]
> C <- madp$codes
> C <- as.integer(madp$codes)
> dim(C) <- dim(madp$codes)
> toJSON(C, pretty=TRUE)
[
[127, 127],
[0, 0],
[128, 0],
[0, 1],
[0, 2],
[0, 3],
[0, 4]
]
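The session above covers only the encoding direction. A minimal sketch of the full round trip follows, assuming that toJSON() and fromJSON() come from the jsonlite package and using a small made-up raw matrix in place of madp$codes.

```r
library(jsonlite) # assumption: the JSON package used in the session above

# A small raw matrix, made up for illustration (same type as madp$codes).
codes <- matrix(as.raw(c(0x7f, 0x00, 0x80, 0x7f, 0x00, 0x00)), ncol = 2)

# Encode: convert raw to integer, restoring the dim that as.integer() drops.
asInteger <- as.integer(codes)
dim(asInteger) <- dim(codes)
json <- toJSON(asInteger)

# Decode: fromJSON() simplifies the nested arrays into an integer matrix,
# which is then converted back to raw, again restoring the dim.
decoded <- fromJSON(json)
restored <- as.raw(decoded)
dim(restored) <- dim(decoded)
```

The reader still has to know that this particular item should end up as raw, since the JSON text itself only records integers.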
I am now leaning towards JSON, rather than YAML. Here's why:
transformationMatrix is, indeed, a matrix. And its dimension will have to be inferred from other knowledge or from an additional item called maybe transformationMatrixDimension. But then the code would have to use that second thing to reformat the first thing, and then remember not to include that second thing in the results.
oce doesn't use it, anyway (IIRC).
I've made a test code (not pushed, and will be in a new branch called JSON), and it seems to work (click Details to see for CTD, and next issue for ADP).
Since I'm using this issue to take notes, the following shows how to reconstitute expressions.
> ctd[["temperatureUnit"]]$unit
expression(degree * C)
> as.character(ctd[["temperatureUnit"]]$unit)
[1] "degree * C"
> parse(text=as.character(ctd[["temperatureUnit"]]$unit))
expression(degree * C)
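The same idea can be wrapped in a pair of helpers that convert a whole unit item in each direction. This is only a sketch: the helper names unitToText() and textToUnit() are hypothetical, and it assumes a unit is stored as a list holding an expression named unit and a character value named scale (the scale value used here is illustrative).

```r
# Hypothetical helper: make a unit item storable as plain text.
unitToText <- function(u) {
    list(unit = as.character(u$unit), scale = u$scale)
}

# Hypothetical helper: rebuild the expression from the stored text.
textToUnit <- function(u) {
    list(unit = parse(text = u$unit, keep.source = FALSE), scale = u$scale)
}

temperatureUnit <- list(unit = expression(degree * C), scale = "ITS-90")
stored <- unitToText(temperatureUnit) # contains only character values
restored <- textToUnit(stored) # unit is an expression again
```

Setting keep.source = FALSE keeps parse() from attaching source-reference attributes, so the restored expression carries nothing beyond the parsed call.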
To the JSON branch, I've added new low-level functions. This includes tests in the test suite, for the ctd and adp built-in datasets. I will try some other datasets, to find remaining special cases. So far, the special cases involve POSIX times (which in JSON become simple character strings), unit expressions, and the codes matrix from read.adp.rdi().
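For the POSIX-time special case, the round trip can be sketched in base R as follows. This is an illustration only, not necessarily how the package does it: the time value is made up, and the format string and the use of UTC are assumptions.

```r
# A time as it might appear in metadata (value made up for illustration).
startTime <- as.POSIXct("2023-06-18 10:21:43", tz = "UTC")

# Store: write an unambiguous UTC string, which JSON can hold as text.
stored <- format(startTime, "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Restore: convert the string back, stating the time zone explicitly.
restored <- as.POSIXct(stored, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
```

With this format string, fractional seconds are dropped, so times are recovered only to the nearest whole second.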
commit 6f88923b12ee143f30adc4ea16581100f4388d39
Author: dankelley <kelley.dan@gmail.com>
Date: Sun Jun 18 10:21:43 2023 -0300
add json2metadata() and metadata2json()
These will be used by oce2ncdf() and ncdf2oce(), respectively.
This work has been completed in 'main' commit b7d7dd65cd3fa81d8f498e715a628d3dfae71a3b:
?ocencdf explains some minor post hoc conversions that are required for full recovery of the metadata. (These are done in the package but are unlikely to be desired for other language approaches.)
The advantage of this is that I would be very surprised if e.g. Python, Julia, or any other analysis language could not translate YAML into something analogous to the list object in R.
The following shows how this can work. Note that as.yaml() doesn't handle expression objects (which I discovered by trial and error).
I'll do this tomorrow.
Output: