pik-piam / magclass

R package | Data Class and Tools for Handling Spatial-Temporal Data
GNU Lesser General Public License v3.0
4 stars 24 forks source link

added consistency check to mbind #158

Closed tscheypidi closed 1 year ago

tscheypidi commented 1 year ago

This check detects if the overall object size (sum of length of all objects) changes during mbind and throws an error if that happens.

tscheypidi commented 1 year ago

this addresses the problem explained in issue #157. While it does not prevent the initial expansion of the object it detects the issue immediately and throws an error.

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q commented 1 year ago

All this change is doing is giving a slightly less cryptic error message, without any indication of the duplicated names (which is easy to do and greatly facilitates debugging), iff the magpie object with duplicated names is passed as the first argument:

> identical(mbind(p, mbind(p, p)), p[, , c(1:2, 1:2, 1:2)])
[1] TRUE

> identical(mbind(mbind(p, p), p), p[, , c(1:2, 1:2, 1:2)])
Error in mbind(mbind(p, p), p) : 
  Invalid object (number of values changed during mbind). Does the data contain duplicates?
In addition: Warning message:
In .dimextract(x, k, 3, pmatch = pmatch, invert = invert) :
  Your dimnames in dim=3 contain duplicates! This might lead to erronous results and bad code performance. Please try to avoid duplicates in dimnames under all circumstances!

What is the point of this warning

https://github.com/pik-piam/magclass/blob/9a06c8bb82696f520ef1d99da6b6b400f702ddc4/R/magpie-class.R#L233-L234

if there is a test enforcing exactly that behaviour?

https://github.com/pik-piam/magclass/blob/9a06c8bb82696f520ef1d99da6b6b400f702ddc4/tests/testthat/test-mbind.R#L7-L7

Either magpie objects support duplicated names (for whatever reason), or they do not. Pick one.

Then, according to your choice, either prevent objects with duplicated names from being created (like it is done for the "spatial" and "temporal" dimensions) and give an informative error message

  else {
    if (any(duplicated(elems)))
      stop("Some names occur more than once! Cannon handle this case! ",
           "Duplicated names: ",
           paste(unique(elems[duplicated(elems)]), collapse = ", "))

    output <- new("magpie", abind::abind(inputs, along = 3))
  }

or else prevent objects from exploding inside mbind()

    if (!diffdata &  ndata(inputs[[1]]) > 1) inputs[[i]] <- inputs[[i]][, , unique(getNames(inputs[[1]]))]