Apparently mselect can't subset on a list of elements that is too long. It works up to ~2000 entries but fails with a regex error for larger lists, and coupled REMIND-MAgPIE results have 5272 at the time of writing. Below is a reproducible example on the cluster; replace mat with any 3-dimensional magpie object with a large number of entries in the third dimension.
The error happens in .mselectSupport. It creates a regex that matches all the names in the list, which grep then uses to find which elements match it.
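To make the mechanism concrete, here is a minimal base R sketch of that kind of pattern. It is not the actual .mselectSupport code: allNames, wanted, and escapeRegex are hypothetical stand-ins, and the anchoring is an assumption made only for illustration.

```r
# Hypothetical names standing in for the third dimension of a magpie object.
allNames <- paste0("Variable|Sub", seq_len(50))
wanted <- allNames[c(3, 17, 42)]

# Escape regex metacharacters so that each name is matched literally
# (illustrative helper, not part of magclass).
escapeRegex <- function(x) gsub("([.|()\\^{}+$*?]|\\[|\\])", "\\\\\\1", x)

# One alternation containing every requested name.
pattern <- paste0("^(", paste(escapeRegex(wanted), collapse = "|"), ")$")
hits <- grep(pattern, allNames)

# The pattern grows linearly with the number and length of the names.
patternBytes <- nchar(pattern, type = "bytes")
```

With three names the pattern is tiny, but every additional name adds its full length plus a separator, so thousands of long variable names produce a pattern of several hundred kilobytes.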
The likely problem is that this regex can become huge if many different names are passed (or if the names are very long, for that matter). By default, grep uses a POSIX regex implementation. As per the documentation:
The implementation shall support any regular expression that does not exceed 256 bytes in length.
It apparently works beyond that too, but the string in the example has 337741 bytes, which seems to be too much. In the test it first crashes at around 2500 variable names, which corresponds to a string of 180 KB. Using the PCRE regex standard with perl = TRUE in grep unfortunately yields the same result; that standard apparently also has a limit of around 64 KB in the C library implementation.
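One possible workaround, sketched here in base R under assumptions, is to split the names into chunks so that no single pattern gets near those limits, then merge the matches. grepChunked and its chunkSize default are hypothetical and not part of magclass, and the example names contain no regex metacharacters, so no escaping is done.

```r
# Match a long list of names in several smaller regex calls.
grepChunked <- function(wanted, allNames, chunkSize = 500) {
  # Assign each requested name to a chunk of at most chunkSize names.
  chunks <- split(wanted, ceiling(seq_along(wanted) / chunkSize))
  hits <- lapply(chunks, function(chunk) {
    # Each pattern stays small: one alternation per chunk.
    grep(paste0("^(", paste(chunk, collapse = "|"), ")$"), allNames)
  })
  # Merge the per-chunk positions into one sorted index vector.
  sort(unique(unlist(hits, use.names = FALSE)))
}

# Hypothetical data: 6000 names, of which every second one is requested.
allNames <- sprintf("var%05d", 1:6000)
wanted <- allNames[seq(1, 6000, by = 2)]
hits <- grepChunked(wanted, allNames)
```

If only exact matches on complete names are needed, which(allNames %in% wanted) would avoid the regex entirely; the chunked version is shown because it keeps the pattern-matching behaviour.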