RobinHankin / disordR

https://robinhankin.github.io/disordR/
1 stars 0 forks source link

interesting error message #42

Closed RobinHankin closed 1 year ago

RobinHankin commented 1 year ago

set.seed(0) (a <- rdis()) A disord object with hash 5b7279f3c05d00cf1e8f999a755151e0451c56ec and elements [1] 9 4 7 1 2 6 3 8 5 (in some order) a[a<3] <- a[a<5] + 1000 Warning message: In jj[elements(i)] <- elements(value) : number of items to replace is not a multiple of replacement length a A disord object with hash 5b7279f3c05d00cf1e8f999a755151e0451c56ec and elements [1] 9 4 7 1004 1001 6 3 8 5 (in some order)

RobinHankin commented 1 year ago

Actually the problem is much more serious:

> a <- disord(1:9)
> a[a<3] + a[a>7]
A disord object with hash 141d1b04a56b313bb4a90429ac5b91f69fbd3748 and elements
[1]  9 11
(in some order)
> 

This should return an error, as a[a<3] has elements 1 and 2 (in some order) and a[a>7] has elements 8 and 9 (in some order) and although it returns 1+8 and 2+9 = 9,11 it could have returned 1+9 and 2+8 = 10,10. This did not return an error because the hash codes of a[a<3] and a[a>7] match. And it silently returned a result which is, at least in principle, implementation-specific. At least, I think so. In any event, the package behaviour is not documented.

RobinHankin commented 1 year ago

On reflection it is not clear to me that a[a<3] + a[a>7] above should return an error. It depends on whether you want to think of "a[a<3]" as having an independent existence; specifically, an existence that is independent of a[a>7]. Arguably, both have an order inherited from a. If you consider the order ambiguity of a[a<3] to be inherited purely from a then a[a<3] + a[a>7] should not return an error: whatever implementation-specific order a has, the sum is well-defined up to its own order. If, on the other hand, you regard the extraction as introducing a new ordering ambiguity, then a[a<3] + a[a>7] is poorly defined [above it was either 9,11 or 10,10] and should return an error. I am not explaining this very well.

RobinHankin commented 1 year ago

In any event, it is possible to make a[a<3] + a[a>7] return an error, while still allowing the package to pass R CMD check by modifying setMethod("[", signature("disord",i="disord",j="missing",drop="ANY"). Replace

out <- disord(out, hashcal(c(hash(x),hash(i))),drop=FALSE)

with

out <- disord(out, hashcal(c(hash(x),hash(i),i)),drop=FALSE)

This has the effect of including the actual index argument i in the hashcode of the returned value in something like a[a<3], so the hashcode of a[a>7] will differ from that of a[a<3] in the modified code but not the original code:

ORIGINAL code

> a <- disord(1:9)
> hash(a[a<3])
[1] "141d1b04a56b313bb4a90429ac5b91f69fbd3748"
> hash(a[a>7])
[1] "141d1b04a56b313bb4a90429ac5b91f69fbd3748"
> 

MODIFIED code

> a <- disord(1:9)
> hash(a[a<3])
[1] "6646c1560b8258de2e837e61b677b236d03ff8f9"
> hash(a[a>7])
[1] "8ab6ca951f878bf843df74a4b025acde4adf667b"
> 

Still unsure whether this is desirable or not [modified code clean under R CMD check and also passes revdep_check()]

RobinHankin commented 1 year ago

I am getting more certain that a[a<3] + a[a>7] should return an error. This on the grounds that disordR is intended to be used by packages such as mvp, not to be attached. How does this change manifest itself in mvp idiom?

Under ORIGINAL code:

> (a <- rmvp(9))
mvp object algebraically equal to
3 a b^2 c^3 d^7 f^6 + a^5 b^5 e^9 f^5 + 8 a^6 b^3 c^11 f^3 + 9 a^6 d^8 f^2 + 7
a^11 b^6 d^2 e^6 f^2 + 4 a^15 e^9 f^6 + 6 b^4 c^5 f^11 + 2 b^5 d^12 f^3 + 5 b^5
e^4 f^6
> coeffs(a)[coeffs(a)<3] + coeffs(a)[coeffs(a)>7]
A disord object with hash d81690c2d9824972ceaa64f5774de95290073820 and elements
[1]  9 11
(in some order)
> 

It is pretty clear to me that the sum above is not defined. The result (9 11) is meaningless, or at least as meaningful as 10 10. The operation corresponding to the idiom does not seem to make sense in the context of coefficients as managed by the STL map class: "take every coefficient less than three and add it to every coefficient greater than 7"...yes, but in what order? The fact that coeffs(a) is in an implementation-specific order does not seem to be "enough" here: the two addends [sic] inherit the "same" order from the hash, but the result does not seem to have any algebraic meaning.