Closed jimz7 closed 1 month ago
Thanks for your interest in our work! Indeed, the naming of the metrics on the fuzzy logic task is a bit confusing: what we call OOD in the paper (out-of-distribution combinations of terms) is reported as Test R2 in the code. What the code calls OOD R2 is an even harder setting in which we hold out whole terms. For easy reference, the last paragraph on page 5 describes this setting:
Our compositional split only contains novel combinations of known terms. A priori, it might be possible that the system learns to compose the basic fuzzy logic operations in a way that allows it to generalize to any fuzzy logic function, including those that contain novel combinations of unknown terms. However, testing on functions obtained as novel combinations of K unknown terms, we find that none of the models considered here is able to solve such a task.
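As general context (not specific to this repo's implementation): a negative R2 is not an error in the metric itself. Since R2 = 1 - SS_res/SS_tot, it drops below zero whenever a model's predictions are worse than simply predicting the mean of the targets, which is exactly what the quoted paragraph reports for the held-out-terms setting. A minimal sketch with a hypothetical `r2_score` helper:

```python
import numpy as np

def r2_score(y_true, y_pred):
    # R^2 = 1 - SS_res / SS_tot; it is negative whenever the model
    # predicts worse than the constant mean-of-targets baseline.
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

y = np.array([0.1, 0.5, 0.9, 0.3])
print(r2_score(y, y))        # perfect fit -> 1.0
print(r2_score(y, 1.0 - y))  # anti-correlated predictions -> negative
```

So a negative OOD R2 on the held-out-terms split simply means the model does worse than the mean baseline there.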
I tried
python run.py --config 'configs/logic.py:logic_4var_2term;linear_hypatt'
and obtained a negative OOD R2 (around -0.18), while the train and test R2 are both positive (around 0.87) on the fuzzy logic task. Running
python run.py --config 'configs/logic.py:logic_4var_2term;transformer'
also gives a negative OOD R2 (around -0.49). How can I fix this?