smonsays / hypernetwork-attention

Official code for the paper "Attention as a Hypernetwork"
MIT License

Negative OOD R2 #1

Closed jimz7 closed 1 month ago

jimz7 commented 1 month ago

I ran python run.py --config 'configs/logic.py:logic_4var_2term;linear_hypatt' on the fuzzy logic task and obtained a negative OOD R2 (around -0.18), while the training and test R2 are both positive (around 0.87). Running python run.py --config 'configs/logic.py:logic_4var_2term;transformer' also gives a negative OOD R2 (around -0.49). How can I fix this?

smonsays commented 1 month ago

Thanks for your interest in our work! Indeed, the naming of the metrics on the fuzzy logic task is a bit confusing: what we call OOD in the paper (out-of-distribution combinations of terms) is reported as Test R2 in the code. What the code calls OOD R2 is an even more difficult setting where we hold out whole terms. For easy reference, the last paragraph on page 5 of the paper refers to this setting:

Our compositional split only contains novel combinations of known terms. A priori, it might be possible that the system learns to compose the basic fuzzy logic operations in a way that allows it to generalize to any fuzzy logic function, including those that contain novel combinations of unknown terms. However, testing on functions obtained as novel combinations of K unknown terms, we find that none of the models considered here is able to solve such a task.
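
As a side note on reading the numbers: a negative R2 is not a sign of a broken run by itself. Under the standard definition, R2 = 1 - SS_res / SS_tot, so the score drops below zero whenever the predictions are worse than always predicting the mean of the targets. The sketch below illustrates this in plain Python. It assumes the standard definition; the helper `r2_score` and the toy values are hypothetical and are not taken from this repository, whose metric implementation may differ in detail.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (standard definition)."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of a baseline that always predicts the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy targets in [0, 1], loosely mimicking fuzzy logic outputs (made-up values).
targets = [0.1, 0.4, 0.6, 0.9]
good_preds = [0.15, 0.35, 0.65, 0.85]   # close to the targets
mean_preds = [0.5, 0.5, 0.5, 0.5]       # constant mean baseline
bad_preds = [0.9, 0.6, 0.4, 0.1]        # systematically wrong

print("good:", r2_score(targets, good_preds))  # positive, close to 1
print("mean:", r2_score(targets, mean_preds))  # approximately 0
print("bad: ", r2_score(targets, bad_preds))   # negative
```

Read this way, a negative OOD R2 means that on functions built from held-out terms the model does worse than a constant predictor, which matches the statement in the quoted paragraph that none of the models solves this setting.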