Dear scholar,
I am unclear about how your GQA-OOD testdev results were obtained.
I think they could come from one of two setups:
1) The model was trained on the GQA balanced train and validation sets, then tested on the GQA-OOD testdev set.
2) The model was trained on the GQA balanced train set and the GQA-OOD validation set, then tested on the GQA-OOD testdev set.
Both setups seem plausible to me. Which one did you use in your paper?