Closed csimo005 closed 9 months ago
Yes, that is absolutely correct!
We did not perform any directed ablation of negative sampling in ViNT; this was a design choice carried over from prior work. Table 3 in the ViNG paper compares against this ablation.
In the ViNT dataloader, negative goals may be loaded. Is the purpose of this to give the distance prediction head some examples of very far distances, and essentially to prevent some sort of mode collapse when training the distance prediction head? And the action predictions are then masked out, since you don't have any supervision on those actions and you don't want to hurt the training of the action prediction?
Is my understanding correct, and did I miss an ablation over this augmentation in the papers? I do see an option that looks like it's there to turn this on and off, but tracing the code I can't see how it actually accomplishes this.
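To make the question concrete, here is a minimal sketch of the masking as I understand it. This is not the ViNT code: `Sample`, `masked_losses`, `is_negative`, and `max_dist` are hypothetical names, and labelling negatives with a fixed maximum distance is my assumption about how "very far" would be encoded.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Sample:
    """One training example (hypothetical structure, not the ViNT dataloader)."""
    pred_dist: float
    true_dist: float
    pred_action: Sequence[float]
    true_action: Sequence[float]
    is_negative: bool  # True if the goal was sampled from an unrelated trajectory


def masked_losses(batch: List[Sample], max_dist: float = 20.0) -> Tuple[float, float]:
    """Return (distance_loss, action_loss) with actions masked on negatives."""
    dist_terms = []
    action_terms = []
    for s in batch:
        # Assumption: negatives are labelled with the maximum distance,
        # so the distance head sees "very far" examples.
        target = max_dist if s.is_negative else s.true_dist
        dist_terms.append((s.pred_dist - target) ** 2)

        # Negatives carry no action supervision, so they are skipped here.
        if not s.is_negative:
            sq_err = [(p - t) ** 2 for p, t in zip(s.pred_action, s.true_action)]
            action_terms.append(sum(sq_err) / len(sq_err))

    dist_loss = sum(dist_terms) / len(dist_terms)
    # If the batch holds only negatives, there is nothing to supervise.
    action_loss = sum(action_terms) / len(action_terms) if action_terms else 0.0
    return dist_loss, action_loss
```

With this structure, every sample contributes to the distance loss, but only positive samples contribute to the action loss, so the arbitrary action labels attached to negative goals never produce a gradient.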