In our training loop, every maze in the batch gets truncated, losing tokens from the end of its sequence. The truncated sequences often lack the start or end tokens, which would make training impossible. Alex's sweeps have been using a different training loop that does not have this problem.
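To make the failure mode concrete, here is a minimal sketch of how this kind of end truncation can arise in a batching/collate step, along with one possible fix. This is an assumed reconstruction, not our actual code: the token names (PATH_START, PATH_END, PAD), the function names, and the fixed max_len are all hypothetical.

```python
import torch

# Hypothetical special tokens; the real vocabulary likely differs.
PAD, PATH_START, PATH_END = 0, 1, 2

def collate_buggy(sequences, max_len):
    """Pad/truncate a batch to max_len by cutting tokens off the END.

    Any maze longer than max_len silently loses its final tokens,
    including PATH_END, so the model never sees the end token.
    """
    batch = torch.full((len(sequences), max_len), PAD, dtype=torch.long)
    for i, seq in enumerate(sequences):
        seq = seq[:max_len]  # <- truncation happens here
        batch[i, : len(seq)] = seq
    return batch

def collate_fixed(sequences, max_len):
    """One possible fix: drop sequences that do not fit, so every
    example that is trained on keeps its start/end tokens intact."""
    kept = [s for s in sequences if len(s) <= max_len]
    batch = torch.full((len(kept), max_len), PAD, dtype=torch.long)
    for i, seq in enumerate(kept):
        batch[i, : len(seq)] = seq
    return batch

if __name__ == "__main__":
    seqs = [
        torch.tensor([PATH_START, 5, 6, 7, PATH_END]),
        torch.tensor([PATH_START, 5, 6, 7, 8, 9, PATH_END]),
    ]
    bad = collate_buggy(seqs, max_len=6)
    print(PATH_END in bad[1])  # False: the longer maze lost its end token
```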
Example of a dataset of five 3x3 mazes: