Great work, and thanks for open-sourcing the code. In the Atari experiments, is there a reason the number of final tokens is set to "2 * dataset_length * block_size" in the code? In the Appendix, this hyperparameter is listed as 2 * 50000 * K. I don't understand where the factor of 2 comes from; I would have expected the value to be "dataset_length * block_size". Please correct me if I have missed something. Thanks.