Hi,
I would like to confirm whether you used the dataset splits provided by the Pile for training. According to the Pile's description:
"The Pile is provided as train, validation, and testing splits. The validation and testing components each contain 0.1% of the data, sampled uniformly at random."
Did you train the model directly on the Pile's train split, or did you mix the data and create your own splits?
Thank you very much! Best regards, Bo Yang