Open tisabe opened 1 month ago
The validation split should be generated automatically from the training data if no validation path is provided. Test data should not appear in train.py at all by default, only in specific cases.
Also, evaluating on the training data during training should be optional and disabled by default. This would save evaluation time.
Pro: keeping each split in its own file is a safer way to make runs repeatable with the same splits than handling everything through the split files. It also makes it easier to apply a trained model to new test data.
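
A rough sketch of the proposed fallback, in case it helps the discussion. The function name `resolve_validation_split` and the parameters `val_fraction` and `seed` are hypothetical and not taken from the existing train.py. It assumes the training data can be treated as a plain sequence, and that a fixed seed is acceptable for making the autogenerated split repeatable.

```python
import random
from typing import List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def resolve_validation_split(
    train_items: Sequence[T],
    val_items: Optional[Sequence[T]] = None,
    val_fraction: float = 0.1,  # hypothetical default, not from the repo
    seed: int = 0,              # hypothetical default, not from the repo
) -> Tuple[List[T], List[T]]:
    """Return (train, val). Split off val from train only if none was given."""
    if val_items is not None:
        # A validation set was provided explicitly: use both as they are.
        return list(train_items), list(val_items)

    # No validation data given: hold out a seeded random subset of train.
    indices = list(range(len(train_items)))
    random.Random(seed).shuffle(indices)
    n_val = int(round(len(indices) * val_fraction))
    val_idx = set(indices[:n_val])

    # Preserve the original ordering within each split.
    train = [x for i, x in enumerate(train_items) if i not in val_idx]
    val = [x for i, x in enumerate(train_items) if i in val_idx]
    return train, val
```

With the same `seed` and the same training data, the held-out subset is identical across runs, and test data never enters the function. Writing the resulting splits out to separate files would still be the safer option for long-term repeatability.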