Closed clintonlau closed 3 years ago
Hi @clintonlau
If I recall, that was probably to evaluate the model and print performance during training. From the Keras documentation (https://keras.io/api/models/model_training_apis/#fit-method):
> `validation_data`: Data on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data.
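As a minimal sketch of that behavior (assuming TensorFlow/Keras is installed; the shapes and toy data below are illustrative, not from the repo), passing the dev split as `validation_data` to `fit()` only evaluates on it after each epoch; the weights are never updated on that data:

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
# Toy stand-ins for the train and dev splits (hypothetical shapes).
X_train, Y_train = rng.normal(size=(64, 10)), rng.integers(0, 2, size=(64, 1))
X_dev, Y_dev = rng.normal(size=(16, 10)), rng.integers(0, 2, size=(16, 1))

model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# The dev split is only used for end-of-epoch evaluation, not for training.
history = model.fit(X_train, Y_train,
                    validation_data=(X_dev, Y_dev),
                    epochs=2, verbose=0)

# history.history now contains val_loss / val_accuracy computed on the dev split.
```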
The test set for this particular dataset does not include the ground-truth labels (at least it didn't when I worked on it), so the dev set effectively served as the test set when reporting results.
Oh yes, my bad, I didn't mean data leakage. My initial assumption was that you were using the dev set as an unseen held-out set and splitting the training set into train/val splits, but instead you used the dev set for both hyperparameter tuning and final model evaluation, since the test set was not available.
Thanks for the clarification.
How was the dev set used for hyperparameter tuning?
Please correct me if I am wrong, but I see the dev set as also being used for hyperparameter tuning, since it was passed to the fit() method and used to evaluate the model after each epoch. The model was never trained on the dev set directly, but monitoring the validation results (loss, accuracy, etc.) during training indirectly influences your hyperparameter choices.
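For contrast, a hypothetical sketch of the cleaner protocol being described (none of this is from the repo): carve a validation split out of the original training set for tuning, so the dev set is touched only once, for the final reported numbers.

```python
import numpy as np

rng = np.random.default_rng(42)
n_train = 100   # hypothetical size of the original training set
val_size = 20   # hypothetical size of the tuning split

# Shuffle the training indices, then partition them into val and train.
idx = rng.permutation(n_train)
val_idx, train_idx = idx[:val_size], idx[val_size:]

# train_idx and val_idx partition the original training set with no overlap;
# the dev set plays no role during hyperparameter tuning.
assert set(train_idx).isdisjoint(val_idx)
assert len(train_idx) + len(val_idx) == n_train
```

Hyperparameters would then be chosen by monitoring `val_idx` performance, and the dev set evaluated a single time with the final model.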
Hi @talhanai,
I hope you can help me out with a question about your trainLSTM.py code.
In particular, I am having trouble understanding why you use X_dev and Y_dev as both the validation data and the test data. Using them for both validation and testing would result in data leakage.
From reading your paper, I understand that you were only working with the training and development sets of the DAIC dataset. So here, I am assuming that `X_train` and `Y_train` come from the training set, and `X_dev` and `Y_dev` come from the development set. Any insights would be very much appreciated!