Hello!
When I finish training and test the model, the test results are far from the training results. For example, the average training reward is 80, but the test result may be around 40.
Could you please help explain the difference between training and testing?
Thanks!