zhiqic / Rethinking-Counting

[CVPR 2022] Rethinking Spatial Invariance of Convolutional Networks for Object Counting

How to evaluate and tune on various datasets #4

Closed adrian-dalessandro closed 1 year ago

adrian-dalessandro commented 1 year ago

I'm having trouble understanding how counting papers evaluate and tune on various datasets. The ShanghaiTech B dataset has no designated validation set. I'm having trouble following your code, but it seems you track metrics on the test set during training and then save that model. Are you selecting the best model directly from test-set performance during training, or how do you determine the epoch at which to stop? Thank you, and amazing paper!

zhiqic commented 1 year ago

We evaluated the model at different epochs and selected the best one. That part of the code is not included in the repository; you can add your own. Sorry, it's hard to include every detail in one repo.
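
For readers trying to reproduce this, below is a minimal sketch of post-hoc checkpoint selection: evaluate every saved checkpoint on a held-out split and keep the one with the lowest MAE. This is an illustration, not the authors' actual script; `build_model`, `val_loader`, and the `checkpoints/epoch_*.pth` naming scheme are assumptions.

```python
import glob
import torch

def mae_on_split(model, loader, device="cuda"):
    """Mean absolute counting error of a model over a data loader."""
    model.eval()
    abs_err, n = 0.0, 0
    with torch.no_grad():
        for images, gt_counts in loader:  # gt_counts: per-image ground-truth counts
            pred_density = model(images.to(device))
            # Integrate the predicted density map to get a per-image count.
            pred_counts = pred_density.sum(dim=(1, 2, 3))
            abs_err += (pred_counts.cpu() - gt_counts).abs().sum().item()
            n += gt_counts.numel()
    return abs_err / n

def select_best_checkpoint(ckpt_glob, build_model, loader, device="cuda"):
    """Evaluate each saved checkpoint and return the path with the lowest MAE."""
    best_path, best_mae = None, float("inf")
    for path in sorted(glob.glob(ckpt_glob)):
        model = build_model().to(device)
        model.load_state_dict(torch.load(path, map_location=device))
        mae = mae_on_split(model, loader, device)
        if mae < best_mae:
            best_path, best_mae = path, mae
    return best_path, best_mae

# Hypothetical usage:
# best_path, best_mae = select_best_checkpoint("checkpoints/epoch_*.pth",
#                                               build_model, val_loader)
```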

adrian-dalessandro commented 1 year ago

Thank you for the reply! Are you saying that you split the ShanghaiTech B training set into separate training and validation sets to determine the best epoch at which to stop?

I noticed that you used the C3 crowd counting framework. They share the training curves for each model in the Git repository, and one thing I noticed is that the training and validation curves were very noisy! How did you get around this when selecting the best epoch? Did you see similar noise during training?
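
(For context, one common way to make epoch selection less sensitive to such noise is to smooth the per-epoch validation MAE before taking its minimum. The sketch below is purely illustrative and not necessarily what the authors did; `val_mae_per_epoch` is a hypothetical list of per-epoch validation MAE values.)

```python
import numpy as np

def best_epoch_smoothed(val_mae_per_epoch, window=5):
    """Pick the best epoch from a noisy MAE curve after moving-average smoothing."""
    mae = np.asarray(val_mae_per_epoch, dtype=float)
    kernel = np.ones(window) / window
    smoothed = np.convolve(mae, kernel, mode="valid")  # moving average
    i = int(np.argmin(smoothed))
    # Map the smoothed index back to an (approximate) original epoch index.
    return i + window // 2, smoothed[i]

# Hypothetical usage with made-up numbers:
# best_epoch, smoothed_mae = best_epoch_smoothed([12.1, 9.8, 10.5, 9.2, 9.9, 8.7, 9.4], window=3)
```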