Hi, sorry for the late response!
We initially implemented SVM validation to get early feedback on the pre-training progress. However, we observed no significant correlation between SVM performance during pre-training and the downstream task performance during fine-tuning. We had many cases where the SVM performance was strong while the fine-tuning performance was weak and vice versa. As a result, we have chosen not to include these values in our report.
As evident from the screenshot, there is even a drop in SVM accuracy over the course of pre-training.
For reference, our highest ever SVM validation accuracy on ModelNet40 was 93.68%, but when we fine-tuned that checkpoint the accuracy stayed below 94%.
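For reference, the probe is conceptually just a linear SVM fit on frozen encoder features. A minimal sketch of that idea using scikit-learn's LinearSVC (the feature extraction step and the C value here are placeholders, not necessarily what our evaluation code does):

```python
from sklearn.svm import LinearSVC

def svm_probe(train_feats, train_labels, val_feats, val_labels, C=0.01):
    """Fit a linear SVM on frozen encoder features and report train/val accuracy.

    train_feats / val_feats: (N, D) arrays of pooled point-cloud embeddings
    from the pretrained encoder. C=0.01 is a placeholder hyperparameter.
    """
    clf = LinearSVC(C=C, max_iter=10000)
    clf.fit(train_feats, train_labels)
    train_acc = clf.score(train_feats, train_labels)
    val_acc = clf.score(val_feats, val_labels)
    return train_acc, val_acc
```

Running such a probe every few pretraining epochs is what produces the accuracy curves discussed in this thread.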
I see. Thanks for the explanation. I noticed this as well, but I also noticed that the train/validation gap gets smaller for the point2vec model as pretraining progresses. When training some variations, I found I could get higher SVM performance, but the train/validation gap was worse.
I am wondering whether your best-performing fine-tuning model also had the smallest SVM train/validation gap during pretraining, while the model that reached >93% on the SVM had a much larger gap. If you still have the data readily accessible, I would be curious whether this is the case.
Thanks
Yes, I can take a look for you. What exactly do you mean by "train/validation gap"?
There are two statistics tracked for the SVM on each dataset (ModelNet40 and ScanObjectNN): the SVM training accuracy and the SVM validation accuracy. From what I can see, when I run the code for your best model and validate every 100 epochs, I get the following curves...
The SVM train/val accuracy on ModelNet40 is 94.8/90.7 at epoch 100, while at epoch 800 I see 93.1/90.9. This means the difference between training and validation performance is 4.1% near the beginning of pretraining and 2.2% at the end, so the SVM features at epoch 100 are much more prone to overfitting than those at epoch 800.
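Just to be explicit about what I mean by the gap, using the numbers above:

```python
# SVM train/val accuracy on ModelNet40 (numbers from the curves above)
train_100, val_100 = 94.8, 90.7
train_800, val_800 = 93.1, 90.9

print(train_100 - val_100)  # ~4.1: larger gap, features more prone to overfitting
print(train_800 - val_800)  # ~2.2: smaller gap at the end of pretraining
```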
While experimenting with some changes to the model, I noticed I could get a model with higher SVM validation accuracy, but whose SVM training accuracy saturates to almost 100%. This implies that those features are very prone to overfitting and are probably of lower quality than the features in your best model, even though they achieve higher SVM validation accuracy.
So I am wondering if the best model (the one which you present in the paper) also happened to be the one which showed the smallest gap between SVM train/validation accuracy at epoch 800. Sorry for the long explanation
Hi. Thanks for posting this code which is really nice and easy to use.
I looked through the paper and this repo, but I cannot find anything showing the progress of SVM validation on ModelNet40/ScanObjectNN features during the self-supervised pretraining. Could you possibly post a snapshot of those graphs if you have a chance?