Theoretically you are right: you could achieve better results with more components.
In reality, it does not work that easily. If you have a look at our paper (esp. Fig. 3), the number of parameters (= number of components - 1) only changes accuracy to some extent.
There are multiple reasons for this:
- Increasing the number of components leads to a huge number of possible linear combinations resulting in the same final shape (hard-to-learn ambiguities).
- Increasing the number of components increases the trainable network parameters and thereby the chance of overfitting (a minor effect in practice).
- The components are sorted by decreasing relevance, which means that the first 25-50 components usually cover around 95-98% of the data variance (a quick way to check this is sketched below).
For these reasons (and a few more minor ones), we chose to use a relatively small number of components to reduce compute and memory complexity; we have validated this choice.
Note: this phenomenon can also be observed in other model-based approaches like Active Appearance Models.
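A quick way to check how much variance a given number of components covers is to look at scikit-learn's `explained_variance_ratio_`. The snippet below is only an illustrative sketch with placeholder data (`landmarks` is a hypothetical array of flattened shapes), not code from this repository:

```python
# Sketch: cumulative explained variance of the first k PCA components.
# `landmarks` is placeholder data standing in for flattened shapes
# of shape (n_samples, 2 * n_points).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
landmarks = rng.normal(size=(500, 100))  # placeholder: 50 points -> 100 values

pca = PCA()
pca.fit(landmarks)

cumulative = np.cumsum(pca.explained_variance_ratio_)
for k in (25, 50):
    print(f"first {k} components explain {cumulative[k - 1]:.1%} of the variance")
```

With real shape data, this curve typically flattens quickly, which is what motivates truncating the basis.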
My dataset consists of rectangular shapes with 80 labeled points. I trained the model from scratch. I think my loss is not good, as it always stays above 2, and the results are not good. I don't know what is wrong with my training; I will check my code later.
PS: I also trained a Cascaded Pyramid Network (CPN), which gives a reasonable result, and I trained with dlib, which does not work well in some cases.
Are you trying to get it working on faces or another kind of data? A validation loss of 2 would be quite okay, I think (depending on how easy your shapes are). Maybe you have to play with some hyperparameters for your use case.
I do not have face datasets. Maybe something went wrong; I will check the code carefully and report back as soon as I get good results. Thanks.
Closing for now. Feel free to reopen if the issue persists, and please report back your results.
Assuming we have 50 keypoints, which correspond to 100 numbers:
In 'preparedatasets', the PCA components are computed by pca.fit() and have shape (100, 100), and pca.mean_ has shape (1, 100). Stacking pca.mean_ with pca.components_ gives shape (101, 100), which is reshaped to (101, 50, 2).
But in 'train_single_shapenet', we only load the first (51, 50, 2), so the full set of components is never used. Why is that? Is it because we only need the top 50 components? If I load the full set of components, can I get a better result?
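For reference, here is a minimal sketch of the preparation and slicing described above; variable names such as `shapes`, `full_basis`, and `train_basis` are illustrative assumptions, not identifiers from the repository:

```python
# Sketch of the described flow: fit PCA on flattened keypoints, stack the mean
# with the components, reshape to (n_components + 1, n_points, 2), and keep
# only the mean plus the top-k components for training.
import numpy as np
from sklearn.decomposition import PCA

n_points = 50                                 # 50 keypoints -> 100 numbers per shape
shapes = np.random.rand(1000, n_points * 2)   # placeholder training shapes

pca = PCA()
pca.fit(shapes)                               # components_ has shape (100, 100)

# mean + all components: (101, 100) -> (101, 50, 2)
full_basis = np.concatenate([pca.mean_[None, :], pca.components_], axis=0)
full_basis = full_basis.reshape(-1, n_points, 2)

# keep mean + top 50 components only: (51, 50, 2)
k = 50
train_basis = full_basis[:k + 1]
print(train_basis.shape)                      # (51, 50, 2)
```

Under this reading, the (51, 50, 2) slice is simply the mean shape plus the 50 most relevant components, which matches the truncation argument above.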