justusschock / shapenet

PyTorch implementation of "Super-Realtime Facial Landmark Detection and Shape Fitting by Deep Regression of Shape Model Parameters" predicting facial landmarks with up to 400 FPS
https://shapenet.rtfd.io
GNU Affero General Public License v3.0
342 stars 59 forks

pca components not fully loaded #8

Closed shuxp closed 5 years ago

shuxp commented 5 years ago

Assuming we have 50 keypoints, which are 100 numbers.

In 'preparedatasets', the PCA components are computed by pca.fit() on data of shape (100, 100). The shape of pca.mean_ is (1, 100). Stacking pca.mean_ and pca.components_ gives shape (101, 100), which is reshaped to (101, 50, 2).

But in 'train_single_shapenet', we only load the first 51 entries (shape 51×50×2), so the components are not fully loaded. Why is that? Is it that we only need the top 50 components? If I load the full components, can I get a better result?

```python
shapes = np.load(
    os.path.abspath(config_dict["layer"].pop("pca_path"))
)["shapes"][:config_dict["layer"].pop("num_shape_params") + 1]
```
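The shape construction being asked about can be reproduced with a small sketch using sklearn's PCA; the data and sizes below are hypothetical (500 random training shapes, not the repo's actual preprocessing):

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: 500 training shapes, 50 (x, y) landmarks each.
rng = np.random.default_rng(0)
landmarks = rng.normal(size=(500, 50, 2))

# Fit PCA on the flattened shapes, shape (500, 100).
pca = PCA()
pca.fit(landmarks.reshape(500, -1))

# Stack the mean on top of the components: (1 + 100, 100),
# reshaped to (101, 50, 2) -- the array stored as "shapes".
full_shapes = np.concatenate([pca.mean_[None], pca.components_]).reshape(-1, 50, 2)

# Training keeps only the mean plus the first num_shape_params components,
# i.e. 51 of the 101 rows -- this is the "[:num_shape_params + 1]" slice.
num_shape_params = 50
shapes = full_shapes[:num_shape_params + 1]
print(full_shapes.shape, shapes.shape)  # (101, 50, 2) (51, 50, 2)
```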
justusschock commented 5 years ago

Theoretically you are right. You could achieve better results with more components.

In reality, it does not work that easily. If you have a look at our paper (esp. Fig. 3), the number of parameters (= number of components - 1) only changes the accuracy to some extent.

There are multiple reasons for this:

  • Increasing the number of components leads to a huge number of possible linear combinations resulting in the same final shape (hard-to-learn ambiguities).
  • Increasing the number of components increases the trainable network parameters and thereby the chance of overfitting (a minor effect in practice).
  • The components are sorted by decreasing relevance, which means that the first 25-50 components usually cover around 95-98% of the data variance.

Due to these reasons (and a few more minor ones), we chose a relatively small number of components to reduce compute and memory complexity; we have validated this choice.

Note: This phenomenon can also be observed in other model-based approaches such as Active Appearance Models.
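The point about the leading components covering most of the variance can be checked directly via sklearn's `explained_variance_ratio_`; this sketch uses synthetic low-rank data (all sizes hypothetical) to mimic how real landmark data concentrates in a few components:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic shapes whose variance lives in ~10 latent directions,
# mimicking how real landmark data concentrates in the leading components.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 10))
mixing = rng.normal(size=(10, 100))
data = latent @ mixing + 0.05 * rng.normal(size=(500, 100))

pca = PCA().fit(data)
cumulative = np.cumsum(pca.explained_variance_ratio_)

# Smallest number of components covering at least 95% of the variance.
n_95 = int(np.searchsorted(cumulative, 0.95)) + 1
print(n_95, "of", data.shape[1], "components reach 95% variance")
```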

shuxp commented 5 years ago

> Theoretically you are right. You could achieve better results with more components.
>
> In reality, it does not work that easily. If you have a look at our paper (esp. Fig. 3), the number of parameters (= number of components - 1) only changes the accuracy to some extent.
>
> There are multiple reasons for this:
>
> • Increasing the number of components leads to a huge number of possible linear combinations resulting in the same final shape (hard-to-learn ambiguities).
> • Increasing the number of components increases the trainable network parameters and thereby the chance of overfitting (a minor effect in practice).
> • The components are sorted by decreasing relevance, which means that the first 25-50 components usually cover around 95-98% of the data variance.
>
> Due to these reasons (and a few more minor ones), we chose a relatively small number of components to reduce compute and memory complexity; we have validated this choice.
>
> Note: This phenomenon can also be observed in other model-based approaches such as Active Appearance Models.

My dataset consists of rectangle-like shapes with 80 annotated points. I trained the model from scratch. I think my loss is not good; it is always larger than 2. The results are not good. I don't know what is wrong with my training. I will check my code later.

PS: I trained with a cascaded pyramid network (CPN), which gets relatively good results. I also trained with dlib, which in some cases does not work well.

justusschock commented 5 years ago

Are you trying to get it working on faces or on other kinds of data? A validation loss of 2 would be quite okay, I think (depending on how easy your shapes are). Maybe you have to play with some hyperparameters for your use case.

shuxp commented 5 years ago

> Are you trying to get it working on faces or on other kinds of data? A validation loss of 2 would be quite okay, I think (depending on how easy your shapes are). Maybe you have to play with some hyperparameters for your use case.

I do not have face datasets. Maybe something went wrong. I will check the code carefully and report back as soon as I get good results. Thanks.

justusschock commented 5 years ago

Closing for now. Feel free to reopen if the issue persists, and please report back your results.