Open 0xSameer opened 5 years ago
Correction to the hyperparameters description above: we are using -dim 200, not "dims" ...
I wanted to quickly follow up on our separate email conversation: we identified the issue and are finishing additional experiments. I will update this issue with the results in the next few days. Thanks again for filing this!
I am still seeing this. Is there any update? It would be very important for everyone to know what's going on. Are the results presented in the NIPS'17 paper wrong or misleading?
Same situation here, wondering if there's any update on this?
Thank you for raising this again, and sorry for the delay. In addition to our follow-ups over email, we should have updated this issue on GitHub as well.
Basically: the reason for the stronger Euclidean baseline with the open-sourced code is that the paper used a different setting in which the Euclidean embeddings were regularized (similar to previous work). When open-sourcing the code we disabled this regularization by default, and it turned out to work better (as pointed out by Sameer). Since it led to a stronger Euclidean baseline in higher dimensions, we decided to keep it that way in the code. Hyperbolic embeddings provide a substantial performance improvement in lower dimensions, which is really the main focus of this work.
Sorry, I'm not sure I fully understand how the Euclidean embeddings were regularized... could you please explain a little bit more? Many thanks.
@martinthewhale I think the idea is that, since there is exponentially more "space" near the boundary of the Poincare ball, the easiest way for the algorithm to minimise the loss is to push all nodes outwards. This is a form of overfitting, since we ideally want nodes that are higher in the original hierarchy to be kept closer to the centre of the ball. I believe this is achieved by regularising the norm of v in equation 6 in the paper. This means that for every parent(v)/child(u) relationship we consider, we are always encouraging parents (nodes higher in the hierarchy) to stay closer to the origin.
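If it helps, here is a minimal PyTorch-style sketch of what I mean (purely illustrative; the function names, tensor shapes and the regularisation weight are my own assumptions, not code from this repository):

```python
import torch

def euclidean_dist(x, y):
    # plain Euclidean distance along the last dimension (broadcasts)
    return (x - y).norm(dim=-1)

def regularized_loss(u, v, negs, reg_lambda=0.1):
    """Softmax ranking loss over distances (in the spirit of Eq. 6)
    plus a hypothetical L2 penalty on the parent embeddings v.

    u          : (batch, dim) child embeddings
    v          : (batch, dim) parent embeddings
    negs       : (batch, n_negs, dim) negative samples for each u
    reg_lambda : illustrative regularisation weight, not a value from the paper
    """
    pos = euclidean_dist(u, v)                      # (batch,)
    neg = euclidean_dist(u.unsqueeze(1), negs)      # (batch, n_negs)
    # rank the true neighbour above the sampled negatives
    logits = torch.cat([-pos.unsqueeze(1), -neg], dim=1)
    rank_loss = -torch.log_softmax(logits, dim=1)[:, 0].mean()
    # keep parents (higher in the hierarchy) close to the origin
    reg = reg_lambda * v.pow(2).sum(dim=-1).mean()
    return rank_loss + reg
```

Dropping the reg term recovers the unregularised Euclidean objective, which would explain the stronger baseline in higher dimensions.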
Hi,
I was able to replicate the results for the Poincare and Lorentz manifolds as reported in your publications. However, when recreating the Euclidean baselines I am seeing much stronger reconstruction scores. For example, with the following changes to ./train-nouns.sh:
After just 200 epochs, we get:
And after 1400 epochs, we get:
No other changes were made to the code. Are we doing something wrong? Note that we had to add an entry to the train-nouns.sh script for the Euclidean manifold, and we used the same learning rate as specified for the Poincare manifold (1.0) rather than the default of 1000.0 set in the code.
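For context, by "reconstruction scores" we mean the usual mean rank / MAP protocol, roughly like this (a small numpy sketch, not the evaluation code from this repository; all names are illustrative):

```python
import numpy as np

def reconstruction_metrics(dist, adjacency):
    """Mean rank and MAP of true neighbours under the embedding distances.

    dist      : (n, n) array of pairwise embedding distances
    adjacency : dict mapping each node to the set of its true neighbours
    """
    ranks, ap_scores = [], []
    for u, neighbours in adjacency.items():
        if not neighbours:
            continue
        d = dist[u].copy()
        d[u] = np.inf                               # never rank a node against itself
        sorted_idx = np.argsort(d)                  # closest first
        is_neighbour = np.isin(sorted_idx, list(neighbours))
        positions = np.where(is_neighbour)[0] + 1   # 1-based ranks of true neighbours
        # rank of each neighbour, discounting other true neighbours ranked above it
        ranks.extend((positions - np.arange(len(positions))).tolist())
        # average precision over this node's neighbour list
        ap_scores.append(np.mean((np.arange(len(positions)) + 1) / positions))
    return float(np.mean(ranks)), float(np.mean(ap_scores))
```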
Thanks!