Open 0xSameer opened 5 years ago
Correction to the hyperparameters description above: we are using -dim 200, not "dims" ...
I wanted to quickly follow up on our separate email conversation: we identified the issue and are finishing additional experiments. I will update this issue with the results in the next few days. Thanks again for filing this!
I am still seeing this. Is there any update? It would be very important for everyone to know what's going on. Are the results presented in the NIPS'17 paper wrong or misleading?
Same situation here, wondering if there's any update on this?
Thank you for raising this again, and sorry for the delay. In addition to our follow-ups over email, we should have updated this issue on GitHub as well.
Basically: the reason for the stronger Euclidean baseline with the open-sourced code is that the paper used a different setting in which the Euclidean embeddings were regularized (similar to previous work). When open-sourcing the code we disabled this regularization by default, and it turned out to work better (as pointed out by Sameer). Since it led to a stronger Euclidean baseline in higher dimensions, we decided to keep it that way in the code. Hyperbolic embeddings provide a substantial performance improvement in lower dimensions, which is really the main focus of this work.
Sorry, I'm not sure I fully understand how the Euclidean embeddings were regularized... could you please explain a little bit more? Many thanks.
@martinthewhale I think the idea is that, since there is exponentially more "space" near the boundary of the Poincare ball, the easiest way for the algorithm to minimise the loss is to push all nodes outwards. This is a form of overfitting, since we ideally want nodes that are higher in the original hierarchy to be kept closer to the centre of the ball. I believe this is achieved by regularising the norm of v in equation 6 in the paper. This means that for every parent(v)/child(u) relationship we consider, we are always encouraging parents (nodes higher in the hierarchy) to stay closer to the origin.
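If it helps, here is a minimal PyTorch-style sketch of what I mean (purely illustrative; the function names, tensor shapes and the regularisation weight are my own assumptions, not code from this repository):

```python
import torch

def euclidean_dist(x, y):
    # plain Euclidean distance along the last dimension (broadcasts)
    return (x - y).norm(dim=-1)

def regularized_loss(u, v, negs, reg_lambda=0.1):
    """Softmax ranking loss over distances (in the spirit of Eq. 6)
    plus a hypothetical L2 penalty on the parent embeddings v.

    u          : (batch, dim) child embeddings
    v          : (batch, dim) parent embeddings
    negs       : (batch, n_negs, dim) negative samples for each u
    reg_lambda : illustrative regularisation weight, not a value from the paper
    """
    pos = euclidean_dist(u, v)                      # (batch,)
    neg = euclidean_dist(u.unsqueeze(1), negs)      # (batch, n_negs)
    # rank the true neighbour above the sampled negatives
    logits = torch.cat([-pos.unsqueeze(1), -neg], dim=1)
    rank_loss = -torch.log_softmax(logits, dim=1)[:, 0].mean()
    # keep parents (higher in the hierarchy) close to the origin
    reg = reg_lambda * v.pow(2).sum(dim=-1).mean()
    return rank_loss + reg
```

Dropping the reg term recovers the unregularised Euclidean objective, which would explain the stronger baseline in higher dimensions.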
Hi,
I was able to replicate the results for the Poincare and Lorentz manifolds as reported in your publications. However, when recreating the Euclidean baselines I am seeing much stronger reconstruction scores. For example, with the following changes to ./train-nouns.sh:
After just 200 epochs, we get:
And after 1400 epochs, we get:
No other changes were made to the code. Are we doing something wrong? Note that we had to add an entry to the train-nouns.sh script for the Euclidean manifold, and we used the same learning rate as specified for the Poincare manifold (1.0) rather than the default of 1000.0 set in the code.
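For context, by "reconstruction scores" we mean the usual mean rank / MAP protocol, roughly like this (a small numpy sketch, not the evaluation code from this repository; all names are illustrative):

```python
import numpy as np

def reconstruction_metrics(dist, adjacency):
    """Mean rank and MAP of true neighbours under the embedding distances.

    dist      : (n, n) array of pairwise embedding distances
    adjacency : dict mapping each node to the set of its true neighbours
    """
    ranks, ap_scores = [], []
    for u, neighbours in adjacency.items():
        if not neighbours:
            continue
        d = dist[u].copy()
        d[u] = np.inf                               # never rank a node against itself
        sorted_idx = np.argsort(d)                  # closest first
        is_neighbour = np.isin(sorted_idx, list(neighbours))
        positions = np.where(is_neighbour)[0] + 1   # 1-based ranks of true neighbours
        # rank of each neighbour, discounting other true neighbours ranked above it
        ranks.extend((positions - np.arange(len(positions))).tolist())
        # average precision over this node's neighbour list
        ap_scores.append(np.mean((np.arange(len(positions)) + 1) / positions))
    return float(np.mean(ranks)), float(np.mean(ap_scores))
```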
Thanks!