Python library for Representation Learning on Knowledge Graphs https://docs.ampligraph.org
Apache License 2.0
FocusE-ComplEx: Question on using numeric edge weight attributes - prediction performance (MRR) and embeddings are very similar between models with and without edge weights #268
Description
AmpliGraph ComplEx + FocusE (version 1.4.0), with support for numeric literals on edges.
I am comparing two ComplEx models: one trained with edge weights via model.fit(X, focusE_numeric_edge_values=X_edge_values), and one trained without edge weights via model.fit(X).
Actual Behavior
Comparing predictions (MRR) and embeddings between the two ComplEx models, (1) with edge weights and (2) without edge weights, I observe the following:
1. Quite similar ranking performance (e.g. MRR / MR / Hits@N) - though this is most likely due to the small dataset size in this example.
2. Very similar embeddings: the differences between (1) and (2) fall within a relative tolerance of roughly +/-0.005.
3. Most importantly, the embedding differences remain small for all of the following edge-weight settings:
3a. all edge weights set to 0 (np.zeros)
3b. all edge weights set to 1 (np.ones)
3c. edge weights drawn uniformly at random from [0, 1] (np.random.rand)
When using ComplEx + FocusE to incorporate edge weights, I would expect the performance and embeddings NOT to be approximately the same between models (1) and (2).
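The embedding comparison described above can be sketched with NumPy. The arrays here are hypothetical stand-ins for what model.get_embeddings() returns for models (1) and (2); only the tolerance check mirrors the actual comparison:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical embedding matrices for models (2) and (1); in the real
# experiment these come from model.get_embeddings(entities) after fitting.
emb_without_weights = rng.normal(size=(5, 10))
emb_with_weights = emb_without_weights + rng.normal(scale=1e-3, size=(5, 10))

# The "very similar" observation: element-wise differences fall within
# an absolute tolerance of about 0.005.
similar = np.allclose(emb_with_weights, emb_without_weights, rtol=0.0, atol=0.005)
max_diff = float(np.max(np.abs(emb_with_weights - emb_without_weights)))
print(similar, max_diff)
```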
Expected Behavior
After trying various edge weights (focusE_numeric_edge_values taken from the toy example, all ones, all zeros, or random numbers in [0, 1]), I expected the embeddings of the model trained WITH edge weights to differ from those of the model trained WITHOUT.
In summary, across all edge-weight variations in these experiments, the MRR was very similar to that of the model trained without edge weights, and so were the embeddings in terms of their relative differences.
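For reference, the ranking metrics compared above (MRR / MR / Hits@N) reduce to simple statistics over the ranks assigned to the test triples. A minimal sketch with hypothetical ranks (in the real run they come from AmpliGraph's evaluation functions):

```python
import numpy as np

def mrr(ranks):
    """Mean Reciprocal Rank: mean of 1/rank over all test triples."""
    return float(np.mean(1.0 / np.asarray(ranks, dtype=float)))

def mr(ranks):
    """Mean Rank: average rank of the true triple."""
    return float(np.mean(ranks))

def hits_at_n(ranks, n):
    """Fraction of test triples ranked in the top n."""
    return float(np.mean(np.asarray(ranks) <= n))

# Hypothetical ranks for models (1) and (2); nearly identical, as observed.
ranks_with = [1, 3, 2, 10, 4]
ranks_without = [1, 3, 2, 11, 4]

print(mrr(ranks_with), mrr(ranks_without))
print(hits_at_n(ranks_with, 3), hits_at_n(ranks_without, 3))
```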
Steps to Reproduce
Please replicate the toy example using the following code; these are the steps needed to reproduce this issue for ComplEx + FocusE edge weightings.
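A minimal sketch of the experimental setup. The toy triples here are illustrative, not the attached script; the fit calls (shown as comments so the sketch runs without the library installed) follow the AmpliGraph 1.4.0 API used in the issue:

```python
import numpy as np

# Illustrative toy knowledge graph (subject, predicate, object triples).
X = np.array([['a', 'y', 'b'],
              ['b', 'y', 'a'],
              ['a', 'y', 'c'],
              ['c', 'y', 'a'],
              ['a', 'y', 'd'],
              ['c', 'y', 'd']])

# The edge-weight variants compared against the no-weights baseline (3a-3c).
weight_variants = {
    'zeros': np.zeros(len(X)),
    'ones': np.ones(len(X)),
    'random': np.random.rand(len(X)),
}

# AmpliGraph 1.4.0 fit calls, per the FocusE-ComplEx docs:
#
#   from ampligraph.latent_features import ComplEx
#   baseline = ComplEx(batches_count=1, seed=555, epochs=20, k=10)
#   baseline.fit(X)                                      # model (2)
#
#   for name, w in weight_variants.items():
#       focuse = ComplEx(batches_count=1, seed=555, epochs=20, k=10)
#       focuse.fit(X, focusE_numeric_edge_values=w)      # model (1)
#       # then compare focuse.get_embeddings(...) against baseline's

for name, w in weight_variants.items():
    assert w.shape == (len(X),)  # one numeric value per triple
    print(name, w[:3])
```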
I'm following the toy example from the AmpliGraph documentation page: https://docs.ampligraph.org/en/1.4.0/generated/ampligraph.latent_features.ComplEx.html#focuse-complex
Attached Python code: PHXA-11415-Log-ISSUE-Ampligraph-ComplEx-FocusE__pythonCODE.txt