Accenture / AmpliGraph

Python library for Representation Learning on Knowledge Graphs https://docs.ampligraph.org
Apache License 2.0

FocusE-ComplEx: Question on using numeric edge weight attributes - prediction performance (MRR) and embeddings are very similar between models with and without edge weights #268

Open lanhaz opened 1 year ago

lanhaz commented 1 year ago

Description

Ampligraph-Complex-FocusE (Version 1.4.0) - Support for Numeric Literals on Edges

I am comparing two ComplEx models: one trained with edge weights using model.fit(X, focusE_numeric_edge_values=X_edge_values), and one trained without edge weights using model.fit(X).

I'm following the toy example provided on the AmpliGraph documentation page: https://docs.ampligraph.org/en/1.4.0/generated/ampligraph.latent_features.ComplEx.html#focuse-complex

Actual Behavior

Comparing predictions (MRR) and embeddings between the following two ComplEx models: (1) with edge weights and (2) without edge weights.

Upon comparing models (1) & (2), the following behavior is observed:

  1. Quite similar performance (e.g. MRR / MR / Hits@N), though this is most likely due to the dataset size for this example.
  2. Very similar embeddings: the differences between the embeddings of (1) and (2) are within a relative tolerance of about ±0.005.
  3. Most importantly, the embedding differences stay small in each of the following edge-weight cases:
     3a. all edge weights = 0 (using np.zeros)
     3b. all edge weights = 1 (using np.ones)
     3c. all edge weights set to random values in [0,1] (using np.random.rand)
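For reference, the three edge-weight cases in 3a-3c can be built along the following lines. This is only a sketch: the helper name make_edge_weight_variants, the seed, and the 1-D shape (one weight per triple) are assumptions for illustration, not taken from the attached code.

```python
import numpy as np


def make_edge_weight_variants(n_triples, seed=0):
    """Build the three edge-weight arrays described in cases 3a-3c.

    Hypothetical helper: returns one weight per triple as a 1-D array.
    The shape expected by fit() is an assumption here.
    """
    rng = np.random.RandomState(seed)  # fixed seed so runs are repeatable
    return {
        "zeros": np.zeros(n_triples),    # 3a: every edge weight is 0
        "ones": np.ones(n_triples),      # 3b: every edge weight is 1
        "random": rng.rand(n_triples),   # 3c: uniform random weights in [0, 1)
    }


variants = make_edge_weight_variants(n_triples=8)
for name, weights in variants.items():
    print(name, weights)
```

Each array would then be passed as focusE_numeric_edge_values in a separate model.fit call, with all other hyperparameters and the seed held fixed.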

When using ComplEx+FocusE to add edge weights, I assume the performance and embeddings should NOT be approximately the same when comparing models (1) and (2).

Expected Behavior

After trying various edge weights (e.g. focusE_numeric_edge_values set to the values from the toy example, all ones, all zeros, or random numbers in [0,1]), I was expecting the embeddings of the models trained WITH edge weights to differ from those of the model trained WITHOUT.

In summary, across all edge-weight variations in these experiments, both the MRR and the embeddings were very similar to those of the model trained without edge weights.
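To make "very similar embeddings" concrete, one way to quantify the difference between two embedding matrices is sketched below. The function name compare_embeddings and the synthetic stand-in arrays are hypothetical; in the real experiment the two arrays would be the entity embeddings extracted from models (1) and (2), and an element-wise comparison like this is only meaningful when both models were trained with the same seed and hyperparameters.

```python
import numpy as np


def compare_embeddings(emb_a, emb_b, rtol=0.005):
    """Summarise element-wise differences between two embedding matrices.

    Hypothetical helper: rtol mirrors the ~0.005 tolerance mentioned above.
    """
    emb_a = np.asarray(emb_a, dtype=float)
    emb_b = np.asarray(emb_b, dtype=float)
    if emb_a.shape != emb_b.shape:
        raise ValueError("embedding matrices must have the same shape")
    abs_diff = np.abs(emb_a - emb_b)
    return {
        "max_abs_diff": float(abs_diff.max()),
        "mean_abs_diff": float(abs_diff.mean()),
        # atol=0 so that only the relative tolerance is tested
        "all_close": bool(np.allclose(emb_a, emb_b, rtol=rtol, atol=0.0)),
    }


# Synthetic stand-ins for the two models' embeddings (not real results).
rng = np.random.RandomState(0)
emb_without_weights = rng.normal(size=(4, 6))
emb_with_weights = emb_without_weights * 1.001  # simulate a tiny shift

print(compare_embeddings(emb_without_weights, emb_with_weights))
```

Reporting the maximum and mean absolute difference alongside the tolerance check makes it easier to tell whether the edge weights had any effect at all, or only an effect too small to change the ranking metrics.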

Steps to Reproduce

Please replicate the toy example using the attached code, which contains the steps needed to reproduce this issue for ComplEx+FocusE edge weights.


Python code file: PHXA-11415-Log-ISSUE-Ampligraph-ComplEx-FocusE__pythonCODE.txt