dice-group / dice-embeddings

Hardware-agnostic Framework for Large-scale Knowledge Graph Embeddings
MIT License

Adding noise in KG #17

Closed Demirrr closed 2 years ago

Demirrr commented 2 years ago

Context: Most real-world KGs contain noisy triples, yet benchmark datasets for link prediction are noiseless (as far as I know, no work has suggested the existence of noisy triples in WN18RR, FB15K-237, or YAGO3-10).

Feature Functionality We have implemented a feature to add noisy triples into the training dataset. Through the `add_noise_rate` parameter, `num_noisy_triples = int(len(self.train_set) * add_noise_rate)` triples are constructed by randomly sampling a head entity, a relation, and a tail entity.
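
For illustration, here is a minimal sketch of what this sampling could look like. Only `add_noise_rate` and the `num_noisy_triples` expression come from the description above; the function and variable names are hypothetical, not the library's actual API.

```python
import random

def add_noise(train_set, entities, relations, add_noise_rate):
    """Append randomly sampled (head, relation, tail) triples to the training set.

    A sketch of the behavior described in this issue; not the exact
    implementation in dice-embeddings.
    """
    num_noisy_triples = int(len(train_set) * add_noise_rate)
    noisy_triples = [
        (random.choice(entities),   # random head entity
         random.choice(relations),  # random relation
         random.choice(entities))   # random tail entity
        for _ in range(num_noisy_triples)
    ]
    return train_set + noisy_triples
```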

Purpose By adding noise to the input KG,

  1. we can validate whether any regularization effect [bishop1995training] can be observed,
  2. we can validate how much noise can serve as a regularizer (if any),
  3. we can validate whether learning compressed KGE suffers from too much noise as much as uncompressed KGE does.

```bibtex
@article{bishop1995training,
  title={Training with noise is equivalent to Tikhonov regularization},
  author={Bishop, Chris M},
  journal={Neural computation},
  volume={7},
  number={1},
  pages={108--116},
  year={1995},
  publisher={MIT Press}
}
```

Demirrr commented 2 years ago

Done!