Usability on Colab - Githubissues

Sujit-O / pykg2vec

Python library for knowledge graph embedding and representation learning.

MIT License


Usability on Colab #200

Open cytwill opened 3 years ago

cytwill commented 3 years ago

To whom it may concern,

Has anyone tried this package on Google Colab? I am currently trying to test it on Colab, but when I proceeded to the "Validate the Installation" step, it reported that there is no command named "pykg2vec-train".

Hope someone can help! Thanks!

Rodrigo-A-Pereira commented 3 years ago

Hi, @cytwill

Yes, I'm currently using pykg2vec on Google Colab. I first mount my Google Drive in Colab using:

from google.colab import drive
drive.mount('/content/drive')

Then I clone pykg2vec into my drive:

%cd <path_to_directory_of_choice>
!git clone https://github.com/Sujit-O/pykg2vec.git

And finally I install it:

%cd /content/drive/My\ Drive/.../pykg2vec
!pip install -e .

(Note: Colab may ask you to restart the runtime in order to load some packages.)

From that point on you can use the library inside the Colab notebook.
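A quick way to confirm the install took effect after restarting the runtime is sketched below. It only uses the Python standard library. The helper name `check_install` is made up for illustration, and the command name "pykg2vec-train" is the one quoted in the first post, so the real command may be named differently.

```python
import importlib.util
import shutil


def check_install(package="pykg2vec", command="pykg2vec-train"):
    """Report whether a package is importable and a console command is on PATH.

    Both default names are taken from this thread and are assumptions;
    pass other names if your install differs.
    """
    return {
        # find_spec returns None when a top-level package cannot be found
        "package_importable": importlib.util.find_spec(package) is not None,
        # shutil.which returns None when no executable with that name is on PATH
        "command_on_path": shutil.which(command) is not None,
    }


print(check_install())
```

If the package is importable but the command is not on PATH, the library itself can still be used from notebook cells.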

Hope this helps!

Best regards, Rodrigo Pereira

cytwill commented 3 years ago

Hi, Thanks

I got it working on Colab. However, when I tried to change the negative sampling rate from 1 to 20, training failed with the error "The size of tensor a (128) must match the size of tensor b (2560) at non-singleton dimension 0". It looks like there is a bug in the code. The full error message is below:

  File "train.py", line 35, in <module>
    main()
  File "train.py", line 28, in main
    trainer.train_model()
  File "/usr/local/lib/python3.6/dist-packages/pykg2vec/utils/trainer.py", line 227, in train_model
    self.train_model_epoch(cur_epoch_idx)
  File "/usr/local/lib/python3.6/dist-packages/pykg2vec/utils/trainer.py", line 326, in train_model_epoch
    loss = self.train_step_pairwise(pos_h, pos_r, pos_t, neg_h, neg_r, neg_t)
  File "/usr/local/lib/python3.6/dist-packages/pykg2vec/utils/trainer.py", line 164, in train_step_pairwise
    loss = pos_preds + self.config.margin - neg_preds
RuntimeError: The size of tensor a (128) must match the size of tensor b (2560) at non-singleton dimension 0

Hope you can help me fix this, thank you so much!

Rodrigo-A-Pereira commented 3 years ago

Hi again,

Yes, that error does occur. However, it is not exactly a bug, at least for TransE (I'm not sure which model you are using, but I'm assuming TransE since it is the most "basic" pairwise model). In the original TransE paper, the authors generate a batch with the same number of negative and positive triples. So if you are using TransE, the negative rate always has to be 1. In pairwise models that use adversarial negative sampling, such as RotatE, you can in fact set a higher negative rate.
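To make the numbers in the error concrete: with a batch of 128 positive triples and a negative rate of 20, there are 128 * 20 = 2560 negative scores, so an element-wise `pos_preds + margin - neg_preds` cannot line up. The pure-Python sketch below is not pykg2vec's code. It shows one way a margin loss could pair each positive with several negatives, assuming (hypothetically) that the negatives for positive i are stored contiguously and that scores are distances where lower is better.

```python
def pairwise_margin_loss(pos_scores, neg_scores, margin=1.0):
    """Hinge-style margin loss that accepts k negatives per positive.

    Assumption (not taken from pykg2vec): the k negatives belonging to
    positive i occupy neg_scores[i * k:(i + 1) * k].
    """
    if not pos_scores or len(neg_scores) % len(pos_scores) != 0:
        raise ValueError("negatives must be a non-zero multiple of positives")
    k = len(neg_scores) // len(pos_scores)  # negatives per positive
    total = 0.0
    for i, pos in enumerate(pos_scores):
        for neg in neg_scores[i * k:(i + 1) * k]:
            # same form as the failing line, clipped at zero
            total += max(0.0, pos + margin - neg)
    return total / len(neg_scores)


# 128 positives and 2560 negatives, the sizes from the error message
print(pairwise_margin_loss([1.0] * 128, [3.0] * 2560))
```

With a negative rate of 1 the inner loop visits exactly one negative per positive, which matches the one-to-one pairing described for TransE.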

The pseudocode is in the original TransE paper: https://papers.nips.cc/paper/2013/file/1cecc7a77928ca8133fa24680a88d2f9-Paper.pdf

However, I think you are right that the error should be prevented. The simplest way to do this is probably to disallow changing the negative rate in pairwise models that do not use such sampling (such as TransE). Maybe you should open an issue so someone with more experience in the library can propose a fix.
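A guard along those lines could look roughly like the sketch below. This is only an illustration: `MULTI_NEGATIVE_MODELS` and `check_neg_rate` are hypothetical names, not part of pykg2vec, and the set contains only RotatE because that is the one model named in this thread.

```python
# Hypothetical registry of pairwise models that accept several negatives
# per positive. Only RotatE is mentioned in this thread; extend as needed.
MULTI_NEGATIVE_MODELS = {"rotate"}


def check_neg_rate(model_name, neg_rate):
    """Return neg_rate if it is valid for the model, else raise ValueError."""
    if neg_rate < 1:
        raise ValueError("neg_rate must be at least 1")
    if neg_rate > 1 and model_name.lower() not in MULTI_NEGATIVE_MODELS:
        raise ValueError(
            f"{model_name} pairs each positive with one negative; "
            f"got neg_rate={neg_rate}"
        )
    return neg_rate


print(check_neg_rate("RotatE", 20))
```

Failing early with a readable message like this would replace the tensor size RuntimeError that otherwise surfaces deep inside the training step.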

Hope this was helpful. :) Best regards,

Rodrigo Pereira

cytwill commented 3 years ago

Ok, thanks for your help!

baxtree commented 3 years ago

Hi, there was a command name mismatch between the README and the implementation. It has been fixed and merged. Thanks for reporting this.

louisccc commented 3 years ago

I feel it is a good choice to show off our use cases in Colab; it can be more straightforward for users.