google-research / smore

Apache License 2.0

dist_l2 forward error when running train_shallow_wikikgv2.sh for WikiKG90M-v2 #7

Open chunlinli opened 2 years ago

chunlinli commented 2 years ago

Hi,

Thanks for your exciting contributions to the open-source KG framework!

I followed the steps in README.md and README_wikikgv2.md to install the package and download the LSC WikiKG90M-v2 data. When I try out the baseline models, I get the following error:

[Screenshot of the error output: Screen Shot 2022-07-02 at 5 17 34 PM]

I tried to identify the source of the error, and it seems to come from the dist_forward() function defined in extlib_cuda.cpp. I am using 4 TITAN RTX GPUs (24 GB each) with driver version 440.100 and CUDA version 10.2.

How can I fix this error? Thanks!

hyren commented 1 year ago

Hi, can you try the native_cal_logit function defined here and see whether it works? https://github.com/google-research/smore/blob/main/smore/models/vec.py#L78
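
For anyone who wants to sanity-check numbers while debugging, here is a minimal sketch of an L2-distance logit computed without any custom CUDA extension. It is not the repository's native_cal_logit and may not match it. The function name l2_logit_reference, the gamma margin, the "gamma minus L2 distance" scoring form, and the tensor shapes are all assumptions made for illustration, based only on the dist_l2 name in the issue title.

```python
import numpy as np


def l2_logit_reference(entity_emb, query_emb, gamma=24.0):
    """Hypothetical CPU reference (not smore's API): gamma minus L2 distance.

    entity_emb: array of shape (num_entities, dim)
    query_emb:  array of shape (batch, dim)
    returns:    array of shape (batch, num_entities)
    """
    entity_emb = np.asarray(entity_emb, dtype=np.float64)
    query_emb = np.asarray(query_emb, dtype=np.float64)
    # Broadcast to (batch, num_entities, dim), then reduce over the last axis.
    diff = query_emb[:, None, :] - entity_emb[None, :, :]
    dist = np.linalg.norm(diff, ord=2, axis=-1)
    # Assumed scoring form: a larger logit means the entity is closer to the query.
    return gamma - dist


# Tiny usage example with made-up embeddings.
entities = np.array([[0.0, 0.0], [3.0, 4.0]])
queries = np.array([[0.0, 0.0]])
logits = l2_logit_reference(entities, queries, gamma=10.0)
# The distances are 0 and 5, so the logits are 10 and 5.
print(logits)
```

Comparing a small batch against a plain reference like this can help tell whether a failure is in the compiled extension or in the inputs passed to it.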