Open qxde01 opened 5 years ago
There are definitely a couple of weird things here: the number of nodes is larger than the number of edges, and the graph seems almost disconnected. Are you sure the data preprocessing and data format are correct?
There are thousands of subgraphs with no connections between them. The data was preprocessed with https://github.com/xgfs/verse/tree/master/python, and verse trains correctly on it. My machine has 24 CPUs and 256 GB of memory. Thanks.
If most of the nodes lie in small disconnected subgraphs, node2vec will be much faster (with unknown quality). From the output I would assume that the process runs correctly and finishes; most likely the progress printing is simply not called (it is only called from thread id=0).
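To check how fragmented the graph really is, something like the following could help. It is only a sketch, not part of verse or node2vec: `component_sizes` is a hypothetical helper, and it assumes the edges are available in memory as pairs of integer node ids in the range 0..num_nodes-1 (for example, read from the original text edge list before conversion to .bscr). It uses a plain union-find to count how many connected components there are of each size.

```python
from collections import Counter


def component_sizes(num_nodes, edges):
    """Return a Counter mapping component size -> number of components.

    Hypothetical helper: `edges` is an iterable of (u, v) pairs with
    0 <= u, v < num_nodes; edge direction is ignored.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v  # merge the two components

    # Count the nodes under each root, then count components per size.
    nodes_per_root = Counter(find(x) for x in range(num_nodes))
    return Counter(nodes_per_root.values())


# Tiny example: a path 0-1-2, an edge 3-4, and an isolated node 5.
sizes = component_sizes(6, [(0, 1), (1, 2), (3, 4)])
for size, count in sorted(sizes.items()):
    print(f"{count} component(s) of size {size}")
```

If nearly all components turn out to have only a handful of nodes, that would be consistent with the very short running time. As a rough sanity check, a graph with n nodes and m edges has at least n - m connected components, so having more nodes than edges already implies many separate subgraphs.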
I found a difference:
With the parameter -nwalks 80, training is interrupted:
./node2vec -input rela.bscr -output embedding.bin -dim 256 -nwalks 80
With the default nwalks, training completes correctly:
./node2vec -input rela.bscr -output embedding.bin -dim 256
Interesting. Could you run the default a couple of times and see whether it ever crashes?
If nwalks < 40, the training is not interrupted.
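For repeated runs, a small wrapper along these lines might save some typing. It is a sketch under assumptions: `run_repeatedly` is a hypothetical helper, and it relies only on the exit code to tell a clean finish from a crash, which may not capture every way the training can be interrupted.

```python
import subprocess
import sys


def run_repeatedly(cmd, times=5):
    """Run `cmd` (a list of arguments) `times` times and return the exit codes.

    Hypothetical helper. The command's own output is discarded; only the
    exit code of each run is reported on stderr and collected.
    """
    codes = []
    for i in range(times):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        codes.append(result.returncode)
        print(f"run {i + 1}/{times}: exit code {result.returncode}", file=sys.stderr)
    return codes
```

It could be called with the command from this thread, for example `run_repeatedly(["./node2vec", "-input", "rela.bscr", "-output", "embedding.bin", "-dim", "256"])`. On POSIX systems a negative exit code means the process was killed by a signal (for instance -11 for a segmentation fault), which would point to a crash rather than a normal early exit.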
If you post the graph file, I will try to look at the issue some time in the future. If you find a bug yourself, please notify/submit a PR.
nv: 1032440, ne: 987454
Need 0.257797 Mb for storing second-order degrees
Generating a corpus for negative samples..
Using vectorized operations
lr 0.020144, Progress 19.42%
Calculations took 12.13 s to run