DeepGraphLearning / graphvite

GraphVite: A General and High-performance Graph Embedding System
https://graphvite.io
Apache License 2.0

Python interface crashes Google Colab #59

Closed IvanSedykh closed 4 years ago

IvanSedykh commented 4 years ago

Hi, thank you for your work! I am trying to use your library for a link prediction problem, and I have a problem running the Colab tutorial from here, in the section where the Python interface is used.

After running this cell:

import graphvite as gv
import graphvite.application as gap
from graphvite.graph import Graph

app = gap.GraphApplication(dim=128)
app.load(file_name='edges1.csv')
app.build()
app.train()

I get the following output:

loading graph from edges1.csv
[time] GraphApplication.load: 5.39281 s
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Graph<uint32>
------------------ Graph -------------------
#vertex: 1514713, #edge: 4685644
as undirected: yes, normalization: no
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
tcmalloc: large alloc 2120007680 bytes == 0xf4cec000 @  0x7f1434d75887 0x7f140c4f1dc2 0x7f140c5fc600 0x7f140c54192d 0x7f140c537439 0x5669ac 0x5949a1 0x59fc4e 0x50d356 0x507d64 0x588d41 0x59fc4e 0x50d356 0x507d64 0x509a90 0x50a48d 0x50bfb4 0x507d64 0x516345 0x50a2bf 0x50bfb4 0x507d64 0x509a90 0x50a48d 0x50bfb4 0x507d64 0x509a90 0x50a48d 0x50cd96 0x507d64 0x509042
tcmalloc: large alloc 2120007680 bytes == 0x173ae8000 @  0x7f1434d75887 0x7f140c4f1dc2 0x7f140c5fc600 0x7f140c54192d 0x7f140c537439 0x5669ac 0x5949a1 0x59fc4e 0x50d356 0x507d64 0x588d41 0x59fc4e 0x50d356 0x507d64 0x509a90 0x50a48d 0x50bfb4 0x507d64 0x516345 0x50a2bf 0x50bfb4 0x507d64 0x509a90 0x50a48d 0x50bfb4 0x507d64 0x509a90 0x50a48d 0x50cd96 0x507d64 0x509042
[time] GraphApplication.build: 3.56432 s
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
GraphSolver<128, float32, uint32>
----------------- Resource -----------------
#worker: 1, #sampler: 1, #partition: 1
tied weights: no, episode size: 2650
gpu memory limit: 11.1 GiB
gpu memory cost: 1.52 GiB
----------------- Sampling -----------------
augmentation step: 6, shuffle base: 6
random walk length: 40
random walk batch size: 100
#negative: 1, negative sample exponent: 0.75
----------------- Training -----------------
model: LINE
optimizer: SGD
learning rate: 0.025, lr schedule: linear
weight decay: 0.005
#epoch: 2000, batch size: 100000
resume: no
positive reuse: 1, negative weight: 5
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Afterwards, Colab restarts the session. What have I done wrong? How can I fix it?

Thank you!

IvanSedykh commented 4 years ago

P.S. Colab logs related to the crash:

Timestamp Level Message
May 13, 2020, 9:04:02 PM WARNING WARNING:root:kernel 3b6c05df-db23-4250-a8b5-f452627594f2 restarted
May 13, 2020, 9:04:02 PM INFO KernelRestarter: restarting kernel (1/5), keep random ports

KiddoZhu commented 4 years ago

This happens because the augmentation step of 6 (probably inferred automatically by the library) does not evenly divide the batch size 100. We'll fix this bug in the next update.

As a workaround, you can use a batch size of 120000 or an augmentation step of 5.
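
The divisibility condition described above can be checked by hand before building the solver. The following is only a minimal sketch in plain Python: the helper names `divides_evenly` and `next_compatible_batch_size` are hypothetical and are not part of GraphVite. Both batch sizes printed in the log (100 and 100000) leave a remainder of 4 when divided by 6, while the two suggested settings leave none.

```python
def divides_evenly(batch_size: int, augmentation_step: int) -> bool:
    """Return True if augmentation_step divides batch_size with no remainder."""
    return batch_size % augmentation_step == 0


def next_compatible_batch_size(batch_size: int, augmentation_step: int) -> int:
    """Round batch_size up to the nearest multiple of augmentation_step.

    Hypothetical helper for illustration only; not a GraphVite function.
    """
    remainder = batch_size % augmentation_step
    if remainder == 0:
        return batch_size
    return batch_size + augmentation_step - remainder


# Values taken from the log output in this thread.
print(divides_evenly(100000, 6))   # batch size vs. augmentation step from the log
print(divides_evenly(100, 6))      # random walk batch size vs. augmentation step
# The two workarounds suggested above.
print(divides_evenly(120000, 6))   # larger batch size, same augmentation step
print(divides_evenly(100000, 5))   # same batch size, augmentation step of 5
```

How these values are passed to the library is not shown in this thread. If the keyword arguments follow the labels in the log (for example a `batch_size` argument to `app.build()` and an `augmentation_step` argument to `app.train()`), the workaround would be set there, but that is an assumption to verify against the GraphVite documentation.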

KiddoZhu commented 4 years ago

Please follow this discussion in #58