DeepGraphLearning / graphvite

GraphVite: A General and High-performance Graph Embedding System
https://graphvite.io
Apache License 2.0

running out of memory during app.build step #42

Closed ggerogiokas closed 4 years ago

ggerogiokas commented 4 years ago

Hi,

Just wondering if you had any tips for lowering the memory usage during the app.build step. I have a fairly large graph: ~52 million edges.

The machine I was using has ~120GB of RAM.

Thanks, George

KiddoZhu commented 4 years ago

The memory usage mostly depends on the number of nodes in the graph. How many nodes do you have?

I assume you have no more than 10 million nodes. It should be totally fine for DeepWalk and LINE with dimensions like 128 or 256. Node2vec may not work at this scale since it requires more than linear memory usage.
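For a quick sanity check, the embedding-table footprint can be estimated from the node count and dimension. This is a rough sketch, not GraphVite's exact accounting; the two-matrix assumption (a vertex plus a context embedding matrix, as in DeepWalk/LINE-style models) is mine:

```python
def embedding_memory_gb(num_nodes, dim, bytes_per_float=4, num_matrices=2):
    """Rough lower bound on host memory for the embedding tables.

    num_matrices=2 assumes a vertex and a context embedding matrix;
    this is an estimate, not GraphVite's internal accounting.
    """
    return num_nodes * dim * bytes_per_float * num_matrices / 1024**3

# ~10 million nodes at dim=128 in float32:
print(round(embedding_memory_gb(10_000_000, 128), 2))  # ~9.54 GB
```

At dim=128 even 10 million nodes only need about 10 GB for the embeddings themselves, which is why the remaining headroom goes to the sample pools.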

ggerogiokas commented 4 years ago

Thanks a lot for the feedback. I ended up with the settings below, which seem to use roughly 75% of my GPU. Do you have a rough heuristic for the relationship between batch size, episode size, node count, number of partitions, and memory usage in the library? (Mine is 4.8 million nodes but 50 million edges.) It finishes fine, so great work on the library, btw!

```python
app = gap.KnowledgeGraphApplication(dim=128)
app.load(file_name='cleaned.tsv')
# num_partition='auto' with episode_size=100 on 128 GB RAM takes up roughly 80%
app.build(optimizer=5e-3, num_negative=5, batch_size=100000, episode_size=1800, num_partition=4)
app.train(margin=9, model='DistMult', num_epoch=40, log_frequency=5)
```

KiddoZhu commented 4 years ago

You can find hyperparameter suggestions in the template file.

Memory usage in GraphVite mainly depends on the embeddings, O(|V| dim), and the sample pools, O(episode size^2 batch size). Larger episode sizes result in less frequent synchronization. I guess what happened in your case is that the episode size is bounded by your memory size, and ends up smaller than the optimal one.
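To put a rough number on the sample-pool term, here is a minimal sketch that treats the pool as approximately episode_size × batch_size entries of a fixed size. Both the linear scaling used here and the bytes-per-sample constant are simplifying assumptions, not GraphVite's internal layout:

```python
def pool_memory_gb(pool_samples, bytes_per_sample=16):
    """Rough size of a sample pool in host RAM.

    bytes_per_sample is an assumed constant (a few integers per
    edge sample), not GraphVite's exact internal layout.
    """
    return pool_samples * bytes_per_sample / 1024**3

# With the settings from earlier in the thread
# (episode_size=1800, batch_size=100000):
print(round(pool_memory_gb(1800 * 100_000), 2))  # ~2.68 GB
```

Under these assumptions the pool is small next to 120 GB of RAM, so the episode size could likely be raised further if synchronization frequency turns out to be the bottleneck.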

I think 75% is fine. If you want maximal acceleration, an experimental option is to set positive_reuse to 2 or more. This reuses the sample pool, but in quite a few cases it causes a performance drop.
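If you want to try it, the option slots into the training call. This is a hypothetical fragment reusing the settings from earlier in the thread; it assumes positive_reuse is accepted by train(), as in the bundled config templates, so check it against your GraphVite version:

```python
# Hypothetical: reuse each generated sample pool twice before
# regenerating it. Experimental; may cost embedding quality.
app.train(margin=9, model='DistMult', num_epoch=40,
          log_frequency=5, positive_reuse=2)
```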

ggerogiokas commented 4 years ago

Thanks for the advice! Great work on the package! I will keep your tips in mind for the next run :)