Closed gustavopuma closed 4 years ago
Could you please provide more details, like the hyperparameters you used and the program output?
This is my complete Python training code:

```python
import graphvite as gv
import graphvite.application as gap

app1 = gap.KnowledgeGraphApplication(dim=128, gpu_memory_limit=gv.auto, cpu_per_gpu=gv.auto)
app1.load(file_name=gv.dataset.wikidata5m.train)
app1.build(optimizer=gv.auto, num_partition=6, episode_size=gv.auto)
app1.train(model='RotatE', margin=9)
app1.save_model("rotate_wikidata5m.pkl")
```
I tried several hyperparameter options. With embedding dimension 512 I'm not even able to start training because of memory allocation errors, and even when I use several partitions it is either too slow or impossible to load the data and run the training.
I also tried the config file at: https://graphvite.io/docs/latest/_downloads/d86d91f69cdc0bdcdd521a92b6294306/rotate_wikidata5m.yaml
For 1 GPU, the best setting for acceleration is `num_partition=1` and `episode_size=auto`.
For embedding performance, I guess you can follow most hyperparameters in the config file above, except changing `dim` to 128 and `adversarial_temperature` to some mild value like 2.
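Applied to the linked YAML, the relevant fields would look roughly like this. This is only a sketch under the assumption that the section and field names mirror those in `rotate_wikidata5m.yaml`; double-check them against your copy of the config before running:

```yaml
# Sketch of the suggested overrides for a single 11 GB GPU.
# Field names assumed to follow the linked rotate_wikidata5m.yaml.
resource:
  gpus: [0]                       # single GPU
  gpu_memory_limit: auto
  cpu_per_gpu: auto
  dim: 128                        # reduced from 512 to fit in GPU memory

build:
  num_partition: 1                # best for speed on 1 GPU
  episode_size: auto

train:
  model: RotatE
  margin: 9
  adversarial_temperature: 2      # a mild value, as suggested
```

All other hyperparameters can stay as they are in the downloaded config file.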
I'm using one virtual machine with 6 CPUs / 1 GPU (11 GB) and 56 GB RAM. Concerning training on the Wikidata5m dataset: I tried to train the same model with `num_partition=auto`, `episode_size=auto`, `gpu_memory_limit=auto`, and `cpu_per_gpu=auto`, and I'm only able to start training with dimension 128. It is still training after 24 hours. Is that expected, given that the benchmark used 24 CPUs and 4 GPUs?