DeepGraphLearning / graphvite

GraphVite: A General and High-performance Graph Embedding System
https://graphvite.io
Apache License 2.0

Check failed: pool_size % solver->shuffle_base == 0 #58

Closed · IvanSedykh closed this issue 4 years ago

IvanSedykh commented 4 years ago

Hi, nice work, great paper! I am trying to use your library for a link prediction problem, and I ran into a problem running the Colab tutorial from here, in the section where the configuration file is changed. I get the following error message:

 Check failed: pool_size % solver->shuffle_base == 0 Can't perform pseudo shuffle on 265000000 elements by a shuffle base of 6. Try setting the episode size to a multiple of the shuffle base
*** Check failure stack trace: ***
    @     0x7fd5f521d4dd  google::LogMessage::Fail()
    @     0x7fd5f5225071  google::LogMessage::SendToLog()
    @     0x7fd5f521cecd  google::LogMessage::Flush()
    @     0x7fd5f521e76a  google::LogMessageFatal::~LogMessageFatal()
    @     0x7fd5f949e4ef  graphvite::GraphSampler<>::sample_random_walk()
    @     0x7fd5f931d78b  std::thread::_State_impl<>::_M_run()
    @     0x7fd609ceb19d  execute_native_thread_routine
    @     0x7fd6097146db  start_thread
    @     0x7fd60943d88f  clone 

Probably I should change the shuffle base. Could you please explain why this error occurred, and how to fix it? Thank you! Below is my config file:

%%writefile my_config.yaml
application:
  graph

resource:
  # List of GPU ids. Default is all GPUs
  gpus: []
  # Memory limit for each GPU in bytes. Default is all available memory.
  gpu_memory_limit: auto
  # Number of CPU threads per GPU. Default is all CPUs.
  cpu_per_gpu: auto
  # Dimension of the embeddings.
  dim: 128

format:
  # String of delimiter characters. Change it if your node names contain blank characters.
  delimiters: " \t\r\n"
  # Prefix of comment strings. Change it if you use a comment style other than Python's.
  comment: "#"

graph:
  # Path to edge list file. Each line should be one of the following
  # [node 1] [delimiter] [node 2] [comment]...
  # [node 1] [delimiter] [node 2] [delimiter] [weight] [comment]...
  # [comment]...
  # For standard datasets, you can specify them by <[dataset].[split]>.
  file_name: "edges1.csv"
  # Symmetrize the graph or not. True is recommended.
  as_undirected: true
  # Normalize the adjacency matrix or not. This may influence the performance a little.
  normalization: false

build:
  optimizer:
    # Optimizer.
    type: SGD
    # Learning rate. Default is usually reasonable.
    lr: 0.025
    # Weight decay.
    weight_decay: 0.005
    # Learning rate schedule, can be "linear" or "constant". Linear is recommended.
    schedule: linear
  # Number of partitions. Auto is recommended.
  num_partition: auto
  # Number of negative samples per positive sample.
  # Larger value results in slower training.
  # The performance may be influenced by num_negative * negative_weight.
  num_negative: 1
  # Batch size of samples in CPU-GPU transfer. Default is recommended.
  batch_size: 100000
  # Number of batches in a partition block.
  # Default is recommended.
  episode_size: auto

# # Comment out this section if not needed.
# load:
#   # Path to model file, can be "*.pkl".
#   file_name: graph.pkl

train:
  # Model, can be DeepWalk, LINE or node2vec.
  model: DeepWalk
  # Number of epochs. Default is usually reasonable for sparse graphs.
  # For dense graphs (|E| / |V| > 100), you may use smaller values.
  num_epoch: 2000
  # Resume training from a loaded model.
  resume: false
  # Weight of negative samples. Values larger than 10 may cause unstable training.
  negative_weight: 5
  # Exponent of degrees in negative sampling. Default is recommended.
  negative_sample_exponent: 0.75
  # Augmentation step. Default is usually reasonable.
  # Larger value is needed for sparser graphs.
  augmentation_step: auto
  # Return parameter and in-out parameters (node2vec). Need to be tuned on the validation set.
  p: 1
  q: 1
  # Length of each random walk. Default is recommended.
  random_walk_length: 40
  # Batch size of random walks in samplers. Default is recommended.
  random_walk_batch_size: 100
  # Log every n batches.
  log_frequency: 1000

# Comment out this section if not needed.
# evaluate:
#   # Comment out any task if not needed.
#   - task: node classification
#     # Path to node label file. Each line should be one of the following
#     # [node] [delimiter] [label] [comment]...
#     # [comment]...
#     file_name:
#     # Portions of data used for training. Each of them corresponds to one evaluation.
#     portions: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
#     # Number of trials repeated. Change it to 1 if your evaluation set is large enough.
#     times: 5

#   - task: link prediction
#     # Path to link prediction file. Each line should be
#     # [node 1] [delimiter] [node 2] [delimiter] [label]
#     # where label is 1 for positive and 0 for negative.
#     file_name:
#     # Path to filter file. If you aren't sure that training data is excluded in evaluation,
#     # you can specify the training edge list here.
#     filter_file:

# Comment out this section if not needed.
save:
  # Path to save file, can be "*.pkl".
  file_name: graph.pkl
  # Save hyperparameters or not.
  save_hyperparameter: false
KiddoZhu commented 4 years ago

Due to our current implementation, the library requires the batch size to be a multiple of the augmentation step. While the auto augmentation step is inferred from your graph structure, it sometimes doesn't divide the batch size. That is what happened here: the inferred augmentation step gives a shuffle base of 6, and your batch size of 100000 is not divisible by 6 (100000 mod 6 = 4).

A workaround is to manually specify the batch size or the augmentation step. For example, a batch size of 120000 is compatible with any augmentation step from 1 to 6, since it is divisible by each of them.
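For instance, here is a sketch of the relevant sections of your config with the workaround applied; the specific values are illustrative, and either change alone should be enough:

build:
  # 120000 is divisible by every integer from 1 to 6, so any inferred
  # augmentation step up to 6 will divide it evenly.
  batch_size: 120000

train:
  # Alternatively, pin the augmentation step to a divisor of your
  # original batch size (100000), e.g. 5.
  augmentation_step: 5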

We'll fix this in the next update.

IvanSedykh commented 4 years ago

Thank you, it seems like everything is OK now.