Negative sampling still does KvsAll

Hello!

I have this YAML config file for running an AxSearchJob:

# wnrr-rotate-negative_sampling-kl
job.type: search
search.type: ax
dataset.name: wnrr

# training settings (fixed)
train:
  max_epochs: 400
  auto_correct: True

# this is faster for smaller datasets, but does not work for some models (e.g.,
# rotate due to a pytorch issue) or for larger datasets. Change to spo in such
# cases (either here or in ax section of model config), results will not be
# affected.
negative_sampling.implementation: sp_po

# validation/evaluation settings (fixed)
valid:
  every: 5
  metric: mean_reciprocal_rank_filtered_with_test
  filter_with_test: True
  early_stopping:
    patience: 10
    min_threshold.epochs: 50
    min_threshold.metric_value: 0.05

eval:
  batch_size: 256
  metrics_per.relation_type: True

# settings for reciprocal relations (if used)
import: [rotate, reciprocal_relations_model]
reciprocal_relations_model.base_model.type: rotate

# ax settings: hyperparameter search space
ax_search:
  num_trials: 30
  num_sobol_trials: 30 
  parameters:
    # model
    - name: model
      type: choice
      values: [rotate, reciprocal_relations_model]

    # training hyperparameters
    - name: train.batch_size
      type: choice   
      values: [128, 256, 512, 1024]
      is_ordered: True
    - name: train.type
      type: fixed
      value: negative_sampling
    - name: train.optimizer
      type: choice
      values: [Adam, Adagrad]
    - name: train.loss
      type: fixed
      value: kl
    - name: train.optimizer_args.lr     
      type: range
      bounds: [0.0003, 1.0]
      log_scale: True
    - name: train.lr_scheduler
      type: fixed
      value: ReduceLROnPlateau
    - name: train.lr_scheduler_args.mode
      type: fixed
      value: max  
    - name: train.lr_scheduler_args.factor
      type: fixed
      value: 0.95  
    - name: train.lr_scheduler_args.threshold
      type: fixed
      value: 0.0001  
    - name: train.lr_scheduler_args.patience
      type: range
      bounds: [0, 10]  

    # embedding dimension
    - name: lookup_embedder.dim
      type: choice 
      values: [128, 256, 512]
      is_ordered: True

    # embedding initialization
    - name: lookup_embedder.initialize
      type: choice
      values: [xavier_normal_, xavier_uniform_, normal_, uniform_]  
    - name: lookup_embedder.initialize_args.normal_.mean
      type: fixed
      value: 0.0
    - name: lookup_embedder.initialize_args.normal_.std
      type: range
      bounds: [0.00001, 1.0]
      log_scale: True
    - name: lookup_embedder.initialize_args.uniform_.a
      type: range
      bounds: [-1.0, -0.00001]
    - name: lookup_embedder.initialize_args.xavier_uniform_.gain
      type: fixed
      value: 1.0
    - name: lookup_embedder.initialize_args.xavier_normal_.gain
      type: fixed
      value: 1.0

    # embedding regularization
    - name: lookup_embedder.regularize
      type: choice
      values: ['', 'l3', 'l2', 'l1']
      is_ordered: True
    - name: lookup_embedder.regularize_args.weighted
      type: choice
      values: [True, False]
    - name: rotate.entity_embedder.regularize_weight
      type: range
      bounds: [1.0e-20, 1.0e-01]
      log_scale: True
    - name: rotate.relation_embedder.regularize_weight
      type: range
      bounds: [1.0e-20, 1.0e-01]
      log_scale: True

    # embedding dropout
    - name: rotate.entity_embedder.dropout
      type: range
      bounds: [-0.5, 0.5]
    - name: rotate.relation_embedder.dropout
      type: range
      bounds: [-0.5, 0.5]

    # training-type specific hyperparameters
    - name: negative_sampling.num_negatives_s #train_type: negative_sampling
      type: range                             #train_type: negative_sampling
      bounds: [1, 1000]                       #train_type: negative_sampling
      log_scale: True                         #train_type: negative_sampling
    - name: negative_sampling.num_negatives_o #train_type: negative_sampling
      type: range                             #train_type: negative_sampling
      bounds: [1, 1000]                       #train_type: negative_sampling
      log_scale: True                         #train_type: negative_sampling
    - name: rotate.l_norm
      type: choice
      values: [1.0, 2.0]
      is_ordered: True
    - name: rotate.entity_embedder.normalize.p
      type: choice
      values: [-1.0, 2.0]
    - name: rotate.relation_embedder.normalize.p
      type: choice
      values: [-1.0, 2.0]
    - name: negative_sampling.implementation
      type: fixed
      value: spo

However, when I run it, I get out-of-memory (OOM) errors, apparently because it still performs a KvsAll training job (the dumped configuration in the log shows `train.type: KvsAll`). How do I completely disable KvsAll jobs in the search?

Thanks!

Here is my log output:

2021-11-25 14:02:41.337593 Using folder: /home/filco306/lib-kge-fork/local/experiments/20211125-140241-ROTATE
2021-11-25 14:02:41.337666 Configuration:
2021-11-25 14:02:41.355569   1vsAll:
2021-11-25 14:02:41.355601     class_name: TrainingJob1vsAll
2021-11-25 14:02:41.355609   KvsAll:
2021-11-25 14:02:41.355626     class_name: TrainingJobKvsAll
2021-11-25 14:02:41.355634     label_smoothing: 0.0
2021-11-25 14:02:41.355642     query_types:
2021-11-25 14:02:41.355651       _po: true
2021-11-25 14:02:41.355659       s_o: false
2021-11-25 14:02:41.355667       sp_: true
2021-11-25 14:02:41.355675   ax_search:
2021-11-25 14:02:41.355684     class_name: AxSearchJob
2021-11-25 14:02:41.355692     num_sobol_trials: 30
2021-11-25 14:02:41.355700     num_trials: 30
2021-11-25 14:02:41.355708     parameter_constraints: []
2021-11-25 14:02:41.355716     parameters:
2021-11-25 14:02:41.355724     - name: model
2021-11-25 14:02:41.355732       type: choice
2021-11-25 14:02:41.355740       values:
2021-11-25 14:02:41.355749       - rotate
2021-11-25 14:02:41.355757       - reciprocal_relations_model
2021-11-25 14:02:41.355765     - is_ordered: true
2021-11-25 14:02:41.355773       name: train.batch_size
2021-11-25 14:02:41.355781       type: choice
2021-11-25 14:02:41.355789       values:
2021-11-25 14:02:41.355797       - 128
2021-11-25 14:02:41.355805       - 256
2021-11-25 14:02:41.355813       - 512
2021-11-25 14:02:41.355821       - 1024
2021-11-25 14:02:41.355830     - name: train.type
2021-11-25 14:02:41.355838       type: fixed
2021-11-25 14:02:41.355846       value: negative_sampling
2021-11-25 14:02:41.355854     - name: train.optimizer
2021-11-25 14:02:41.355862       type: choice
2021-11-25 14:02:41.355870       values:
2021-11-25 14:02:41.355878       - Adam
2021-11-25 14:02:41.355886       - Adagrad
2021-11-25 14:02:41.355894     - name: train.loss
2021-11-25 14:02:41.355902       type: fixed
2021-11-25 14:02:41.355910       value: kl
2021-11-25 14:02:41.355918     - bounds:
2021-11-25 14:02:41.355927       - 0.0003
2021-11-25 14:02:41.355935       - 1.0
2021-11-25 14:02:41.355943       log_scale: true
2021-11-25 14:02:41.355951       name: train.optimizer_args.lr
2021-11-25 14:02:41.355959       type: range
2021-11-25 14:02:41.355967     - name: train.lr_scheduler
2021-11-25 14:02:41.355975       type: fixed
2021-11-25 14:02:41.355983       value: ReduceLROnPlateau
2021-11-25 14:02:41.355992     - name: train.lr_scheduler_args.mode
2021-11-25 14:02:41.356000       type: fixed
2021-11-25 14:02:41.356008       value: max
2021-11-25 14:02:41.356016     - name: train.lr_scheduler_args.factor
2021-11-25 14:02:41.356024       type: fixed
2021-11-25 14:02:41.356032       value: 0.95
2021-11-25 14:02:41.356040     - name: train.lr_scheduler_args.threshold
2021-11-25 14:02:41.356048       type: fixed
2021-11-25 14:02:41.356057       value: 0.0001
2021-11-25 14:02:41.356065     - bounds:
2021-11-25 14:02:41.356075       - 0
2021-11-25 14:02:41.356084       - 10
2021-11-25 14:02:41.356092       name: train.lr_scheduler_args.patience
2021-11-25 14:02:41.356100       type: range
2021-11-25 14:02:41.356109     - is_ordered: true
2021-11-25 14:02:41.356117       name: lookup_embedder.dim
2021-11-25 14:02:41.356125       type: choice
2021-11-25 14:02:41.356134       values:
2021-11-25 14:02:41.356142       - 128
2021-11-25 14:02:41.356150       - 256
2021-11-25 14:02:41.356158       - 512
2021-11-25 14:02:41.356166     - name: lookup_embedder.initialize
2021-11-25 14:02:41.356187       type: choice
2021-11-25 14:02:41.356196       values:
2021-11-25 14:02:41.356204       - xavier_normal_
2021-11-25 14:02:41.356212       - xavier_uniform_
2021-11-25 14:02:41.356220       - normal_
2021-11-25 14:02:41.356228       - uniform_
2021-11-25 14:02:41.356237     - name: lookup_embedder.initialize_args.normal_.mean
2021-11-25 14:02:41.356245       type: fixed
2021-11-25 14:02:41.356253       value: 0.0
2021-11-25 14:02:41.356261     - bounds:
2021-11-25 14:02:41.356270       - 1.0e-05
2021-11-25 14:02:41.356278       - 1.0
2021-11-25 14:02:41.356287       log_scale: true
2021-11-25 14:02:41.356295       name: lookup_embedder.initialize_args.normal_.std
2021-11-25 14:02:41.356303       type: range
2021-11-25 14:02:41.356311     - bounds:
2021-11-25 14:02:41.356319       - -1.0
2021-11-25 14:02:41.356327       - -1.0e-05
2021-11-25 14:02:41.356338       name: lookup_embedder.initialize_args.uniform_.a
2021-11-25 14:02:41.356345       type: range
2021-11-25 14:02:41.356352     - name: lookup_embedder.initialize_args.xavier_uniform_.gain
2021-11-25 14:02:41.356359       type: fixed
2021-11-25 14:02:41.356366       value: 1.0
2021-11-25 14:02:41.356372     - name: lookup_embedder.initialize_args.xavier_normal_.gain
2021-11-25 14:02:41.356380       type: fixed
2021-11-25 14:02:41.356394       value: 1.0
2021-11-25 14:02:41.356402     - is_ordered: true
2021-11-25 14:02:41.356429       name: lookup_embedder.regularize
2021-11-25 14:02:41.356438       type: choice
2021-11-25 14:02:41.356447       values:
2021-11-25 14:02:41.356456       - ''
2021-11-25 14:02:41.356468       - l3
2021-11-25 14:02:41.356481       - l2
2021-11-25 14:02:41.356493       - l1
2021-11-25 14:02:41.356506     - name: lookup_embedder.regularize_args.weighted
2021-11-25 14:02:41.356520       type: choice
2021-11-25 14:02:41.356535       values:
2021-11-25 14:02:41.356548       - true
2021-11-25 14:02:41.356561       - false
2021-11-25 14:02:41.356575     - bounds:
2021-11-25 14:02:41.356591       - 1.0e-20
2021-11-25 14:02:41.356604       - 0.1
2021-11-25 14:02:41.356618       log_scale: true
2021-11-25 14:02:41.356631       name: rotate.entity_embedder.regularize_weight
2021-11-25 14:02:41.356645       type: range
2021-11-25 14:02:41.356659     - bounds:
2021-11-25 14:02:41.356673       - 1.0e-20
2021-11-25 14:02:41.356689       - 0.1
2021-11-25 14:02:41.356704       log_scale: true
2021-11-25 14:02:41.356719       name: rotate.relation_embedder.regularize_weight
2021-11-25 14:02:41.356732       type: range
2021-11-25 14:02:41.356747     - bounds:
2021-11-25 14:02:41.356761       - -0.5
2021-11-25 14:02:41.356775       - 0.5
2021-11-25 14:02:41.356790       name: rotate.entity_embedder.dropout
2021-11-25 14:02:41.356805       type: range
2021-11-25 14:02:41.356820     - bounds:
2021-11-25 14:02:41.356837       - -0.5
2021-11-25 14:02:41.356853       - 0.5
2021-11-25 14:02:41.356868       name: rotate.relation_embedder.dropout
2021-11-25 14:02:41.356884       type: range
2021-11-25 14:02:41.356900     - bounds:
2021-11-25 14:02:41.356914       - 1
2021-11-25 14:02:41.356927       - 1000
2021-11-25 14:02:41.356942       log_scale: true
2021-11-25 14:02:41.356956       name: negative_sampling.num_negatives_s
2021-11-25 14:02:41.356971       type: range
2021-11-25 14:02:41.356986     - bounds:
2021-11-25 14:02:41.357001       - 1
2021-11-25 14:02:41.357016       - 1000
2021-11-25 14:02:41.357031       log_scale: true
2021-11-25 14:02:41.357046       name: negative_sampling.num_negatives_o
2021-11-25 14:02:41.357061       type: range
2021-11-25 14:02:41.357076     - is_ordered: true
2021-11-25 14:02:41.357090       name: rotate.l_norm
2021-11-25 14:02:41.357104       type: choice
2021-11-25 14:02:41.357120       values:
2021-11-25 14:02:41.357135       - 1.0
2021-11-25 14:02:41.357149       - 2.0
2021-11-25 14:02:41.357164     - name: rotate.entity_embedder.normalize.p
2021-11-25 14:02:41.357179       type: choice
2021-11-25 14:02:41.357193       values:
2021-11-25 14:02:41.357208       - -1.0
2021-11-25 14:02:41.357223       - 2.0
2021-11-25 14:02:41.357237     - name: rotate.relation_embedder.normalize.p
2021-11-25 14:02:41.357252       type: choice
2021-11-25 14:02:41.357266       values:
2021-11-25 14:02:41.357280       - -1.0
2021-11-25 14:02:41.357295       - 2.0
2021-11-25 14:02:41.357310     - name: negative_sampling.implementation
2021-11-25 14:02:41.357324       type: fixed
2021-11-25 14:02:41.357339       value: spo
2021-11-25 14:02:41.357353     sobol_seed: 0
2021-11-25 14:02:41.357367   console:
2021-11-25 14:02:41.357382     format: {}
2021-11-25 14:02:41.357396     quiet: false
2021-11-25 14:02:41.357411   conve:
2021-11-25 14:02:41.357426     2D_aspect_ratio: 2
2021-11-25 14:02:41.357440     class_name: ConvE
2021-11-25 14:02:41.357455     convolution_bias: true
2021-11-25 14:02:41.357506     entity_embedder:
2021-11-25 14:02:41.357522       +++: +++
2021-11-25 14:02:41.357537       dropout: 0.2
2021-11-25 14:02:41.357551       type: lookup_embedder
2021-11-25 14:02:41.357566     feature_map_dropout: 0.2
2021-11-25 14:02:41.357580     filter_size: 3
2021-11-25 14:02:41.357595     padding: 0
2021-11-25 14:02:41.357610     projection_dropout: 0.3
2021-11-25 14:02:41.357625     relation_embedder:
2021-11-25 14:02:41.357639       +++: +++
2021-11-25 14:02:41.357655       dropout: 0.2
2021-11-25 14:02:41.357669       type: lookup_embedder
2021-11-25 14:02:41.357684     round_dim: false
2021-11-25 14:02:41.357699     stride: 1
2021-11-25 14:02:41.357714   dataset:
2021-11-25 14:02:41.357727     +++: +++
2021-11-25 14:02:41.357741     files:
2021-11-25 14:02:41.357755       +++: +++
2021-11-25 14:02:41.357770       entity_ids:
2021-11-25 14:02:41.357784         filename: entity_ids.del
2021-11-25 14:02:41.357805         type: map
2021-11-25 14:02:41.357819       entity_strings:
2021-11-25 14:02:41.357834         filename: entity_ids.del
2021-11-25 14:02:41.357849         type: map
2021-11-25 14:02:41.357864       relation_ids:
2021-11-25 14:02:41.357878         filename: relation_ids.del
2021-11-25 14:02:41.357893         type: map
2021-11-25 14:02:41.357908       relation_strings:
2021-11-25 14:02:41.357923         filename: relation_ids.del
2021-11-25 14:02:41.357949         type: map
2021-11-25 14:02:41.357963       test:
2021-11-25 14:02:41.357977         filename: test.del
2021-11-25 14:02:41.357990         type: triples
2021-11-25 14:02:41.358004       train:
2021-11-25 14:02:41.358018         filename: train.del
2021-11-25 14:02:41.358031         type: triples
2021-11-25 14:02:41.358045       valid:
2021-11-25 14:02:41.358059         filename: valid.del
2021-11-25 14:02:41.358073         type: triples
2021-11-25 14:02:41.358087     name: wnrr
2021-11-25 14:02:41.358101     num_entities: -1
2021-11-25 14:02:41.358114     num_relations: -1
2021-11-25 14:02:41.358127     pickle: true
2021-11-25 14:02:41.358139   entity_ranking:
2021-11-25 14:02:41.358153     chunk_size: -1
2021-11-25 14:02:41.358166     class_name: EntityRankingJob
2021-11-25 14:02:41.358179     filter_splits:
2021-11-25 14:02:41.358191     - train
2021-11-25 14:02:41.358204     - valid
2021-11-25 14:02:41.358234     filter_with_test: true
2021-11-25 14:02:41.358248     hits_at_k_s:
2021-11-25 14:02:41.358263     - 1
2021-11-25 14:02:41.358277     - 3
2021-11-25 14:02:41.358291     - 10
2021-11-25 14:02:41.358307     - 50
2021-11-25 14:02:41.358321     - 100
2021-11-25 14:02:41.358337     - 200
2021-11-25 14:02:41.358352     - 300
2021-11-25 14:02:41.358367     - 400
2021-11-25 14:02:41.358382     - 500
2021-11-25 14:02:41.358397     - 1000
2021-11-25 14:02:41.358412     metrics_per:
2021-11-25 14:02:41.358427       argument_frequency: false
2021-11-25 14:02:41.358442       head_and_tail: false
2021-11-25 14:02:41.358457       relation_type: true
2021-11-25 14:02:41.358472     tie_handling: rounded_mean_rank
2021-11-25 14:02:41.358487   eval:
2021-11-25 14:02:41.358502     batch_size: 256
2021-11-25 14:02:41.358518     num_workers: 0
2021-11-25 14:02:41.358534     pin_memory: false
2021-11-25 14:02:41.358549     split: valid
2021-11-25 14:02:41.358564     trace_level: epoch
2021-11-25 14:02:41.358578     type: entity_ranking
2021-11-25 14:02:41.358593   grid_search:
2021-11-25 14:02:41.358608     class_name: GridSearchJob
2021-11-25 14:02:41.358623     parameters:
2021-11-25 14:02:41.358637       +++: +++
2021-11-25 14:02:41.358652     run: true
2021-11-25 14:02:41.358666   import:
2021-11-25 14:02:41.358682   - rotate
2021-11-25 14:02:41.358696   - reciprocal_relations_model
2021-11-25 14:02:41.358711   job:
2021-11-25 14:02:41.358726     device: cuda
2021-11-25 14:02:41.358740     type: search
2021-11-25 14:02:41.358754   lookup_embedder:
2021-11-25 14:02:41.358769     class_name: LookupEmbedder
2021-11-25 14:02:41.358784     dim: 100
2021-11-25 14:02:41.358798     dropout: 0.0
2021-11-25 14:02:41.358813     initialize: normal_
2021-11-25 14:02:41.358827     initialize_args:
2021-11-25 14:02:41.358842       +++: +++
2021-11-25 14:02:41.358857     normalize:
2021-11-25 14:02:41.358872       p: -1.0
2021-11-25 14:02:41.358886     pretrain:
2021-11-25 14:02:41.358901       ensure_all: false
2021-11-25 14:02:41.358915       model_filename: ''
2021-11-25 14:02:41.358929     regularize: lp
2021-11-25 14:02:41.358944     regularize_args:
2021-11-25 14:02:41.358958       +++: +++
2021-11-25 14:02:41.358973       p: 2
2021-11-25 14:02:41.358987       weighted: false
2021-11-25 14:02:41.359002     regularize_weight: 0.0
2021-11-25 14:02:41.359016     round_dim_to: []
2021-11-25 14:02:41.359030     sparse: false
2021-11-25 14:02:41.359045   manual_search:
2021-11-25 14:02:41.359059     class_name: ManualSearchJob
2021-11-25 14:02:41.359073     configurations: []
2021-11-25 14:02:41.359088     run: true
2021-11-25 14:02:41.359103   model: ''
2021-11-25 14:02:41.359118   modules:
2021-11-25 14:02:41.359132   - kge.job
2021-11-25 14:02:41.359147   - kge.model
2021-11-25 14:02:41.359163   - kge.model.embedder
2021-11-25 14:02:41.359177   negative_sampling:
2021-11-25 14:02:41.359191     class_name: TrainingJobNegativeSampling
2021-11-25 14:02:41.359206     filtering:
2021-11-25 14:02:41.359220       implementation: fast_if_available
2021-11-25 14:02:41.359235       o: false
2021-11-25 14:02:41.359249       p: false
2021-11-25 14:02:41.359263       s: false
2021-11-25 14:02:41.359278       split: ''
2021-11-25 14:02:41.359292     frequency:
2021-11-25 14:02:41.359307       smoothing: 1
2021-11-25 14:02:41.359321     implementation: batch
2021-11-25 14:02:41.359336     num_samples:
2021-11-25 14:02:41.359350       o: -1
2021-11-25 14:02:41.359364       p: 0
2021-11-25 14:02:41.359378       s: 3
2021-11-25 14:02:41.359393     sampling_type: uniform
2021-11-25 14:02:41.359408     shared: false
2021-11-25 14:02:41.359421     shared_type: default
2021-11-25 14:02:41.359436     with_replacement: true
2021-11-25 14:02:41.359451   random_seed:
2021-11-25 14:02:41.359465     default: -1
2021-11-25 14:02:41.359479     numba: -1
2021-11-25 14:02:41.359493     numpy: -1
2021-11-25 14:02:41.359508     python: -1
2021-11-25 14:02:41.359522     torch: -1
2021-11-25 14:02:41.359536   reciprocal_relations_model:
2021-11-25 14:02:41.359551     base_model:
2021-11-25 14:02:41.359565       +++: +++
2021-11-25 14:02:41.359579       type: rotate
2021-11-25 14:02:41.359594     class_name: ReciprocalRelationsModel
2021-11-25 14:02:41.359608   rotate:
2021-11-25 14:02:41.359622     class_name: RotatE
2021-11-25 14:02:41.359644     entity_embedder:
2021-11-25 14:02:41.359658       +++: +++
2021-11-25 14:02:41.359673       type: lookup_embedder
2021-11-25 14:02:41.359687     l_norm: 1.0
2021-11-25 14:02:41.359702     normalize_phases: true
2021-11-25 14:02:41.359716     relation_embedder:
2021-11-25 14:02:41.359730       +++: +++
2021-11-25 14:02:41.359746       dim: -1
2021-11-25 14:02:41.359760       initialize: uniform_
2021-11-25 14:02:41.359774       initialize_args:
2021-11-25 14:02:41.359788         uniform_:
2021-11-25 14:02:41.359802           a: -3.14159265359
2021-11-25 14:02:41.359818           b: 3.14159265359
2021-11-25 14:02:41.359832       type: lookup_embedder
2021-11-25 14:02:41.359846   search:
2021-11-25 14:02:41.359856     device_pool: []
2021-11-25 14:02:41.359865     num_workers: 1
2021-11-25 14:02:41.359874     on_error: abort
2021-11-25 14:02:41.359882     type: ax_search
2021-11-25 14:02:41.359891   train:
2021-11-25 14:02:41.359901     abort_on_nan: true
2021-11-25 14:02:41.359909     auto_correct: true
2021-11-25 14:02:41.359918     batch_size: 100
2021-11-25 14:02:41.359927     checkpoint:
2021-11-25 14:02:41.359935       every: 5
2021-11-25 14:02:41.359945       keep: 3
2021-11-25 14:02:41.359954       keep_init: true
2021-11-25 14:02:41.359963     loss: kl
2021-11-25 14:02:41.359972     loss_arg: .nan
2021-11-25 14:02:41.359981     lr_scheduler: ''
2021-11-25 14:02:41.359989     lr_scheduler_args:
2021-11-25 14:02:41.360023       +++: +++
2021-11-25 14:02:41.360032     lr_warmup: 0
2021-11-25 14:02:41.360041     max_epochs: 400
2021-11-25 14:02:41.360051     num_workers: 0
2021-11-25 14:02:41.360060     optimizer:
2021-11-25 14:02:41.360070       +++: +++
2021-11-25 14:02:41.360078       default:
2021-11-25 14:02:41.360087         args:
2021-11-25 14:02:41.360096           +++: +++
2021-11-25 14:02:41.360104         type: Adagrad
2021-11-25 14:02:41.360113     pin_memory: false
2021-11-25 14:02:41.360122     split: train
2021-11-25 14:02:41.360130     subbatch_auto_tune: false
2021-11-25 14:02:41.360139     subbatch_size: -1
2021-11-25 14:02:41.360148     trace_level: epoch
2021-11-25 14:02:41.360156     type: KvsAll <!--- HERE!-->
2021-11-25 14:02:41.360166     visualize_graph: false
2021-11-25 14:02:41.360174   training_loss:
2021-11-25 14:02:41.360183     class_name: TrainingLossEvaluationJob
2021-11-25 14:02:41.360192   user:
2021-11-25 14:02:41.360201     +++: +++
2021-11-25 14:02:41.360211   valid:
2021-11-25 14:02:41.360220     early_stopping:
2021-11-25 14:02:41.360228       patience: 10
2021-11-25 14:02:41.360237       threshold:
2021-11-25 14:02:41.360246         epochs: 50
2021-11-25 14:02:41.360255         metric_value: 0.05
2021-11-25 14:02:41.360263     every: 5
2021-11-25 14:02:41.360278     metric: mean_reciprocal_rank_filtered_with_test
2021-11-25 14:02:41.360287     metric_expr: float("nan")
2021-11-25 14:02:41.360296     metric_max: true
2021-11-25 14:02:41.360306     split: valid
2021-11-25 14:02:41.360315     trace_level: epoch
2021-11-25 14:02:41.373541   git commit: 79a857f
2021-11-25 14:02:41.373793 Loading configuration of dataset wnrr from /home/filco306/lib-kge-fork/data/wnrr ...
2021-11-25 14:02:41.383877 Loaded 41105 keys from map entity_ids
2021-11-25 14:02:41.384033 Loaded 11 keys from map relation_ids
2021-11-25 14:02:41.385474 Loaded 86835 train triples
2021-11-25 14:02:41.385765 Loaded 3034 valid triples
2021-11-25 14:02:41.386038 Loaded 3134 test triples
2021-11-25 14:02:41.386328 [6361cf82] Using device pool: ['cuda']
2021-11-25 14:02:41.420615 [6361cf82] ax search initialized with GenerationStrategy(name='Sobol+GPEI', steps=[Sobol for 30 trials, GPEI for subsequent trials])
2021-11-25 14:02:41.576644 [6361cf82] Registering trial 0/29...
2021-11-25 14:02:41.599978 [6361cf82] Created trial 00000 with parameters: {'train.batch_size': 512, 'train.optimizer_args.lr': 0.0038263341389451326, 'train.lr_scheduler_args.patience': 10, 'lookup_embedder.dim': 128, 'lookup_embedder.initialize_args.normal_.std': 0.2594381804418265, 'lookup_embedder.initialize_args.uniform_.a': -0.11983964847803108, 'lookup_embedder.regularize': 'l1', 'rotate.entity_embedder.regularize_weight': 8.497092712317215e-14, 'rotate.relation_embedder.regularize_weight': 0.012227634287835265, 'rotate.entity_embedder.dropout': -0.27799247205257416, 'rotate.relation_embedder.dropout': 0.19008886814117432, 'negative_sampling.num_negatives_s': 980, 'negative_sampling.num_negatives_o': 59, 'rotate.l_norm': 1.0, 'model': 'reciprocal_relations_model', 'train.optimizer': 'Adagrad', 'lookup_embedder.initialize': 'xavier_normal_', 'lookup_embedder.regularize_args.weighted': True, 'rotate.entity_embedder.normalize.p': 2.0, 'rotate.relation_embedder.normalize.p': 2.0, 'train.type': 'negative_sampling', 'train.loss': 'kl', 'train.lr_scheduler': 'ReduceLROnPlateau', 'train.lr_scheduler_args.mode': 'max', 'train.lr_scheduler_args.factor': 0.95, 'train.lr_scheduler_args.threshold': 0.0001, 'lookup_embedder.initialize_args.normal_.mean': 0.0, 'lookup_embedder.initialize_args.xavier_uniform_.gain': 1.0, 'lookup_embedder.initialize_args.xavier_normal_.gain': 1.0, 'negative_sampling.implementation': 'spo'}
2021-11-25 14:02:41.621957 [6361cf82] Saving checkpoint to /home/filco306/lib-kge-fork/local/experiments/20211125-140241-ROTATE/checkpoint_00001.pt...
2021-11-25 14:02:41.622286 [6361cf82] Starting training job /home/filco306/lib-kge-fork/local/experiments/20211125-140241-ROTATE/00000 (1/30) on device cuda...
2021-11-25 14:03:45.624664 [6361cf82]   62633 distinct sp pairs in train
2021-11-25 14:03:45.659191 [6361cf82]   40996 distinct po pairs in train
2021-11-25 14:03:45.662284 [6361cf82]   2916 distinct sp pairs in valid
2021-11-25 14:03:45.664440 [6361cf82]   2646 distinct po pairs in valid
2021-11-25 14:03:45.666692 [6361cf82]   3022 distinct sp pairs in test
2021-11-25 14:03:45.668840 [6361cf82]   2694 distinct po pairs in test
2021-11-25 14:03:46.469058 [6361cf82] Trial 00000 failed: RuntimeError('The following operation failed in the TorchScript interpreter.\nTraceback of TorchScript (most recent call last):\n  File "/home/filco306/lib-kge-fork/kge/model/rotate.py", line 201, in abs_complex\n    "Compute magnitude of given complex numbers"\n    x_re_im = torch.stack((x_re, x_im), dim=0)  # dim0: real, imaginary\n    return torch.norm(x_re_im, dim=0)  # sqrt(real^2+imaginary^2)\n           ~~~~~~~~~~ <--- HERE\n  File "/home/filco306/.envs/libkge-env/lib/python3.8/site-packages/torch/functional.py", line 1333, in norm\n                _dim = list(range(ndim))\n            if out is None:\n                return _VF.frobenius_norm(input, _dim, keepdim=keepdim)\n                       ~~~~~~~~~~~~~~~~~~ <--- HERE\n            else:\n                return _VF.frobenius_norm(input, _dim, keepdim=keepdim, out=out)\nRuntimeError: CUDA out of memory. Tried to allocate 2.51 GiB (GPU 0; 15.72 GiB total capacity; 12.63 GiB already allocated; 1.95 GiB free; 12.71 GiB reserved in total by PyTorch)\n')
2021-11-25 14:03:46.469158 [6361cf82] Aborting search due to failure of trial 00000
2021-11-25 14:03:46.470588 [6361cf82] Traceback (most recent call last):
2021-11-25 14:03:46.470597 [6361cf82]   File "/home/filco306/lib-kge-fork/kge/cli.py", line 285, in main
2021-11-25 14:03:46.470600 [6361cf82]     job.run()
2021-11-25 14:03:46.470603 [6361cf82]   File "/home/filco306/lib-kge-fork/kge/job/job.py", line 159, in run
2021-11-25 14:03:46.470605 [6361cf82]     result = self._run()
2021-11-25 14:03:46.470608 [6361cf82]   File "/home/filco306/lib-kge-fork/kge/job/search_auto.py", line 160, in _run
2021-11-25 14:03:46.470610 [6361cf82]     self.submit_task(
2021-11-25 14:03:46.470613 [6361cf82]   File "/home/filco306/lib-kge-fork/kge/job/search.py", line 70, in submit_task
2021-11-25 14:03:46.470616 [6361cf82]     self.ready_task_results.append(task(task_arg, device=self.free_devices[0]))
2021-11-25 14:03:46.470618 [6361cf82]   File "/home/filco306/lib-kge-fork/kge/job/search.py", line 232, in _run_train_job
2021-11-25 14:03:46.470621 [6361cf82]     raise e
2021-11-25 14:03:46.470624 [6361cf82]   File "/home/filco306/lib-kge-fork/kge/job/search.py", line 186, in _run_train_job
2021-11-25 14:03:46.470626 [6361cf82]     job.run()
2021-11-25 14:03:46.470629 [6361cf82]   File "/home/filco306/lib-kge-fork/kge/job/job.py", line 159, in run
2021-11-25 14:03:46.470631 [6361cf82]     result = self._run()
2021-11-25 14:03:46.470633 [6361cf82]   File "/home/filco306/lib-kge-fork/kge/job/train.py", line 224, in _run
2021-11-25 14:03:46.470636 [6361cf82]     trace_entry = self.valid_job.run()
2021-11-25 14:03:46.470639 [6361cf82]   File "/home/filco306/lib-kge-fork/kge/job/job.py", line 159, in run
2021-11-25 14:03:46.470641 [6361cf82]     result = self._run()
2021-11-25 14:03:46.470643 [6361cf82]   File "/home/filco306/lib-kge-fork/kge/job/eval.py", line 67, in _run
2021-11-25 14:03:46.470646 [6361cf82]     self._evaluate()
2021-11-25 14:03:46.470649 [6361cf82]   File "/home/filco306/.envs/libkge-env/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
2021-11-25 14:03:46.470651 [6361cf82]     return func(*args, **kwargs)
2021-11-25 14:03:46.470654 [6361cf82]   File "/home/filco306/lib-kge-fork/kge/job/eval_entity_ranking.py", line 207, in _evaluate
2021-11-25 14:03:46.470656 [6361cf82]     scores = self.model.score_sp_po(
2021-11-25 14:03:46.470659 [6361cf82]   File "/home/filco306/lib-kge-fork/kge/model/reciprocal_relations_model.py", line 98, in score_sp_po
2021-11-25 14:03:46.470661 [6361cf82]     sp_scores = self._scorer.score_emb(s, p, all_entities, combine="sp_")
2021-11-25 14:03:46.470664 [6361cf82]   File "/home/filco306/lib-kge-fork/kge/model/rotate.py", line 50, in score_emb
2021-11-25 14:03:46.470666 [6361cf82]     diff_abs = abs_complex(diff_re, diff_im)  # sp x o x dim
2021-11-25 14:03:46.470669 [6361cf82] RuntimeError: The following operation failed in the TorchScript interpreter.
2021-11-25 14:03:46.470671 [6361cf82] Traceback of TorchScript (most recent call last):
2021-11-25 14:03:46.470674 [6361cf82]   File "/home/filco306/lib-kge-fork/kge/model/rotate.py", line 201, in abs_complex
2021-11-25 14:03:46.470676 [6361cf82]     "Compute magnitude of given complex numbers"
2021-11-25 14:03:46.470678 [6361cf82]     x_re_im = torch.stack((x_re, x_im), dim=0)  # dim0: real, imaginary
2021-11-25 14:03:46.470681 [6361cf82]     return torch.norm(x_re_im, dim=0)  # sqrt(real^2+imaginary^2)
2021-11-25 14:03:46.470684 [6361cf82]            ~~~~~~~~~~ <--- HERE
2021-11-25 14:03:46.470686 [6361cf82]   File "/home/filco306/.envs/libkge-env/lib/python3.8/site-packages/torch/functional.py", line 1333, in norm
2021-11-25 14:03:46.470688 [6361cf82]                 _dim = list(range(ndim))
2021-11-25 14:03:46.470691 [6361cf82]             if out is None:
2021-11-25 14:03:46.470693 [6361cf82]                 return _VF.frobenius_norm(input, _dim, keepdim=keepdim)
2021-11-25 14:03:46.470696 [6361cf82]                        ~~~~~~~~~~~~~~~~~~ <--- HERE
2021-11-25 14:03:46.470698 [6361cf82]             else:
2021-11-25 14:03:46.470701 [6361cf82]                 return _VF.frobenius_norm(input, _dim, keepdim=keepdim, out=out)
2021-11-25 14:03:46.470703 [6361cf82] RuntimeError: CUDA out of memory. Tried to allocate 2.51 GiB (GPU 0; 15.72 GiB total capacity; 12.63 GiB already allocated; 1.95 GiB free; 12.71 GiB reserved in total by PyTorch)
2021-11-25 14:03:46.470706 [6361cf82]
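
As a rough sanity check on the failed allocation, here is a small Python sketch. The traceback above ends in `score_sp_po`, called from `eval_entity_ranking.py`, and the comment next to the failing line gives the tensor shape as `sp x o x dim`. The batch size (256, from `eval.batch_size`), the entity count (41105) and the embedding size (`lookup_embedder.dim: 128` for trial 00000) are taken from the log. Two things are assumptions on my part: that the values are 4-byte floats, and that RotatE splits each embedding into real and imaginary halves, so the last dimension is 128 / 2 = 64. The function name is made up for illustration.

```python
def scoring_tensor_gib(batch_size: int, num_entities: int,
                       complex_dim: int, bytes_per_value: int = 4) -> float:
    """Size in GiB of one dense (batch x entities x dim) tensor.

    Hypothetical helper, not part of LibKGE. bytes_per_value=4 assumes float32.
    """
    return batch_size * num_entities * complex_dim * bytes_per_value / 2**30


# Values from the log: eval.batch_size, entity count, trial 00000 embedding dim.
# Assumption: half of the 128 dimensions are real parts, half imaginary.
estimate = scoring_tensor_gib(batch_size=256, num_entities=41105,
                              complex_dim=128 // 2)
print(f"{estimate:.2f} GiB")  # prints "2.51 GiB"
```

Under these assumptions the estimate matches the "Tried to allocate 2.51 GiB" in the error message. The estimate scales linearly with the batch size, so the same formula gives about 1.25 GiB for a batch of 128.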

uma-pi1 / kge

Negative sampling still does KvsAll #250