JohannesMaxWel closed this 6 years ago
Weird. It works normally for me on a fresh version of master. Are you sure you are on the latest master without local changes?
@JohannesMaxWel it works perfectly for me as well. Maybe you forgot the PYTHONPATH=. prefix:
$ PYTHONPATH=. python3 bin/jack-train.py with config='./conf/nli/snli/esim.yaml'
WARNING - jack - No observers have been added to this run
INFO - jack - Running command 'run'
INFO - jack - Started
INFO - jack-train.py - TRAINING
WARNING - root - Changed type of config entry "parent_config" from str to DogmaticList
INFO - jack - Running command 'print_config'
INFO - jack - Started
Configuration (modified, added, typechanged, doc):
batch_size = 32
clip_value = 0.0
config = './conf/nli/snli/esim.yaml'
[..]
model:
encoder_layer = [{'activation': 'tanh',
'dropout': True,
'input': 'hypothesis',
'module': 'lstm',
'name': 'encoder',
'with_projection': True},
{'activation': 'tanh',
'dropout': True,
'input': 'premise',
'module': 'lstm',
'name': 'encoder',
'with_projection': True},
{'attn_type': 'dot',
'concat': False,
'dependent': 'hypothesis',
'input': 'premise',
'module': 'attention_matching',
'output': 'hypothesis_attn'},
{'attn_type': 'dot',
'concat': False,
'dependent': 'premise',
'input': 'hypothesis',
'module': 'attention_matching',
'output': 'premise_attn'},
{'input': ['premise', 'hypothesis_attn'],
'module': 'mul',
'output': 'premise_mul'},
{'input': ['premise', 'hypothesis_attn'],
'module': 'sub',
'output': 'premise_sub'},
{'input': ['premise', 'hypothesis_attn', 'premise_mul', 'premise_sub'],
'module': 'concat',
'output': 'premise'},
{'activation': 'relu',
'dropout': True,
'input': 'premise',
'module': 'dense',
'name': 'projection'},
{'input': ['hypothesis', 'premise_attn'],
'module': 'mul',
'output': 'hypothesis_mul'},
{'input': ['hypothesis', 'premise_attn'],
'module': 'sub',
'output': 'hypothesis_sub'},
{'input': ['hypothesis', 'premise_attn', 'hypothesis_mul', 'hypothesis_sub'],
'module': 'concat',
'output': 'hypothesis'},
{'activation': 'relu',
'dropout': True,
'input': 'hypothesis',
'module': 'dense',
'name': 'projection'},
{'input': 'hypothesis', 'module': 'lstm', 'name': 'composition'},
{'input': 'premise', 'module': 'lstm', 'name': 'composition'}]
prediction_layer:
dropout = True
module = 'max_avg_mlp'
INFO - jack - Completed after 0:00:00
INFO - jack-train.py - JACK_TEMP not set, setting it to /tmp/jack/bd806ec2-0831-4cc0-b644-7f8118156118. Might be used for caching.
INFO - jack-train.py - loaded train/dev/test data
INFO - jack-train.py - loaded pre-trained embeddings (data/GloVe/glove.840B.300d.memory_map_dir)
2018-07-15 16:21:59.968366: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-07-15 16:22:00.048260: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:897] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-07-15 16:22:00.048649: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1392] Found device 0 with properties:
name: TITAN X (Pascal) major: 6 minor: 1 memoryClockRate(GHz): 1.531
pciBusID: 0000:02:00.0
totalMemory: 11.91GiB freeMemory: 11.53GiB
2018-07-15 16:22:00.048661: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2018-07-15 16:22:00.189067: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-15 16:22:00.189092: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958] 0
2018-07-15 16:22:00.189097: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: N
2018-07-15 16:22:00.189301: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 11163 MB memory) -> physical GPU (device: 0, name: TITAN X (Pascal), pci bus id: 0000:02:00.0, compute capability: 6.1)
INFO - jack.core.reader - Setting up model...
INFO - jack.core.reader - Preparing training data...
INFO - jack.core.input_module - OnlineInputModule pre-processes data on-the-fly in first epoch and caches results for subsequent epochs! That means, first epoch might be slower.
INFO - jack.core.reader - Number of parameters: 2704203
INFO - jack.core.reader - Start training...
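For anyone wondering why the PYTHONPATH=. prefix can matter here: when Python runs a script such as bin/jack-train.py, the first entry on sys.path is the script's own directory (bin/), not the directory you launched it from. A package sitting at the repository root is then only importable if the root is added explicitly. Below is a minimal sketch of that behaviour. It assumes a layout with the package at the root and the script under bin/; the names demo_pkg_xyz and run.py are hypothetical stand-ins, not files from jack, and this is not a diagnosis of the error reported in this issue.

```python
import os
import subprocess
import sys
import tempfile


def make_layout(root):
    """Create root/demo_pkg_xyz/__init__.py and root/bin/run.py (hypothetical names)."""
    os.makedirs(os.path.join(root, "bin"))
    os.makedirs(os.path.join(root, "demo_pkg_xyz"))
    with open(os.path.join(root, "demo_pkg_xyz", "__init__.py"), "w") as f:
        f.write("")
    # The script only tries to import the package that lives at the root.
    with open(os.path.join(root, "bin", "run.py"), "w") as f:
        f.write("import demo_pkg_xyz\n")


def run_script(root, with_pythonpath):
    """Run bin/run.py from `root`, optionally with PYTHONPATH=. ; return the exit code."""
    env = dict(os.environ)
    env.pop("PYTHONPATH", None)  # start from a clean slate
    if with_pythonpath:
        env["PYTHONPATH"] = "."  # same effect as the shell prefix PYTHONPATH=.
    result = subprocess.run(
        [sys.executable, os.path.join("bin", "run.py")],
        cwd=root,
        env=env,
        capture_output=True,
        text=True,
    )
    return result.returncode


with tempfile.TemporaryDirectory() as root:
    make_layout(root)
    code_without = run_script(root, with_pythonpath=False)
    code_with = run_script(root, with_pythonpath=True)
    print("exit code without PYTHONPATH:", code_without)
    print("exit code with PYTHONPATH=. :", code_with)
```

In this sketch the run without the prefix exits with a non-zero code because the import fails, while the run with the prefix succeeds. If your failure is not an import error, the prefix is probably not the cause.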
Training ESIM or DAM currently does not work.
Input: (without further modification to the default config)
yields this error
Same for DAM.