NTMC-Community / MatchZoo

Facilitating the design, comparison and sharing of deep text matching models.
Apache License 2.0

Segmentation fault running DSSM on another dataset #37

Closed levyfan closed 6 years ago

levyfan commented 6 years ago
python matchzoo/main.py --phase train --model_file examples/config/dssm_ranking.config 
Using TensorFlow backend.
2018-01-08 11:47:26.702599: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.2 AVX
{
  "inputs": {
    "test": {
      "phase": "EVAL", 
      "input_type": "Triletter_ListGenerator", 
      "batch_list": 10, 
      "relation_file": "./data/relation_test.txt", 
      "dtype": "dssm"
    }, 
    "predict": {
      "phase": "PREDICT", 
      "input_type": "Triletter_ListGenerator", 
      "batch_list": 10, 
      "relation_file": "./data/relation_test.txt", 
      "dtype": "dssm"
    }, 
    "train": {
      "relation_file": "./data/relation_train.txt", 
      "input_type": "Triletter_PairGenerator", 
      "batch_size": 100, 
      "batch_per_iter": 5, 
      "dtype": "dssm", 
      "phase": "TRAIN", 
      "query_per_iter": 3, 
      "use_iter": true
    }, 
    "share": {
      "vocab_size": 3484, 
      "embed_size": 10, 
      "target_mode": "ranking", 
      "text1_corpus": "./data/corpus_preprocessed.txt", 
      "text2_corpus": "./data/corpus_preprocessed.txt", 
      "word_triletter_map_file": "./data/word_triletter_map.txt"
    }, 
    "valid": {
      "phase": "EVAL", 
      "input_type": "Triletter_ListGenerator", 
      "batch_list": 10, 
      "relation_file": "./data/relation_valid.txt", 
      "dtype": "dssm"
    }
  }, 
  "global": {
    "optimizer": "adam", 
    "num_iters": 10, 
    "save_weights_iters": 10, 
    "learning_rate": 0.0001, 
    "test_weights_iters": 10, 
    "weights_file": "examples/weights/dssm_ranking.weights", 
    "model_type": "PY", 
    "display_interval": 10
  }, 
  "outputs": {
    "predict": {
      "save_format": "TREC", 
      "save_path": "predict.test.dssm_ranking.txt"
    }
  }, 
  "losses": [
    {
      "object_name": "rank_hinge_loss", 
      "object_params": {
        "margin": 1.0
      }
    }
  ], 
  "metrics": [
    "ndcg@3", 
    "ndcg@5", 
    "map"
  ], 
  "net_name": "dssm", 
  "model": {
    "model_py": "dssm.DSSM", 
    "setting": {
      "dropout_rate": 0.5, 
      "hidden_sizes": [
        100, 
        30
      ]
    }, 
    "model_path": "matchzoo/models/"
  }
}
[Embedding] Embedding Load Done.
[Input] Process Input Tags. [u'train'] in TRAIN, [u'test', u'valid'] in EVAL.
[./data/corpus_preprocessed.txt]
        Data size: 71849
[Dataset] 1 Dataset Load Done.
{u'relation_file': u'./data/relation_train.txt', u'vocab_size': 3484, u'embed_size': 10, u'target_mode': u'ranking', u'input_type': u'Triletter_PairGenerator', u'text1_corpus': u'./data/corpus_preprocessed.txt', u'batch_size': 100, u'batch_per_iter': 5, u'text2_corpus': u'./data/corpus_preprocessed.txt', u'word_triletter_map_file': u'./data/word_triletter_map.txt', u'dtype': u'dssm', u'phase': u'TRAIN', 'embed': array([[-0.18291523, -0.00574826, -0.13887608, ..., -0.13666791,
         0.00907838,  0.13784599],
       [ 0.03368587,  0.13503729,  0.00107509, ...,  0.18584302,
         0.03414046, -0.14042418],
       [ 0.03610065,  0.19066425,  0.11800677, ...,  0.14983599,
        -0.09182639, -0.0633784 ],
       ..., 
       [ 0.1179866 , -0.19746014,  0.08622313, ..., -0.02868197,
        -0.07183626,  0.06968395],
       [-0.02044802,  0.17994043, -0.0810562 , ...,  0.03050527,
         0.03873055, -0.14228183],
       [ 0.04971068,  0.16548306,  0.08958763, ...,  0.0537957 ,
         0.04853643,  0.09921838]], dtype=float32), u'query_per_iter': 3, u'use_iter': True}
[./data/relation_train.txt]
        Instance size: 32953
[Triletter_PairGenerator] init done
{u'relation_file': u'./data/relation_test.txt', u'vocab_size': 3484, u'embed_size': 10, u'target_mode': u'ranking', u'input_type': u'Triletter_ListGenerator', u'batch_list': 10, u'text1_corpus': u'./data/corpus_preprocessed.txt', u'text2_corpus': u'./data/corpus_preprocessed.txt', u'word_triletter_map_file': u'./data/word_triletter_map.txt', u'dtype': u'dssm', u'phase': u'EVAL', 'embed': array([[-0.18291523, -0.00574826, -0.13887608, ..., -0.13666791,
         0.00907838,  0.13784599],
       [ 0.03368587,  0.13503729,  0.00107509, ...,  0.18584302,
         0.03414046, -0.14042418],
       [ 0.03610065,  0.19066425,  0.11800677, ...,  0.14983599,
        -0.09182639, -0.0633784 ],
       ..., 
       [ 0.1179866 , -0.19746014,  0.08622313, ..., -0.02868197,
        -0.07183626,  0.06968395],
       [-0.02044802,  0.17994043, -0.0810562 , ...,  0.03050527,
         0.03873055, -0.14228183],
       [ 0.04971068,  0.16548306,  0.08958763, ...,  0.0537957 ,
         0.04853643,  0.09921838]], dtype=float32)}
[./data/relation_test.txt]
        Instance size: 25535
List Instance Count: 1445
[Triletter_ListGenerator] init done
{u'relation_file': u'./data/relation_valid.txt', u'vocab_size': 3484, u'embed_size': 10, u'target_mode': u'ranking', u'input_type': u'Triletter_ListGenerator', u'batch_list': 10, u'text1_corpus': u'./data/corpus_preprocessed.txt', u'text2_corpus': u'./data/corpus_preprocessed.txt', u'word_triletter_map_file': u'./data/word_triletter_map.txt', u'dtype': u'dssm', u'phase': u'EVAL', 'embed': array([[-0.18291523, -0.00574826, -0.13887608, ..., -0.13666791,
         0.00907838,  0.13784599],
       [ 0.03368587,  0.13503729,  0.00107509, ...,  0.18584302,
         0.03414046, -0.14042418],
       [ 0.03610065,  0.19066425,  0.11800677, ...,  0.14983599,
        -0.09182639, -0.0633784 ],
       ..., 
       [ 0.1179866 , -0.19746014,  0.08622313, ..., -0.02868197,
        -0.07183626,  0.06968395],
       [-0.02044802,  0.17994043, -0.0810562 , ...,  0.03050527,
         0.03873055, -0.14228183],
       [ 0.04971068,  0.16548306,  0.08958763, ...,  0.0537957 ,
         0.04853643,  0.09921838]], dtype=float32)}
[./data/relation_valid.txt]
        Instance size: 24919
List Instance Count: 1443
[Triletter_ListGenerator] init done
[DSSM] init done
[layer]: Input  [shape]: [None, 3484] 
 [Memory] Total Memory Use: 249.0977 MB          Resident: 261197824 Shared: 0 UnshareData: 0 UnshareStack: 0 
[layer]: Input  [shape]: [None, 3484] 
 [Memory] Total Memory Use: 249.1133 MB          Resident: 261214208 Shared: 0 UnshareData: 0 UnshareStack: 0 
[layer]: MLP    [shape]: [None, 30] 
 [Memory] Total Memory Use: 250.2773 MB          Resident: 262434816 Shared: 0 UnshareData: 0 UnshareStack: 0 
[layer]: MLP    [shape]: [None, 30] 
 [Memory] Total Memory Use: 250.5195 MB          Resident: 262688768 Shared: 0 UnshareData: 0 UnshareStack: 0 
[layer]: Dot    [shape]: [None, 1] 
 [Memory] Total Memory Use: 250.6992 MB          Resident: 262877184 Shared: 0 UnshareData: 0 UnshareStack: 0 
[Model] Model Compile Done.
Segmentation fault: 11
uduse commented 6 years ago

What's your operating system, and how much memory do you have?

levyfan commented 6 years ago

macOS, 8 GB RAM.

uduse commented 6 years ago

Can you consistently reproduce the problem? Also, check your memory usage while running the script; you may be running out of memory.

levyfan commented 6 years ago

I observed that memory usage grows by less than 1 GB (it rises from about 3 GB to under 4 GB). The segmentation fault can be reproduced consistently.

levyfan commented 6 years ago

The generated dataset is attached below. The ranking config is the same as toy_example/config/dssm_ranking.config (except for the data paths).

corpus.txt corpus_preprocessed.txt relation_test.txt relation_train.txt relation_valid.txt triletter_dict.txt word_dict.txt word_stats.txt word_triletter_map.txt

uduse commented 6 years ago

Try this and see if you can hunt down the exact line that's causing the problem.
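
What "this" originally linked to isn't preserved here. As one hedged possibility, a fault handler can point at the Python line that triggers the segfault. A minimal sketch, assuming Python 2.7 with the third-party faulthandler backport installed from PyPI (on Python 3 it is in the standard library):

import faulthandler

# On SIGSEGV, print the current Python traceback to stderr, so the failing
# line in matchzoo/main.py (or the generator feeding it) is visible even
# though the crash itself happens in native code.
faulthandler.enable()

Placing these two lines near the top of matchzoo/main.py, before training starts, is enough.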

levyfan commented 6 years ago
Process 5098 stopped
* thread #2, stop reason = EXC_BAD_ACCESS (code=1, address=0x11a5ab35c)
    frame #0: 0x000000010ee628da _sparsetools.so`csr_todense_thunk(int, int, void**) + 2282
_sparsetools.so`csr_todense_thunk:
->  0x10ee628da <+2282>: addss  (%rdx,%rsi,4), %xmm0
    0x10ee628df <+2287>: movss  %xmm0, (%rdx,%rsi,4)
    0x10ee628e4 <+2292>: addq   $0x8, %rcx
    0x10ee628e8 <+2296>: addq   $0x8, %rbx
Target 0: (python) stopped.
(lldb) bt
* thread #2, stop reason = EXC_BAD_ACCESS (code=1, address=0x11a5ab35c)
  * frame #0: 0x000000010ee628da _sparsetools.so`csr_todense_thunk(int, int, void**) + 2282
    frame #1: 0x000000010ee5c418 _sparsetools.so`call_thunk(char, char const*, long (*)(int, int, void**), _object*) + 2456
    frame #2: 0x00007fff4f735f89 Python`PyEval_EvalFrameEx + 2917
    frame #3: 0x00007fff4f735232 Python`PyEval_EvalCodeEx + 1551
    frame #4: 0x00007fff4f73b2b4 Python`___lldb_unnamed_symbol1476$$Python + 290
    frame #5: 0x00007fff4f735b45 Python`PyEval_EvalFrameEx + 1825
    frame #6: 0x00007fff4f6d40df Python`___lldb_unnamed_symbol419$$Python + 182
    frame #7: 0x00007fff4f732b00 Python`___lldb_unnamed_symbol1446$$Python + 140
    frame #8: 0x00007fff4f735f89 Python`PyEval_EvalFrameEx + 2917
    frame #9: 0x00007fff4f735232 Python`PyEval_EvalCodeEx + 1551
    frame #10: 0x00007fff4f73b2b4 Python`___lldb_unnamed_symbol1476$$Python + 290
    frame #11: 0x00007fff4f735b45 Python`PyEval_EvalFrameEx + 1825
    frame #12: 0x00007fff4f6d40df Python`___lldb_unnamed_symbol419$$Python + 182
    frame #13: 0x00007fff4f732b00 Python`___lldb_unnamed_symbol1446$$Python + 140
    frame #14: 0x00007fff4f735f89 Python`PyEval_EvalFrameEx + 2917
    frame #15: 0x00007fff4f735232 Python`PyEval_EvalCodeEx + 1551
    frame #16: 0x00007fff4f6dc935 Python`___lldb_unnamed_symbol510$$Python + 327
    frame #17: 0x00007fff4f6bf581 Python`PyObject_Call + 97
    frame #18: 0x00007fff4f738f2a Python`PyEval_EvalFrameEx + 15110
    frame #19: 0x00007fff4f73b256 Python`___lldb_unnamed_symbol1476$$Python + 196
    frame #20: 0x00007fff4f735b45 Python`PyEval_EvalFrameEx + 1825
    frame #21: 0x00007fff4f73b256 Python`___lldb_unnamed_symbol1476$$Python + 196
    frame #22: 0x00007fff4f735b45 Python`PyEval_EvalFrameEx + 1825
    frame #23: 0x00007fff4f735232 Python`PyEval_EvalCodeEx + 1551
    frame #24: 0x00007fff4f6dc935 Python`___lldb_unnamed_symbol510$$Python + 327
    frame #25: 0x00007fff4f6bf581 Python`PyObject_Call + 97
    frame #26: 0x00007fff4f6c9c9e Python`___lldb_unnamed_symbol192$$Python + 163
    frame #27: 0x00007fff4f6bf581 Python`PyObject_Call + 97
    frame #28: 0x00007fff4f73abfe Python`PyEval_CallObjectWithKeywords + 159
    frame #29: 0x00007fff4f766afb Python`___lldb_unnamed_symbol1725$$Python + 70
    frame #30: 0x00007fff6cd596c1 libsystem_pthread.dylib`_pthread_body + 340
    frame #31: 0x00007fff6cd5956d libsystem_pthread.dylib`_pthread_start + 377
    frame #32: 0x00007fff6cd58c5d libsystem_pthread.dylib`thread_start + 13
levyfan commented 6 years ago

It crashes at main.py, line 146: history = model.fit_generator(...).

levyfan commented 6 years ago

I ran the code on a Linux machine and the exception is:

[Model] Model Compile Done.
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/local/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/local/lib/python2.7/site-packages/keras/utils/data_utils.py", line 568, in data_generator_task
    generator_output = next(self._generator)
  File "/home/fanliwen/MatchZoo/matchzoo/inputs/pair_generator.py", line 283, in get_batch_generator
    X1, X1_len, X2, X2_len, Y = self.get_batch()
  File "/home/fanliwen/MatchZoo/matchzoo/inputs/pair_generator.py", line 83, in get_batch
    return next(self.batch_iter)
  File "/home/fanliwen/MatchZoo/matchzoo/inputs/pair_generator.py", line 276, in get_batch_iter
    yield self.transfer_feat2sparse(X1).toarray(), X1_len, self.transfer_feat2sparse(X2).toarray(), X2_len, Y
  File "/usr/local/lib/python2.7/site-packages/scipy/sparse/compressed.py", line 964, in toarray
    return self.tocoo(copy=False).toarray(order=order, out=out)
  File "/usr/local/lib/python2.7/site-packages/scipy/sparse/compressed.py", line 958, in tocoo
    dtype=self.dtype)
  File "/usr/local/lib/python2.7/site-packages/scipy/sparse/coo.py", line 184, in __init__
    self._check()
  File "/usr/local/lib/python2.7/site-packages/scipy/sparse/coo.py", line 232, in _check
    raise ValueError('column index exceeds matrix dimensions')
ValueError: column index exceeds matrix dimensions

[01-09-2018 19:47:26]   [Train:train] Traceback (most recent call last):
  File "matchzoo/main.py", line 328, in <module>
    main(sys.argv)
  File "matchzoo/main.py", line 320, in main
    train(config)
  File "matchzoo/main.py", line 151, in train
    verbose = 0
  File "/usr/local/lib/python2.7/site-packages/keras/legacy/interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/keras/engine/training.py", line 2015, in fit_generator
    generator_output = next(output_generator)
StopIteration
uduse commented 6 years ago

@levyfan It looks like some kind of out-of-bounds problem when indexing matrices, but I'm not familiar with the MatchZoo iterators. Maybe @faneshion can help.
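
To make "out-of-bounds" concrete, here is a minimal, self-contained sketch using only scipy; the sizes and indices are illustrative, not taken from MatchZoo. A CSR matrix whose column indices reach its declared width is accepted at construction time, and the problem only shows up when it is densified, which is exactly the call in pair_generator.py line 276:

import numpy as np
from scipy.sparse import csr_matrix

vocab_size = 3484                       # matrix width, as in the DSSM config
data = np.ones(2, dtype=np.float32)
indices = np.array([10, 3484])          # 3484 >= vocab_size: out of bounds
indptr = np.array([0, 2])

# No full index-bounds check happens here, so construction succeeds.
m = csr_matrix((data, indices, indptr), shape=(1, vocab_size))

# Densifying is where it goes wrong. Depending on the scipy version this
# raises "ValueError: column index exceeds matrix dimensions" (the Linux
# traceback, via tocoo) or performs an out-of-bounds write in csr_todense
# (consistent with the macOS segfault).
m.toarray()

In MatchZoo terms, that would mean some tri-letter id in the generated data is not smaller than the configured vocab_size.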

SeekPoint commented 6 years ago

My solution:

Set vocab_size in the config to one more than your original setting.
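
A quick way to check whether vocab_size is the culprit before retraining. This is a sketch under an assumption about the file format: each line of word_triletter_map.txt is taken to be a word id followed by that word's tri-letter ids, whitespace-separated.

# Hedged sanity check: find the largest tri-letter id actually used and
# compare it with the configured vocab_size (3484 in the config above).
max_id = -1
with open('./data/word_triletter_map.txt') as f:
    for line in f:
        for tok in line.split()[1:]:    # skip the leading word id
            max_id = max(max_id, int(tok))

print('largest tri-letter id:', max_id)
print('vocab_size must be at least:', max_id + 1)

If the largest id equals the current vocab_size, the tri-letter one-hot vectors are one column too narrow, which matches the off-by-one fix above.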

thiziri commented 6 years ago

Hello, I'm getting the same error while running with the AP88 TREC dataset. I'm trying to run MatchZoo models on TREC datasets on a SLURM server, and I get the same problem with ARC_I, ARC_II, CDSSM, DRMM_TKS, etc., even though the allocated memory is not fully used. Here is what is printed on the screen:

{
  "model": {
    "model_py": "arci.ARCI",
    "model_path": "matchzoo/models/",
    "setting": {
      "dropout_rate": 0.5,
      "kernel_count": 8,
      "kernel_size": 3,
      "d_pool_size": 2,
      "q_pool_size": 2
    }
  },
  "losses": [
    {
      "object_params": {
        "margin": 0.5
      },
      "object_name": "rank_hinge_loss"
    }
  ],
  "global": {
    "model_type": "PY",
    "learning_rate": 0.0001,
    "optimizer": "adam",
    "weights_file": "/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/weights/arci_ranking.weights",
    "num_iters": 10,
    "save_weights_iters": 10,
    "display_interval": 10,
    "test_weights_iters": 10
  },
  "metrics": [
    "precision@10",
    "ndcg@10",
    "ndcg@20",
    "map"
  ],
  "net_name": "ARCI",
  "outputs": {
    "predict": {
      "save_format": "TREC",
      "save_path": "/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/predictions/from_qrels/predict.test.arci_ranking.txt"
    }
  },
  "inputs": {
    "share": {
      "train_embed": false,
      "text1_corpus": "/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/corpus_preprocessed.txt",
      "use_dpool": false,
      "embed_size": 300,
      "text2_maxlen": 1000,
      "text2_corpus": "/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/corpus_preprocessed.txt",
      "vocab_size": 129897,
      "target_mode": "ranking",
      "text1_maxlen": 20
    },
    "train": {
      "use_iter": false,
      "batch_size": 100,
      "query_per_iter": 50,
      "batch_per_iter": 5,
      "input_type": "PairGenerator",
      "relation_file": "/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/relation_train.txt",
      "phase": "TRAIN"
    },
    "predict": {
      "batch_list": 10,
      "relation_file": "/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/relation_test.txt",
      "input_type": "ListGenerator",
      "phase": "PREDICT"
    },
    "test": {
      "batch_list": 10,
      "relation_file": "/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/relation_test.txt",
      "input_type": "ListGenerator",
      "phase": "EVAL"
    },
    "valid": {
      "batch_list": 10,
      "relation_file": "/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/relation_train.txt",
      "input_type": "ListGenerator",
      "phase": "EVAL"
    }
  }
}
[Embedding] Embedding Load Done.
[Input] Process Input Tags. odict_keys(['train']) in TRAIN, odict_keys(['test', 'valid']) in EVAL.
[/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/corpus_preprocessed.txt]
    Data size: 79969
[Dataset] 1 Dataset Load Done.
{'text1_corpus': '/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/corpus_preprocessed.txt', 'use_dpool': False, 'batch_size': 100, 'embed_size': 300, 'text2_maxlen': 1000, 'input_type': 'PairGenerator', 'relation_file': '/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/relation_train.txt', 'embed': array([[-0.18291523, -0.00574826, -0.13887608, ...,  0.04232861,
         0.16873358, -0.1632563 ],
       [ 0.04360746,  0.02268181,  0.13736159, ..., -0.04956975,
        -0.18725845, -0.19015439],
       [-0.07373005, -0.04657853,  0.0677646 , ...,  0.00168478,
         0.03469655,  0.12419996],
       ...,
       [-0.04969991, -0.00968194, -0.1472602 , ..., -0.07864611,
         0.11010233,  0.15707028],
       [-0.169353  , -0.07957499, -0.00709578, ..., -0.07572405,
         0.06080896,  0.19945614],
       [ 0.16906822, -0.16493008,  0.07978389, ...,  0.00874102,
         0.05448175,  0.10033885]], dtype=float32), 'train_embed': False, 'use_iter': False, 'text1_maxlen': 20, 'batch_per_iter': 5, 'query_per_iter': 50, 'text2_corpus': '/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/corpus_preprocessed.txt', 'target_mode': 'ranking', 'vocab_size': 129897, 'phase': 'TRAIN'}
[/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/relation_train.txt]
    Instance size: 3196760
Pair Instance Count: 85144090
[PairGenerator] init done
{'text1_corpus': '/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/corpus_preprocessed.txt', 'input_type': 'ListGenerator', 'use_dpool': False, 'embed_size': 300, 'text2_maxlen': 1000, 'batch_list': 10, 'relation_file': '/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/relation_test.txt', 'embed': array([[-0.18291523, -0.00574826, -0.13887608, ...,  0.04232861,
         0.16873358, -0.1632563 ],
       [ 0.04360746,  0.02268181,  0.13736159, ..., -0.04956975,
        -0.18725845, -0.19015439],
       [-0.07373005, -0.04657853,  0.0677646 , ...,  0.00168478,
         0.03469655,  0.12419996],
       ...,
       [-0.04969991, -0.00968194, -0.1472602 , ..., -0.07864611,
         0.11010233,  0.15707028],
       [-0.169353  , -0.07957499, -0.00709578, ..., -0.07572405,
         0.06080896,  0.19945614],
       [ 0.16906822, -0.16493008,  0.07978389, ...,  0.00874102,
         0.05448175,  0.10033885]], dtype=float32), 'train_embed': False, 'text1_maxlen': 20, 'text2_corpus': '/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/corpus_preprocessed.txt', 'target_mode': 'ranking', 'vocab_size': 129897, 'phase': 'EVAL'}
[/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/relation_test.txt]
    Instance size: 399595
List Instance Count: 50
[ListGenerator] init done
{'text1_corpus': '/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/corpus_preprocessed.txt', 'input_type': 'ListGenerator', 'use_dpool': False, 'embed_size': 300, 'text2_maxlen': 1000, 'batch_list': 10, 'relation_file': '/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/relation_train.txt', 'embed': array([[-0.18291523, -0.00574826, -0.13887608, ...,  0.04232861,
         0.16873358, -0.1632563 ],
       [ 0.04360746,  0.02268181,  0.13736159, ..., -0.04956975,
        -0.18725845, -0.19015439],
       [-0.07373005, -0.04657853,  0.0677646 , ...,  0.00168478,
         0.03469655,  0.12419996],
       ...,
       [-0.04969991, -0.00968194, -0.1472602 , ..., -0.07864611,
         0.11010233,  0.15707028],
       [-0.169353  , -0.07957499, -0.00709578, ..., -0.07572405,
         0.06080896,  0.19945614],
       [ 0.16906822, -0.16493008,  0.07978389, ...,  0.00874102,
         0.05448175,  0.10033885]], dtype=float32), 'train_embed': False, 'text1_maxlen': 20, 'text2_corpus': '/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/corpus_preprocessed.txt', 'target_mode': 'ranking', 'vocab_size': 129897, 'phase': 'EVAL'}
[/projets/iris/PROJETS/WEIR/code/2ndYear/MatchZoo/my_tests/custom_test/data/AP88/from_qrels/relation_train.txt]
    Instance size: 3196760
List Instance Count: 50
[ListGenerator] init done
[ARCI] init done
[layer]: Input  [shape]: [None, 20] 
 [Memory] Total Memory Use: 9417.4688 MB    Resident: 9643488 Shared: 0 UnshareData: 0 UnshareStack: 0 
[layer]: Input  [shape]: [None, 1000] 
 [Memory] Total Memory Use: 9417.4688 MB    Resident: 9643488 Shared: 0 UnshareData: 0 UnshareStack: 0 
[layer]: Embedding  [shape]: [None, 20, 300] 
 [Memory] Total Memory Use: 10011.7578 MB   Resident: 10252040 Shared: 0 UnshareData: 0 UnshareStack: 0 
[layer]: Embedding  [shape]: [None, 1000, 300] 
 [Memory] Total Memory Use: 10011.7578 MB   Resident: 10252040 Shared: 0 UnshareData: 0 UnshareStack: 0 
[layer]: Conv1D [shape]: [None, 20, 8] 
 [Memory] Total Memory Use: 10011.7578 MB   Resident: 10252040 Shared: 0 UnshareData: 0 UnshareStack: 0 
[layer]: Conv1D [shape]: [None, 1000, 8] 
 [Memory] Total Memory Use: 10011.7578 MB   Resident: 10252040 Shared: 0 UnshareData: 0 UnshareStack: 0 
srun: error: 64cpu-nc01: task 0: Segmentation fault
geekan commented 6 years ago

@loveJasmine it works!

bwanglzu commented 6 years ago

If you're working on DSSM, please try out our newly released MatchZoo 2.0 here.