THUNLP-MT / THUMT

An open-source neural machine translation toolkit developed by Tsinghua Natural Language Processing Group
BSD 3-Clause "New" or "Revised" License

Is it possible to use your THUMT to take part in a translation competition? #5

Closed xudekuan closed 6 years ago

xudekuan commented 7 years ago

Is it possible to use your THUMT to take part in a translation competition?

Glaceon31 commented 7 years ago

Of course! Hope you get a good result :-)

xudekuan commented 7 years ago

Thank you very much! When we ran it on a GPU, errors occurred:

```
Using gpu device 0: TITAN Xp (CNMeM is enabled with initial size: 90.0% of memory, cuDNN 6021)
/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/__init__.py:600: UserWarning: Your cuDNN version is more recent than the one Theano officially supports. If you see any problems, try updating Theano or downgrading cuDNN to version 5.
  warnings.warn(warn)
[13 Sep 15:57:39 INFO] =====config=====
[13 Sep 15:57:39 INFO] "maxout": 2
[13 Sep 15:57:39 INFO] "index_unk_trg": 1
[13 Sep 15:57:39 INFO] "num_vocab_src": 30001
[13 Sep 15:57:39 INFO] "try_iter": 100000
[13 Sep 15:57:39 INFO] "clip": 1.0
[13 Sep 15:57:39 INFO] "sort_batches": 20
[13 Sep 15:57:39 INFO] "trg_mono_shuf":
[13 Sep 15:57:39 INFO] "batchsize": 80
[13 Sep 15:57:39 INFO] "sampleN": 100
[13 Sep 15:57:39 INFO] "src_text": data/guwentrain.src
[13 Sep 15:57:39 INFO] "index_eos_src": 30000
[13 Sep 15:57:39 INFO] "trg_mono":
[13 Sep 15:57:39 INFO] "auto_lambda_1": 1
[13 Sep 15:57:39 INFO] "MRT_alpha": 0.005
[13 Sep 15:57:39 INFO] "dim_emb_trg": 620
[13 Sep 15:57:39 INFO] "beta2_adam": 0.999
[13 Sep 15:57:39 INFO] "test_ref": []
[13 Sep 15:57:39 INFO] "auto_lambda_2": 10
[13 Sep 15:57:39 INFO] "trg_text": data/guwentrain.trg
[13 Sep 15:57:39 INFO] "checkpoint_model": checkpoint_model.npz
[13 Sep 15:57:39 INFO] "sample_length": 50
[13 Sep 15:57:39 INFO] "semi_sampleN": 10
[13 Sep 15:57:39 INFO] "alphadecay_adam": 0.998
[13 Sep 15:57:39 INFO] "save_freq": 2000
[13 Sep 15:57:39 INFO] "reconstruct_lambda": 0.1
[13 Sep 15:57:39 INFO] "n_samples": 1
[13 Sep 15:57:39 INFO] "lr": 1.0
[13 Sep 15:57:39 INFO] "save_path": models
[13 Sep 15:57:39 INFO] "dim_rec_enc": 1000
[13 Sep 15:57:39 INFO] "eps_adam": 1e-08
[13 Sep 15:57:39 INFO] "save": True
[13 Sep 15:57:39 INFO] "src_mono":
[13 Sep 15:57:39 INFO] "data_corpus": json
[13 Sep 15:57:39 INFO] "src_mono_shuf":
[13 Sep 15:57:39 INFO] "alpha_adam": 0.0005
[13 Sep 15:57:39 INFO] "valid_dir": validation
[13 Sep 15:57:39 INFO] "optimizer": adam_slowstart
[13 Sep 15:57:39 INFO] "sample_sentence":
[13 Sep 15:57:39 INFO] "dim_emb_src": 620
[13 Sep 15:57:39 INFO] "MRT": False
[13 Sep 15:57:39 INFO] "epsilon": 1e-06
[13 Sep 15:57:39 INFO] "max_iter": 1000000
[13 Sep 15:57:39 INFO] "data_vocab": cPickle
[13 Sep 15:57:39 INFO] "index_unk_src": 1
[13 Sep 15:57:39 INFO] "valid_src": data/guwenvalid.src
[13 Sep 15:57:39 INFO] "src_shuf": corpus/train.zh.json.shuf
[13 Sep 15:57:39 INFO] "trg_shuf": corpus/train.en.json.shuf
[13 Sep 15:57:39 INFO] "LenRatio": 1.5
[13 Sep 15:57:39 INFO] "rho": 0.95
[13 Sep 15:57:39 INFO] "ivocab_src": corpus/ivocab.zh.pkl
[13 Sep 15:57:39 INFO] "checkpoint_freq": 2000
[13 Sep 15:57:39 INFO] "sample_times": 1
[13 Sep 15:57:39 INFO] "trg_mono_text":
[13 Sep 15:57:39 INFO] "dim_rec_dec": 1000
[13 Sep 15:57:39 INFO] "src": corpus/train.zh.json
[13 Sep 15:57:39 INFO] "src_mono_text":
[13 Sep 15:57:39 INFO] "test_src": []
[13 Sep 15:57:39 INFO] "beta1_adam": 0.9
[13 Sep 15:57:39 INFO] "semi_learning": False
[13 Sep 15:57:39 INFO] "index_eos_trg": 30000
[13 Sep 15:57:39 INFO] "sample_freq": 100
[13 Sep 15:57:39 INFO] "valid_ref": data/guwenvalid.trg
[13 Sep 15:57:39 INFO] "verbose_level": info
[13 Sep 15:57:39 INFO] "num_vocab_trg": 30001
[13 Sep 15:57:39 INFO] "trg": corpus/train.en.json
[13 Sep 15:57:39 INFO] "ivocab_trg": corpus/ivocab.en.pkl
[13 Sep 15:57:39 INFO] "test_dir": eval
[13 Sep 15:57:39 INFO] "beam_size": 10
[13 Sep 15:57:39 INFO] "checkpoint_status": checkpoint_status.pkl
[13 Sep 15:57:39 INFO] "vocab_trg": corpus/vocab.en.pkl
[13 Sep 15:57:39 INFO] "init_model":
[13 Sep 15:57:39 INFO] "maxlength": 50
[13 Sep 15:57:39 INFO] "model": RNNsearch
[13 Sep 15:57:39 INFO] "vocab_src": corpus/vocab.zh.pkl
[13 Sep 15:57:39 INFO] "sample_num": 10
[13 Sep 15:57:39 INFO] STEP 2: Training
[13 Sep 15:57:39 INFO] STEP 2.1: Loading training data
[13 Sep 15:57:39 INFO] total 12 sentences
[13 Sep 15:57:39 INFO] Discarding long sentences. 12 sentences left.
[13 Sep 15:57:39 INFO] Done!
[13 Sep 15:57:39 INFO] STEP 2.2: Building model
[13 Sep 15:57:39 INFO] Initializing layers
[13 Sep 15:57:45 INFO] Building computational graph
[13 Sep 15:57:45 INFO] Done!
```

```
[13 Sep 15:57:45 INFO] STEP 2.3: Building optimizer
```

While building the optimizer, Theano compiled the auto-generated CUDA source (`mod.cu`) for the `GpuDnnSoftmaxGrad` op and echoed the entire listing. The part that matters is the helper defined at mod.cu(60), whose call at mod.cu(67) uses a symbol that does not exist in cuDNN 6; the rest of the listing (auto-generated descriptor setup, Python reference counting, and module boilerplate) is omitted here:

```c
static int
c_set_filterNd(CudaNdarray *var, cudnnFilterDescriptor_t desc) {
    if (!CudaNdarray_is_c_contiguous(var)) {
        PyErr_SetString(PyExc_ValueError,
                        "Only contiguous filters (kernels) are supported.");
        return -1;
    }
    int dim = CudaNdarray_NDIM(var);
    cudnnStatus_t err = cudnnSetFilterNdDescriptor_v4(desc,   /* mod.cu(67) */
                                                      CUDNN_DATA_FLOAT,
                                                      CUDNN_TENSOR_NCHW,
                                                      dim,
                                                      CudaNdarray_HOST_DIMS(var));
    if (err != CUDNN_STATUS_SUCCESS) {
        PyErr_Format(PyExc_RuntimeError,
                     "Could not set filter descriptor: %s."
                     " dims= %d",
                     cudnnGetErrorString(err), dim);
        return -1;
    }
    return 0;
}
```

```
mod.cu(67): error: identifier "cudnnSetFilterNdDescriptor_v4" is undefined
mod.cu(16): warning: function "c_set_tensorNd" was declared but never referenced
mod.cu(60): warning: function "c_set_filterNd" was declared but never referenced
1 error detected in the compilation of "/tmp/tmpxft_00000936_00000000-9_mod.cpp1.ii".
```

```
['/usr/local/cuda-8.0/bin/nvcc', '-shared', '-O3', '-Xlinker', '-rpath,/usr/local/cuda-8.0/lib64', '-use_fast_math', '-arch=sm_61', '-m64', '-Xcompiler', '-fno-math-errno,-Wno-unused-label,-Wno-unused-variable,-Wno-write-strings,-DCUDA_NDARRAY_CUH=c72d035fdf91890f3b36710688069b2e,-DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION,-fPIC,-fvisibility=hidden', '-Xlinker', '-rpath,/home/dekuanxu/.theano/compiledir_Linux-4.8--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.12-64/cuda_ndarray', '-I/home/dekuanxu/.theano/compiledir_Linux-4.8--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.12-64/cuda_ndarray', '-I/usr/local/cuda-8.0/include', '-I/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda', '-I/home/dekuanxu/.local/lib/python2.7/site-packages/numpy/core/include', '-I/usr/include/python2.7', '-I/usr/local/lib/python2.7/dist-packages/theano/gof', '-o', '/home/dekuanxu/.theano/compiledir_Linux-4.8--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.12-64/tmpaVEUXH/6c2407cd47903371a6adb2201001b071.so', 'mod.cu', '-L/home/dekuanxu/.theano/compiledir_Linux-4.8--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.12-64/cuda_ndarray', '-L/usr/lib', '-lcudart', '-lcublas', '-lcuda_ndarray', '-lcudnn', '-lpython2.7']
Traceback (most recent call last):
  File "data/thumt/train.py", line 69, in <module>
    update_grads, update_params = trainer.build(model.cost, model.inputs)
  File "/home/dekuanxu/THUMT/data/thumt/optimizer.py", line 97, in build
    update_grads = theano.function(inp, [cost, grad_norm], updates=update_gc + m_up + v_up)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function.py", line 320, in function
    output_keys=output_keys)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/pfunc.py", line 479, in pfunc
    output_keys=output_keys)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 1777, in orig_function
    defaults)
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 1641, in create
    input_storage=input_storage_lists, storage_map=storage_map)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 690, in make_thunk
    storage_map=storage_map)[:3]
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/vm.py", line 1003, in make_all
    no_recycling))
  File "/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/__init__.py", line 256, in make_thunk
    compute_map, no_recycling)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/op.py", line 970, in make_thunk
    no_recycling)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/op.py", line 879, in make_c_thunk
    output_storage=node_output_storage)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/cc.py", line 1200, in make_thunk
    keep_lock=keep_lock)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/cc.py", line 1143, in __compile__
    keep_lock=keep_lock)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/cc.py", line 1595, in cthunk_factory
    key=key, lnk=self, keep_lock=keep_lock)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/cmodule.py", line 1142, in module_from_key
    module = lnk.compile_cmodule(location)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/cc.py", line 1506, in compile_cmodule
    preargs=preargs)
  File "/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/nvcc_compiler.py", line 399, in compile_str
    'for cmd', ' '.join(cmd))
Exception: ('The following error happened while compiling the node', GpuDnnSoftmaxGrad{tensor_format='bc01', mode='channel', algo='accurate'}(GpuContiguous.0, GpuContiguous.0), '\n', 'nvcc return status', 2, 'for cmd', '/usr/local/cuda-8.0/bin/nvcc -shared -O3 -Xlinker -rpath,/usr/local/cuda-8.0/lib64 -use_fast_math -arch=sm_61 -m64 -Xcompiler -fno-math-errno,-Wno-unused-label,-Wno-unused-variable,-Wno-write-strings,-DCUDA_NDARRAY_CUH=c72d035fdf91890f3b36710688069b2e,-DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION,-fPIC,-fvisibility=hidden -Xlinker -rpath,/home/dekuanxu/.theano/compiledir_Linux-4.8--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.12-64/cuda_ndarray -I/home/dekuanxu/.theano/compiledir_Linux-4.8--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.12-64/cuda_ndarray -I/usr/local/cuda-8.0/include -I/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda -I/home/dekuanxu/.local/lib/python2.7/site-packages/numpy/core/include -I/usr/include/python2.7 -I/usr/local/lib/python2.7/dist-packages/theano/gof -o /home/dekuanxu/.theano/compiledir_Linux-4.8--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.12-64/tmpaVEUXH/6c2407cd47903371a6adb2201001b071.so mod.cu -L/home/dekuanxu/.theano/compiledir_Linux-4.8--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.12-64/cuda_ndarray -L/usr/lib -lcudart -lcublas -lcuda_ndarray -lcudnn -lpython2.7', "[GpuDnnSoftmaxGrad{tensor_format='bc01', mode='channel', algo='accurate'}(<CudaNdarrayType(float32, (False, False, True, True))>, <CudaNdarrayType(float32, (False, False, True, True))>)]")
The training started at 2017-09-13 15:57:32 and ended at 2017-09-13 15:58:28. The total training time is 0.02 hour(s).
xunliangpu.sh: 3: xunliangpu.sh: pause: not found
```

xudekuan commented 7 years ago

We used CUDA 8.

Glaceon31 commented 7 years ago

CUDA 8.0 is OK. It seems the cuDNN version may be causing the problem. Can you tell us your Theano version? We need to reproduce the error in order to fix it.
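As an aside, the `cuDNN 6021` in the log above is the packed `CUDNN_VERSION` integer (major*1000 + minor*100 + patch), i.e. cuDNN 6.0.21 rather than cuDNN 5. A quick sketch of the decoding:

```python
def decode_cudnn_version(v):
    """Decode cuDNN's packed CUDNN_VERSION integer (major*1000 + minor*100 + patch)."""
    return v // 1000, (v % 1000) // 100, v % 100

print(decode_cudnn_version(6021))  # (6, 0, 21) -> cuDNN 6.0.21, as in the log
```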

xudekuan commented 7 years ago

Theano is 0.8.2, installed following the instructions in Section 2.2, page 2 of your manual. Should I remove cuDNN 6021 and install cuDNN 5 to solve the problem? Our system runs on a shared server, and TensorFlow needs cuDNN 6021, so removing it may cause problems.
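A side-by-side install may avoid touching the system-wide cuDNN 6 that TensorFlow uses: cuDNN is just headers and shared libraries, so a second copy can live in its own directory and be put on the compiler and loader search paths for the Theano job only. A sketch, assuming a cuDNN 5 archive unpacked under `/opt/cudnn5` (hypothetical path; adjust to your layout):

```shell
# Hypothetical layout: private cuDNN 5 copy under /opt/cudnn5.
# These exports affect only the current shell, leaving the
# system-wide cuDNN 6 (used by TensorFlow) untouched.
export CPATH=/opt/cudnn5/include:$CPATH             # compiler header search
export LIBRARY_PATH=/opt/cudnn5/lib64:$LIBRARY_PATH # link-time library search
export LD_LIBRARY_PATH=/opt/cudnn5/lib64:$LD_LIBRARY_PATH  # runtime loading
echo "cuDNN search prefix: /opt/cudnn5"
```

Newer Theano versions also expose `dnn.include_path` and `dnn.library_path` Theano flags for the same purpose, though I am not certain they exist in 0.8.2.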

xudekuan commented 7 years ago

I asked the server administrator; he said the GPU reported an error, which may be a hardware problem.

```
NVIDIA-SMI 384.69                 Driver Version: 384.69
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN Xp            Off  | 00000000:02:00.0 Off |                  N/A |
| 23%   40C    P0    61W / 250W |      0MiB / 12189MiB |      0%      Default |
```

Glaceon31 commented 7 years ago

I cannot give a stable solution for now. Downgrading cuDNN to version 5 or upgrading Theano may work. I will leave this issue open until we or someone else finds a solution.