Closed TaoLv closed 4 years ago
INFO:gluonnlp:10:28:18 Namespace(accumulate=None, batch_size=12, bert_dataset='book_corpus_wiki_en_uncased', bert_model='bert_12_768_12', calib_mode='customize', comm_backend=None, debug=False, deploy=False, doc_stride=128, dtype='float32', epochs=2, gpu=True, log_interval=50, lr=3e-05, max_answer_length=30, max_query_length=64, max_seq_length=384, model_parameters=None, model_prefix=None, n_best_size=20, null_score_diff_threshold=0.0, num_calib_batches=10, only_calibration=False, only_predict=False, optimizer='adam', output_dir='./output_dir', pretrained_bert_parameters=None, quantized_dtype='auto', round_to=None, sentencepiece=None, test_batch_size=24, training_steps=None, uncased=True, version_2=False, warmup_ratio=0.1) INFO:gluonnlp:10:28:25 Loading train data... INFO:gluonnlp:10:28:26 Number of records in Train data:87599 INFO:gluonnlp:10:29:04 The number of examples after preprocessing:88641 Done! Transform dataset costs 37.85 seconds. INFO:gluonnlp:10:29:04 Start Training INFO:gluonnlp:10:29:19 Batch: 49/7387, Loss=5.7366, lr=0.0000010 Thoughput=41.72 samples/s INFO:gluonnlp:10:29:32 Batch: 99/7387, Loss=5.6931, lr=0.0000020 Thoughput=45.47 samples/s INFO:gluonnlp:10:29:44 Batch: 149/7387, Loss=5.4705, lr=0.0000030 Thoughput=48.23 samples/s INFO:gluonnlp:10:29:57 Batch: 199/7387, Loss=5.3252, lr=0.0000041 Thoughput=47.67 samples/s INFO:gluonnlp:10:30:10 Batch: 249/7387, Loss=5.0779, lr=0.0000051 Thoughput=46.84 samples/s INFO:gluonnlp:10:30:22 Batch: 299/7387, Loss=4.8238, lr=0.0000061 Thoughput=47.40 samples/s INFO:gluonnlp:10:30:35 Batch: 349/7387, Loss=4.5862, lr=0.0000071 Thoughput=46.46 samples/s INFO:gluonnlp:10:30:48 Batch: 399/7387, Loss=4.2767, lr=0.0000081 Thoughput=47.08 samples/s INFO:gluonnlp:10:31:01 Batch: 449/7387, Loss=4.1058, lr=0.0000091 Thoughput=46.85 samples/s INFO:gluonnlp:10:31:13 Batch: 499/7387, Loss=4.0107, lr=0.0000102 Thoughput=48.32 samples/s INFO:gluonnlp:10:31:26 Batch: 549/7387, Loss=3.8868, lr=0.0000112 Thoughput=48.08
samples/s INFO:gluonnlp:10:31:39 Batch: 599/7387, Loss=3.7927, lr=0.0000122 Thoughput=46.52 samples/s INFO:gluonnlp:10:31:51 Batch: 649/7387, Loss=3.6045, lr=0.0000132 Thoughput=48.48 samples/s INFO:gluonnlp:10:32:04 Batch: 699/7387, Loss=3.3329, lr=0.0000142 Thoughput=45.18 samples/s INFO:gluonnlp:10:32:18 Batch: 749/7387, Loss=3.1370, lr=0.0000152 Thoughput=43.95 samples/s INFO:gluonnlp:10:32:31 Batch: 799/7387, Loss=2.9287, lr=0.0000162 Thoughput=45.27 samples/s INFO:gluonnlp:10:32:45 Batch: 849/7387, Loss=2.9110, lr=0.0000173 Thoughput=44.93 samples/s INFO:gluonnlp:10:32:59 Batch: 899/7387, Loss=2.7632, lr=0.0000183 Thoughput=41.90 samples/s INFO:gluonnlp:10:33:12 Batch: 949/7387, Loss=2.7786, lr=0.0000193 Thoughput=45.55 samples/s INFO:gluonnlp:10:33:25 Batch: 999/7387, Loss=2.8667, lr=0.0000203 Thoughput=45.36 samples/s INFO:gluonnlp:10:33:39 Batch: 1049/7387, Loss=2.7649, lr=0.0000213 Thoughput=43.45 samples/s INFO:gluonnlp:10:33:52 Batch: 1099/7387, Loss=2.7784, lr=0.0000223 Thoughput=46.00 samples/s INFO:gluonnlp:10:34:06 Batch: 1149/7387, Loss=2.6492, lr=0.0000234 Thoughput=44.58 samples/s INFO:gluonnlp:10:34:19 Batch: 1199/7387, Loss=2.7466, lr=0.0000244 Thoughput=43.60 samples/s INFO:gluonnlp:10:34:33 Batch: 1249/7387, Loss=2.7283, lr=0.0000254 Thoughput=44.27 samples/s INFO:gluonnlp:10:34:47 Batch: 1299/7387, Loss=2.6654, lr=0.0000264 Thoughput=43.96 samples/s INFO:gluonnlp:10:35:00 Batch: 1349/7387, Loss=2.7920, lr=0.0000274 Thoughput=44.62 samples/s INFO:gluonnlp:10:35:14 Batch: 1399/7387, Loss=2.8768, lr=0.0000284 Thoughput=43.34 samples/s INFO:gluonnlp:10:35:28 Batch: 1449/7387, Loss=2.8342, lr=0.0000295 Thoughput=41.64 samples/s INFO:gluonnlp:10:35:42 Batch: 1499/7387, Loss=2.9186, lr=0.0000299 Thoughput=44.33 samples/s INFO:gluonnlp:10:35:56 Batch: 1549/7387, Loss=2.8817, lr=0.0000298 Thoughput=41.69 samples/s INFO:gluonnlp:10:36:10 Batch: 1599/7387, Loss=3.0527, lr=0.0000297 Thoughput=44.38 samples/s INFO:gluonnlp:10:36:23 Batch: 1649/7387, 
Loss=3.1255, lr=0.0000296 Thoughput=44.45 samples/s INFO:gluonnlp:10:36:37 Batch: 1699/7387, Loss=3.1396, lr=0.0000295 Thoughput=43.95 samples/s INFO:gluonnlp:10:36:51 Batch: 1749/7387, Loss=3.3614, lr=0.0000294 Thoughput=42.61 samples/s INFO:gluonnlp:10:37:05 Batch: 1799/7387, Loss=3.3383, lr=0.0000293 Thoughput=42.46 samples/s INFO:gluonnlp:10:37:19 Batch: 1849/7387, Loss=3.4652, lr=0.0000292 Thoughput=44.64 samples/s INFO:gluonnlp:10:37:32 Batch: 1899/7387, Loss=3.3199, lr=0.0000290 Thoughput=44.89 samples/s INFO:gluonnlp:10:37:46 Batch: 1949/7387, Loss=3.5690, lr=0.0000289 Thoughput=41.94 samples/s INFO:gluonnlp:10:38:00 Batch: 1999/7387, Loss=3.5586, lr=0.0000288 Thoughput=44.53 samples/s INFO:gluonnlp:10:38:14 Batch: 2049/7387, Loss=3.7966, lr=0.0000287 Thoughput=42.84 samples/s INFO:gluonnlp:10:38:27 Batch: 2099/7387, Loss=3.5441, lr=0.0000286 Thoughput=45.42 samples/s INFO:gluonnlp:10:38:41 Batch: 2149/7387, Loss=3.7021, lr=0.0000285 Thoughput=43.08 samples/s INFO:gluonnlp:10:38:54 Batch: 2199/7387, Loss=3.7285, lr=0.0000284 Thoughput=44.25 samples/s INFO:gluonnlp:10:39:08 Batch: 2249/7387, Loss=3.8293, lr=0.0000283 Thoughput=43.60 samples/s INFO:gluonnlp:10:39:22 Batch: 2299/7387, Loss=3.8185, lr=0.0000281 Thoughput=44.31 samples/s INFO:gluonnlp:10:39:35 Batch: 2349/7387, Loss=3.8284, lr=0.0000280 Thoughput=44.55 samples/s INFO:gluonnlp:10:39:49 Batch: 2399/7387, Loss=4.0185, lr=0.0000279 Thoughput=42.95 samples/s INFO:gluonnlp:10:40:04 Batch: 2449/7387, Loss=3.9258, lr=0.0000278 Thoughput=41.74 samples/s INFO:gluonnlp:10:40:17 Batch: 2499/7387, Loss=3.9495, lr=0.0000277 Thoughput=45.10 samples/s INFO:gluonnlp:10:40:31 Batch: 2549/7387, Loss=3.9302, lr=0.0000276 Thoughput=43.96 samples/s INFO:gluonnlp:10:40:44 Batch: 2599/7387, Loss=3.9502, lr=0.0000275 Thoughput=45.95 samples/s INFO:gluonnlp:10:40:57 Batch: 2649/7387, Loss=3.9685, lr=0.0000274 Thoughput=44.44 samples/s INFO:gluonnlp:10:41:10 Batch: 2699/7387, Loss=4.0349, lr=0.0000272 Thoughput=45.72 
samples/s INFO:gluonnlp:10:41:24 Batch: 2749/7387, Loss=4.0281, lr=0.0000271 Thoughput=44.21 samples/s INFO:gluonnlp:10:41:37 Batch: 2799/7387, Loss=3.9801, lr=0.0000270 Thoughput=44.92 samples/s INFO:gluonnlp:10:41:50 Batch: 2849/7387, Loss=3.9873, lr=0.0000269 Thoughput=45.49 samples/s INFO:gluonnlp:10:42:04 Batch: 2899/7387, Loss=4.1328, lr=0.0000268 Thoughput=43.60 samples/s INFO:gluonnlp:10:42:17 Batch: 2949/7387, Loss=4.0381, lr=0.0000267 Thoughput=46.05 samples/s INFO:gluonnlp:10:42:30 Batch: 2999/7387, Loss=3.9962, lr=0.0000266 Thoughput=45.34 samples/s INFO:gluonnlp:10:42:44 Batch: 3049/7387, Loss=4.0731, lr=0.0000265 Thoughput=44.03 samples/s INFO:gluonnlp:10:42:57 Batch: 3099/7387, Loss=4.1017, lr=0.0000263 Thoughput=44.74 samples/s INFO:gluonnlp:10:43:11 Batch: 3149/7387, Loss=3.9907, lr=0.0000262 Thoughput=44.34 samples/s INFO:gluonnlp:10:43:24 Batch: 3199/7387, Loss=4.0384, lr=0.0000261 Thoughput=45.02 samples/s INFO:gluonnlp:10:43:38 Batch: 3249/7387, Loss=4.0809, lr=0.0000260 Thoughput=42.78 samples/s INFO:gluonnlp:10:43:52 Batch: 3299/7387, Loss=3.9960, lr=0.0000259 Thoughput=44.90 samples/s INFO:gluonnlp:10:44:05 Batch: 3349/7387, Loss=4.0318, lr=0.0000258 Thoughput=44.17 samples/s INFO:gluonnlp:10:44:19 Batch: 3399/7387, Loss=4.0101, lr=0.0000257 Thoughput=44.25 samples/s INFO:gluonnlp:10:44:33 Batch: 3449/7387, Loss=4.0862, lr=0.0000255 Thoughput=43.34 samples/s INFO:gluonnlp:10:44:46 Batch: 3499/7387, Loss=4.0828, lr=0.0000254 Thoughput=43.57 samples/s INFO:gluonnlp:10:45:00 Batch: 3549/7387, Loss=4.1510, lr=0.0000253 Thoughput=45.25 samples/s INFO:gluonnlp:10:45:12 Batch: 3599/7387, Loss=4.0614, lr=0.0000252 Thoughput=46.77 samples/s INFO:gluonnlp:10:45:27 Batch: 3649/7387, Loss=4.0762, lr=0.0000251 Thoughput=41.75 samples/s INFO:gluonnlp:10:45:40 Batch: 3699/7387, Loss=4.0421, lr=0.0000250 Thoughput=45.71 samples/s INFO:gluonnlp:10:45:53 Batch: 3749/7387, Loss=4.1429, lr=0.0000249 Thoughput=44.66 samples/s INFO:gluonnlp:10:46:07 Batch: 
3799/7387, Loss=4.0960, lr=0.0000248 Thoughput=45.11 samples/s INFO:gluonnlp:10:46:21 Batch: 3849/7387, Loss=4.1295, lr=0.0000246 Thoughput=40.82 samples/s INFO:gluonnlp:10:46:35 Batch: 3899/7387, Loss=4.0449, lr=0.0000245 Thoughput=45.64 samples/s INFO:gluonnlp:10:46:48 Batch: 3949/7387, Loss=4.0744, lr=0.0000244 Thoughput=43.96 samples/s INFO:gluonnlp:10:47:01 Batch: 3999/7387, Loss=4.0502, lr=0.0000243 Thoughput=45.63 samples/s INFO:gluonnlp:10:47:15 Batch: 4049/7387, Loss=4.1235, lr=0.0000242 Thoughput=45.42 samples/s INFO:gluonnlp:10:47:28 Batch: 4099/7387, Loss=4.1194, lr=0.0000241 Thoughput=44.43 samples/s INFO:gluonnlp:10:47:42 Batch: 4149/7387, Loss=4.1620, lr=0.0000240 Thoughput=43.85 samples/s INFO:gluonnlp:10:47:56 Batch: 4199/7387, Loss=4.1474, lr=0.0000239 Thoughput=42.49 samples/s INFO:gluonnlp:10:48:09 Batch: 4249/7387, Loss=4.1311, lr=0.0000237 Thoughput=44.33 samples/s INFO:gluonnlp:10:48:23 Batch: 4299/7387, Loss=4.1637, lr=0.0000236 Thoughput=44.21 samples/s INFO:gluonnlp:10:48:36 Batch: 4349/7387, Loss=4.1676, lr=0.0000235 Thoughput=44.64 samples/s INFO:gluonnlp:10:48:50 Batch: 4399/7387, Loss=4.0311, lr=0.0000234 Thoughput=45.67 samples/s INFO:gluonnlp:10:49:03 Batch: 4449/7387, Loss=4.1035, lr=0.0000233 Thoughput=45.05 samples/s INFO:gluonnlp:10:49:17 Batch: 4499/7387, Loss=4.1394, lr=0.0000232 Thoughput=43.84 samples/s INFO:gluonnlp:10:49:30 Batch: 4549/7387, Loss=4.1112, lr=0.0000231 Thoughput=45.08 samples/s INFO:gluonnlp:10:49:43 Batch: 4599/7387, Loss=4.1363, lr=0.0000230 Thoughput=44.49 samples/s INFO:gluonnlp:10:49:57 Batch: 4649/7387, Loss=4.0765, lr=0.0000228 Thoughput=44.56 samples/s INFO:gluonnlp:10:50:10 Batch: 4699/7387, Loss=4.1965, lr=0.0000227 Thoughput=45.77 samples/s INFO:gluonnlp:10:50:24 Batch: 4749/7387, Loss=4.1263, lr=0.0000226 Thoughput=42.68 samples/s INFO:gluonnlp:10:50:37 Batch: 4799/7387, Loss=4.0919, lr=0.0000225 Thoughput=46.86 samples/s INFO:gluonnlp:10:50:51 Batch: 4849/7387, Loss=4.2508, lr=0.0000224 
Thoughput=43.28 samples/s INFO:gluonnlp:10:51:05 Batch: 4899/7387, Loss=4.2040, lr=0.0000223 Thoughput=42.47 samples/s INFO:gluonnlp:10:51:18 Batch: 4949/7387, Loss=4.1226, lr=0.0000222 Thoughput=46.83 samples/s INFO:gluonnlp:10:51:31 Batch: 4999/7387, Loss=4.1356, lr=0.0000221 Thoughput=44.62 samples/s INFO:gluonnlp:10:51:45 Batch: 5049/7387, Loss=4.1283, lr=0.0000219 Thoughput=43.60 samples/s INFO:gluonnlp:10:51:58 Batch: 5099/7387, Loss=4.2041, lr=0.0000218 Thoughput=44.99 samples/s INFO:gluonnlp:10:52:12 Batch: 5149/7387, Loss=4.1447, lr=0.0000217 Thoughput=44.54 samples/s INFO:gluonnlp:10:52:25 Batch: 5199/7387, Loss=4.1981, lr=0.0000216 Thoughput=44.35 samples/s INFO:gluonnlp:10:52:39 Batch: 5249/7387, Loss=4.1360, lr=0.0000215 Thoughput=44.51 samples/s INFO:gluonnlp:10:52:52 Batch: 5299/7387, Loss=4.2430, lr=0.0000214 Thoughput=43.69 samples/s INFO:gluonnlp:10:53:06 Batch: 5349/7387, Loss=4.1508, lr=0.0000213 Thoughput=44.36 samples/s INFO:gluonnlp:10:53:19 Batch: 5399/7387, Loss=4.0919, lr=0.0000211 Thoughput=44.89 samples/s INFO:gluonnlp:10:53:33 Batch: 5449/7387, Loss=4.2194, lr=0.0000210 Thoughput=44.67 samples/s INFO:gluonnlp:10:53:46 Batch: 5499/7387, Loss=4.1684, lr=0.0000209 Thoughput=45.08 samples/s INFO:gluonnlp:10:53:59 Batch: 5549/7387, Loss=4.1586, lr=0.0000208 Thoughput=44.64 samples/s INFO:gluonnlp:10:54:13 Batch: 5599/7387, Loss=4.1701, lr=0.0000207 Thoughput=44.25 samples/s INFO:gluonnlp:10:54:26 Batch: 5649/7387, Loss=4.2287, lr=0.0000206 Thoughput=45.33 samples/s INFO:gluonnlp:10:54:40 Batch: 5699/7387, Loss=4.1584, lr=0.0000205 Thoughput=44.12 samples/s INFO:gluonnlp:10:54:53 Batch: 5749/7387, Loss=4.1876, lr=0.0000204 Thoughput=45.88 samples/s INFO:gluonnlp:10:55:06 Batch: 5799/7387, Loss=4.2106, lr=0.0000202 Thoughput=45.62 samples/s INFO:gluonnlp:10:55:21 Batch: 5849/7387, Loss=4.1353, lr=0.0000201 Thoughput=39.74 samples/s INFO:gluonnlp:10:55:34 Batch: 5899/7387, Loss=4.1966, lr=0.0000200 Thoughput=45.22 samples/s 
INFO:gluonnlp:10:55:48 Batch: 5949/7387, Loss=4.1623, lr=0.0000199 Thoughput=44.90 samples/s INFO:gluonnlp:10:56:01 Batch: 5999/7387, Loss=4.1149, lr=0.0000198 Thoughput=46.68 samples/s INFO:gluonnlp:10:56:14 Batch: 6049/7387, Loss=4.1312, lr=0.0000197 Thoughput=44.63 samples/s INFO:gluonnlp:10:56:27 Batch: 6099/7387, Loss=4.1174, lr=0.0000196 Thoughput=46.13 samples/s INFO:gluonnlp:10:56:40 Batch: 6149/7387, Loss=4.1856, lr=0.0000195 Thoughput=45.87 samples/s INFO:gluonnlp:10:56:53 Batch: 6199/7387, Loss=4.0843, lr=0.0000193 Thoughput=47.09 samples/s INFO:gluonnlp:10:57:07 Batch: 6249/7387, Loss=4.1435, lr=0.0000192 Thoughput=43.75 samples/s INFO:gluonnlp:10:57:20 Batch: 6299/7387, Loss=4.1400, lr=0.0000191 Thoughput=44.17 samples/s INFO:gluonnlp:10:57:34 Batch: 6349/7387, Loss=4.1790, lr=0.0000190 Thoughput=42.81 samples/s INFO:gluonnlp:10:57:48 Batch: 6399/7387, Loss=4.1604, lr=0.0000189 Thoughput=44.76 samples/s INFO:gluonnlp:10:58:01 Batch: 6449/7387, Loss=4.1715, lr=0.0000188 Thoughput=44.97 samples/s INFO:gluonnlp:10:58:14 Batch: 6499/7387, Loss=4.0780, lr=0.0000187 Thoughput=44.98 samples/s INFO:gluonnlp:10:58:28 Batch: 6549/7387, Loss=4.2045, lr=0.0000186 Thoughput=45.08 samples/s INFO:gluonnlp:10:58:42 Batch: 6599/7387, Loss=4.0806, lr=0.0000184 Thoughput=41.89 samples/s INFO:gluonnlp:10:58:56 Batch: 6649/7387, Loss=4.1759, lr=0.0000183 Thoughput=41.65 samples/s INFO:gluonnlp:10:59:10 Batch: 6699/7387, Loss=4.1486, lr=0.0000182 Thoughput=43.76 samples/s INFO:gluonnlp:10:59:23 Batch: 6749/7387, Loss=4.1671, lr=0.0000181 Thoughput=44.89 samples/s INFO:gluonnlp:10:59:36 Batch: 6799/7387, Loss=4.1012, lr=0.0000180 Thoughput=46.14 samples/s INFO:gluonnlp:10:59:50 Batch: 6849/7387, Loss=4.1678, lr=0.0000179 Thoughput=45.07 samples/s INFO:gluonnlp:11:00:03 Batch: 6899/7387, Loss=4.1457, lr=0.0000178 Thoughput=45.28 samples/s INFO:gluonnlp:11:00:17 Batch: 6949/7387, Loss=4.2047, lr=0.0000177 Thoughput=42.11 samples/s INFO:gluonnlp:11:00:30 Batch: 6999/7387, 
Loss=4.1246, lr=0.0000175 Thoughput=46.11 samples/s INFO:gluonnlp:11:00:44 Batch: 7049/7387, Loss=4.1637, lr=0.0000174 Thoughput=45.06 samples/s INFO:gluonnlp:11:00:57 Batch: 7099/7387, Loss=4.2105, lr=0.0000173 Thoughput=43.85 samples/s INFO:gluonnlp:11:01:10 Batch: 7149/7387, Loss=4.0982, lr=0.0000172 Thoughput=46.40 samples/s INFO:gluonnlp:11:01:23 Batch: 7199/7387, Loss=4.2186, lr=0.0000171 Thoughput=47.99 samples/s INFO:gluonnlp:11:01:36 Batch: 7249/7387, Loss=4.1285, lr=0.0000170 Thoughput=45.24 samples/s INFO:gluonnlp:11:01:50 Batch: 7299/7387, Loss=4.1594, lr=0.0000169 Thoughput=44.19 samples/s INFO:gluonnlp:11:02:03 Batch: 7349/7387, Loss=4.0565, lr=0.0000167 Thoughput=45.46 samples/s INFO:gluonnlp:11:02:12 Finish training step: 7387 INFO:gluonnlp:11:02:12 Time cost=1987.69 s, Thoughput=44.60 samples/s INFO:gluonnlp:11:02:16 Batch: 12/7387, Loss=4.0776, lr=0.0000166 Thoughput=45.75 samples/s INFO:gluonnlp:11:02:30 Batch: 62/7387, Loss=4.0591, lr=0.0000165 Thoughput=43.75 samples/s INFO:gluonnlp:11:02:43 Batch: 112/7387, Loss=4.0958, lr=0.0000164 Thoughput=43.51 samples/s INFO:gluonnlp:11:02:56 Batch: 162/7387, Loss=4.0935, lr=0.0000163 Thoughput=45.73 samples/s INFO:gluonnlp:11:03:10 Batch: 212/7387, Loss=4.1058, lr=0.0000162 Thoughput=43.64 samples/s INFO:gluonnlp:11:03:24 Batch: 262/7387, Loss=4.1321, lr=0.0000161 Thoughput=44.69 samples/s INFO:gluonnlp:11:03:37 Batch: 312/7387, Loss=4.0636, lr=0.0000160 Thoughput=45.16 samples/s INFO:gluonnlp:11:03:51 Batch: 362/7387, Loss=4.1577, lr=0.0000158 Thoughput=41.34 samples/s INFO:gluonnlp:11:04:05 Batch: 412/7387, Loss=4.1470, lr=0.0000157 Thoughput=43.45 samples/s INFO:gluonnlp:11:04:18 Batch: 462/7387, Loss=4.1421, lr=0.0000156 Thoughput=46.59 samples/s INFO:gluonnlp:11:04:31 Batch: 512/7387, Loss=4.0959, lr=0.0000155 Thoughput=46.75 samples/s INFO:gluonnlp:11:04:44 Batch: 562/7387, Loss=4.1666, lr=0.0000154 Thoughput=46.41 samples/s INFO:gluonnlp:11:04:58 Batch: 612/7387, Loss=4.1476, lr=0.0000153 
Thoughput=43.52 samples/s INFO:gluonnlp:11:05:11 Batch: 662/7387, Loss=4.1270, lr=0.0000152 Thoughput=45.24 samples/s INFO:gluonnlp:11:05:25 Batch: 712/7387, Loss=4.1495, lr=0.0000151 Thoughput=42.43 samples/s INFO:gluonnlp:11:05:39 Batch: 762/7387, Loss=4.1315, lr=0.0000149 Thoughput=42.86 samples/s INFO:gluonnlp:11:05:52 Batch: 812/7387, Loss=4.2071, lr=0.0000148 Thoughput=45.86 samples/s INFO:gluonnlp:11:06:05 Batch: 862/7387, Loss=4.0890, lr=0.0000147 Thoughput=46.23 samples/s INFO:gluonnlp:11:06:19 Batch: 912/7387, Loss=4.1006, lr=0.0000146 Thoughput=44.46 samples/s INFO:gluonnlp:11:06:32 Batch: 962/7387, Loss=4.0869, lr=0.0000145 Thoughput=45.78 samples/s INFO:gluonnlp:11:06:45 Batch: 1012/7387, Loss=4.0929, lr=0.0000144 Thoughput=46.82 samples/s INFO:gluonnlp:11:06:58 Batch: 1062/7387, Loss=4.0652, lr=0.0000143 Thoughput=45.73 samples/s INFO:gluonnlp:11:07:11 Batch: 1112/7387, Loss=4.1306, lr=0.0000142 Thoughput=43.72 samples/s INFO:gluonnlp:11:07:25 Batch: 1162/7387, Loss=4.1885, lr=0.0000140 Thoughput=44.05 samples/s INFO:gluonnlp:11:07:39 Batch: 1212/7387, Loss=4.1907, lr=0.0000139 Thoughput=42.50 samples/s INFO:gluonnlp:11:07:53 Batch: 1262/7387, Loss=4.0502, lr=0.0000138 Thoughput=42.38 samples/s INFO:gluonnlp:11:08:07 Batch: 1312/7387, Loss=4.1771, lr=0.0000137 Thoughput=43.80 samples/s INFO:gluonnlp:11:08:21 Batch: 1362/7387, Loss=4.1444, lr=0.0000136 Thoughput=43.93 samples/s INFO:gluonnlp:11:08:34 Batch: 1412/7387, Loss=4.1231, lr=0.0000135 Thoughput=46.40 samples/s INFO:gluonnlp:11:08:47 Batch: 1462/7387, Loss=4.1259, lr=0.0000134 Thoughput=44.89 samples/s INFO:gluonnlp:11:09:01 Batch: 1512/7387, Loss=4.0674, lr=0.0000133 Thoughput=43.95 samples/s INFO:gluonnlp:11:09:14 Batch: 1562/7387, Loss=4.1281, lr=0.0000131 Thoughput=44.46 samples/s INFO:gluonnlp:11:09:28 Batch: 1612/7387, Loss=4.0987, lr=0.0000130 Thoughput=44.48 samples/s INFO:gluonnlp:11:09:42 Batch: 1662/7387, Loss=4.2103, lr=0.0000129 Thoughput=41.33 samples/s INFO:gluonnlp:11:09:55 
Batch: 1712/7387, Loss=4.0690, lr=0.0000128 Thoughput=47.13 samples/s INFO:gluonnlp:11:10:09 Batch: 1762/7387, Loss=4.1266, lr=0.0000127 Thoughput=43.28 samples/s INFO:gluonnlp:11:10:22 Batch: 1812/7387, Loss=4.1022, lr=0.0000126 Thoughput=46.63 samples/s INFO:gluonnlp:11:10:35 Batch: 1862/7387, Loss=4.0887, lr=0.0000125 Thoughput=43.36 samples/s INFO:gluonnlp:11:10:49 Batch: 1912/7387, Loss=4.1930, lr=0.0000123 Thoughput=43.75 samples/s INFO:gluonnlp:11:11:02 Batch: 1962/7387, Loss=4.0607, lr=0.0000122 Thoughput=45.23 samples/s INFO:gluonnlp:11:11:16 Batch: 2012/7387, Loss=4.1371, lr=0.0000121 Thoughput=43.70 samples/s INFO:gluonnlp:11:11:30 Batch: 2062/7387, Loss=4.1286, lr=0.0000120 Thoughput=43.74 samples/s INFO:gluonnlp:11:11:43 Batch: 2112/7387, Loss=4.0778, lr=0.0000119 Thoughput=44.38 samples/s INFO:gluonnlp:11:11:57 Batch: 2162/7387, Loss=4.0880, lr=0.0000118 Thoughput=44.37 samples/s INFO:gluonnlp:11:12:10 Batch: 2212/7387, Loss=4.1034, lr=0.0000117 Thoughput=44.49 samples/s INFO:gluonnlp:11:12:24 Batch: 2262/7387, Loss=4.1207, lr=0.0000116 Thoughput=44.37 samples/s INFO:gluonnlp:11:12:37 Batch: 2312/7387, Loss=4.0498, lr=0.0000114 Thoughput=45.33 samples/s INFO:gluonnlp:11:12:50 Batch: 2362/7387, Loss=4.1283, lr=0.0000113 Thoughput=45.50 samples/s INFO:gluonnlp:11:13:04 Batch: 2412/7387, Loss=4.0925, lr=0.0000112 Thoughput=43.62 samples/s INFO:gluonnlp:11:13:18 Batch: 2462/7387, Loss=4.1359, lr=0.0000111 Thoughput=43.71 samples/s INFO:gluonnlp:11:13:31 Batch: 2512/7387, Loss=4.1376, lr=0.0000110 Thoughput=43.78 samples/s INFO:gluonnlp:11:13:45 Batch: 2562/7387, Loss=4.1093, lr=0.0000109 Thoughput=44.49 samples/s INFO:gluonnlp:11:13:58 Batch: 2612/7387, Loss=4.2317, lr=0.0000108 Thoughput=45.39 samples/s INFO:gluonnlp:11:14:12 Batch: 2662/7387, Loss=4.1057, lr=0.0000107 Thoughput=44.15 samples/s INFO:gluonnlp:11:14:25 Batch: 2712/7387, Loss=4.1259, lr=0.0000105 Thoughput=44.01 samples/s INFO:gluonnlp:11:14:40 Batch: 2762/7387, Loss=4.1273, lr=0.0000104 
Thoughput=42.33 samples/s INFO:gluonnlp:11:14:53 Batch: 2812/7387, Loss=4.1457, lr=0.0000103 Thoughput=43.42 samples/s INFO:gluonnlp:11:15:07 Batch: 2862/7387, Loss=4.0793, lr=0.0000102 Thoughput=45.70 samples/s INFO:gluonnlp:11:15:19 Batch: 2912/7387, Loss=4.0774, lr=0.0000101 Thoughput=46.38 samples/s INFO:gluonnlp:11:15:34 Batch: 2962/7387, Loss=4.1106, lr=0.0000100 Thoughput=42.38 samples/s INFO:gluonnlp:11:15:47 Batch: 3012/7387, Loss=4.1557, lr=0.0000099 Thoughput=45.78 samples/s INFO:gluonnlp:11:16:00 Batch: 3062/7387, Loss=4.1707, lr=0.0000098 Thoughput=43.96 samples/s INFO:gluonnlp:11:16:14 Batch: 3112/7387, Loss=4.1613, lr=0.0000096 Thoughput=44.49 samples/s INFO:gluonnlp:11:16:27 Batch: 3162/7387, Loss=4.1119, lr=0.0000095 Thoughput=46.41 samples/s INFO:gluonnlp:11:16:40 Batch: 3212/7387, Loss=4.1192, lr=0.0000094 Thoughput=44.79 samples/s INFO:gluonnlp:11:16:54 Batch: 3262/7387, Loss=4.0926, lr=0.0000093 Thoughput=43.16 samples/s INFO:gluonnlp:11:17:07 Batch: 3312/7387, Loss=4.0888, lr=0.0000092 Thoughput=45.24 samples/s INFO:gluonnlp:11:17:21 Batch: 3362/7387, Loss=4.0483, lr=0.0000091 Thoughput=43.18 samples/s INFO:gluonnlp:11:17:35 Batch: 3412/7387, Loss=4.1036, lr=0.0000090 Thoughput=44.04 samples/s INFO:gluonnlp:11:17:48 Batch: 3462/7387, Loss=4.1874, lr=0.0000089 Thoughput=44.49 samples/s INFO:gluonnlp:11:18:01 Batch: 3512/7387, Loss=4.0687, lr=0.0000087 Thoughput=46.69 samples/s INFO:gluonnlp:11:18:15 Batch: 3562/7387, Loss=4.0894, lr=0.0000086 Thoughput=44.96 samples/s INFO:gluonnlp:11:18:28 Batch: 3612/7387, Loss=4.1437, lr=0.0000085 Thoughput=43.31 samples/s INFO:gluonnlp:11:18:43 Batch: 3662/7387, Loss=4.0851, lr=0.0000084 Thoughput=41.41 samples/s INFO:gluonnlp:11:18:55 Batch: 3712/7387, Loss=4.1065, lr=0.0000083 Thoughput=48.17 samples/s INFO:gluonnlp:11:19:09 Batch: 3762/7387, Loss=4.1050, lr=0.0000082 Thoughput=45.42 samples/s INFO:gluonnlp:11:19:21 Batch: 3812/7387, Loss=4.0368, lr=0.0000081 Thoughput=46.93 samples/s 
INFO:gluonnlp:11:19:34 Batch: 3862/7387, Loss=4.0824, lr=0.0000079 Thoughput=46.24 samples/s INFO:gluonnlp:11:19:48 Batch: 3912/7387, Loss=4.1474, lr=0.0000078 Thoughput=42.87 samples/s INFO:gluonnlp:11:20:02 Batch: 3962/7387, Loss=4.0872, lr=0.0000077 Thoughput=44.41 samples/s INFO:gluonnlp:11:20:15 Batch: 4012/7387, Loss=4.1012, lr=0.0000076 Thoughput=43.96 samples/s INFO:gluonnlp:11:20:28 Batch: 4062/7387, Loss=4.1431, lr=0.0000075 Thoughput=46.53 samples/s INFO:gluonnlp:11:20:42 Batch: 4112/7387, Loss=4.0852, lr=0.0000074 Thoughput=45.31 samples/s INFO:gluonnlp:11:20:55 Batch: 4162/7387, Loss=4.1129, lr=0.0000073 Thoughput=43.51 samples/s INFO:gluonnlp:11:21:08 Batch: 4212/7387, Loss=4.0370, lr=0.0000072 Thoughput=46.17 samples/s INFO:gluonnlp:11:21:21 Batch: 4262/7387, Loss=4.1498, lr=0.0000070 Thoughput=46.75 samples/s INFO:gluonnlp:11:21:35 Batch: 4312/7387, Loss=4.1244, lr=0.0000069 Thoughput=43.76 samples/s INFO:gluonnlp:11:21:48 Batch: 4362/7387, Loss=4.0967, lr=0.0000068 Thoughput=45.27 samples/s INFO:gluonnlp:11:22:01 Batch: 4412/7387, Loss=4.0400, lr=0.0000067 Thoughput=46.72 samples/s INFO:gluonnlp:11:22:14 Batch: 4462/7387, Loss=4.1163, lr=0.0000066 Thoughput=44.57 samples/s INFO:gluonnlp:11:22:28 Batch: 4512/7387, Loss=4.1048, lr=0.0000065 Thoughput=45.02 samples/s INFO:gluonnlp:11:22:41 Batch: 4562/7387, Loss=4.0797, lr=0.0000064 Thoughput=45.47 samples/s INFO:gluonnlp:11:22:55 Batch: 4612/7387, Loss=4.0614, lr=0.0000063 Thoughput=43.93 samples/s INFO:gluonnlp:11:23:08 Batch: 4662/7387, Loss=4.1530, lr=0.0000061 Thoughput=46.30 samples/s INFO:gluonnlp:11:23:21 Batch: 4712/7387, Loss=4.0868, lr=0.0000060 Thoughput=45.81 samples/s INFO:gluonnlp:11:23:34 Batch: 4762/7387, Loss=4.0923, lr=0.0000059 Thoughput=44.20 samples/s INFO:gluonnlp:11:23:47 Batch: 4812/7387, Loss=4.0161, lr=0.0000058 Thoughput=46.52 samples/s INFO:gluonnlp:11:24:00 Batch: 4862/7387, Loss=4.0832, lr=0.0000057 Thoughput=45.71 samples/s INFO:gluonnlp:11:24:13 Batch: 4912/7387, 
Loss=4.1316, lr=0.0000056 Thoughput=47.06 samples/s INFO:gluonnlp:11:24:27 Batch: 4962/7387, Loss=4.1091, lr=0.0000055 Thoughput=44.09 samples/s INFO:gluonnlp:11:24:40 Batch: 5012/7387, Loss=4.0840, lr=0.0000054 Thoughput=45.39 samples/s INFO:gluonnlp:11:24:53 Batch: 5062/7387, Loss=4.1174, lr=0.0000052 Thoughput=44.24 samples/s INFO:gluonnlp:11:25:07 Batch: 5112/7387, Loss=4.0487, lr=0.0000051 Thoughput=44.90 samples/s INFO:gluonnlp:11:25:20 Batch: 5162/7387, Loss=4.1211, lr=0.0000050 Thoughput=44.33 samples/s INFO:gluonnlp:11:25:34 Batch: 5212/7387, Loss=4.1463, lr=0.0000049 Thoughput=45.15 samples/s INFO:gluonnlp:11:25:47 Batch: 5262/7387, Loss=4.0731, lr=0.0000048 Thoughput=43.85 samples/s INFO:gluonnlp:11:26:00 Batch: 5312/7387, Loss=4.1587, lr=0.0000047 Thoughput=45.70 samples/s INFO:gluonnlp:11:26:14 Batch: 5362/7387, Loss=4.0584, lr=0.0000046 Thoughput=45.49 samples/s INFO:gluonnlp:11:26:26 Batch: 5412/7387, Loss=4.0978, lr=0.0000045 Thoughput=47.34 samples/s INFO:gluonnlp:11:26:40 Batch: 5462/7387, Loss=4.1306, lr=0.0000043 Thoughput=44.27 samples/s INFO:gluonnlp:11:26:53 Batch: 5512/7387, Loss=4.0910, lr=0.0000042 Thoughput=44.97 samples/s INFO:gluonnlp:11:27:06 Batch: 5562/7387, Loss=4.0427, lr=0.0000041 Thoughput=46.23 samples/s INFO:gluonnlp:11:27:19 Batch: 5612/7387, Loss=4.0804, lr=0.0000040 Thoughput=45.94 samples/s INFO:gluonnlp:11:27:33 Batch: 5662/7387, Loss=4.0750, lr=0.0000039 Thoughput=44.74 samples/s INFO:gluonnlp:11:27:46 Batch: 5712/7387, Loss=4.1347, lr=0.0000038 Thoughput=45.34 samples/s INFO:gluonnlp:11:27:59 Batch: 5762/7387, Loss=4.1372, lr=0.0000037 Thoughput=45.17 samples/s INFO:gluonnlp:11:28:13 Batch: 5812/7387, Loss=4.0193, lr=0.0000035 Thoughput=43.86 samples/s INFO:gluonnlp:11:28:27 Batch: 5862/7387, Loss=4.2086, lr=0.0000034 Thoughput=43.70 samples/s INFO:gluonnlp:11:28:40 Batch: 5912/7387, Loss=4.1591, lr=0.0000033 Thoughput=45.01 samples/s INFO:gluonnlp:11:28:54 Batch: 5962/7387, Loss=4.1108, lr=0.0000032 Thoughput=44.07 
samples/s INFO:gluonnlp:11:29:07 Batch: 6012/7387, Loss=4.1279, lr=0.0000031 Thoughput=46.01 samples/s INFO:gluonnlp:11:29:20 Batch: 6062/7387, Loss=4.1495, lr=0.0000030 Thoughput=44.55 samples/s INFO:gluonnlp:11:29:33 Batch: 6112/7387, Loss=4.1213, lr=0.0000029 Thoughput=44.86 samples/s INFO:gluonnlp:11:29:46 Batch: 6162/7387, Loss=4.0740, lr=0.0000028 Thoughput=46.45 samples/s INFO:gluonnlp:11:30:00 Batch: 6212/7387, Loss=4.1713, lr=0.0000026 Thoughput=43.52 samples/s INFO:gluonnlp:11:30:14 Batch: 6262/7387, Loss=4.1564, lr=0.0000025 Thoughput=43.98 samples/s INFO:gluonnlp:11:30:27 Batch: 6312/7387, Loss=4.0892, lr=0.0000024 Thoughput=45.32 samples/s INFO:gluonnlp:11:30:40 Batch: 6362/7387, Loss=4.0660, lr=0.0000023 Thoughput=46.31 samples/s INFO:gluonnlp:11:30:53 Batch: 6412/7387, Loss=4.0736, lr=0.0000022 Thoughput=46.32 samples/s INFO:gluonnlp:11:31:06 Batch: 6462/7387, Loss=4.0536, lr=0.0000021 Thoughput=44.83 samples/s INFO:gluonnlp:11:31:20 Batch: 6512/7387, Loss=4.2200, lr=0.0000020 Thoughput=43.86 samples/s INFO:gluonnlp:11:31:33 Batch: 6562/7387, Loss=4.1070, lr=0.0000019 Thoughput=45.22 samples/s INFO:gluonnlp:11:31:46 Batch: 6612/7387, Loss=4.0595, lr=0.0000017 Thoughput=46.24 samples/s INFO:gluonnlp:11:32:00 Batch: 6662/7387, Loss=4.1146, lr=0.0000016 Thoughput=44.01 samples/s INFO:gluonnlp:11:32:13 Batch: 6712/7387, Loss=4.1076, lr=0.0000015 Thoughput=46.55 samples/s INFO:gluonnlp:11:32:26 Batch: 6762/7387, Loss=4.1171, lr=0.0000014 Thoughput=46.89 samples/s INFO:gluonnlp:11:32:39 Batch: 6812/7387, Loss=4.0516, lr=0.0000013 Thoughput=45.68 samples/s INFO:gluonnlp:11:32:52 Batch: 6862/7387, Loss=4.0758, lr=0.0000012 Thoughput=44.23 samples/s INFO:gluonnlp:11:33:05 Batch: 6912/7387, Loss=4.0915, lr=0.0000011 Thoughput=45.50 samples/s INFO:gluonnlp:11:33:19 Batch: 6962/7387, Loss=4.1415, lr=0.0000010 Thoughput=44.48 samples/s INFO:gluonnlp:11:33:32 Batch: 7012/7387, Loss=4.0714, lr=0.0000008 Thoughput=46.97 samples/s INFO:gluonnlp:11:33:45 Batch: 
7062/7387, Loss=4.0710, lr=0.0000007 Thoughput=45.87 samples/s INFO:gluonnlp:11:33:58 Batch: 7112/7387, Loss=4.0817, lr=0.0000006 Thoughput=45.07 samples/s INFO:gluonnlp:11:34:12 Batch: 7162/7387, Loss=4.0979, lr=0.0000005 Thoughput=44.60 samples/s INFO:gluonnlp:11:34:24 Batch: 7212/7387, Loss=4.1037, lr=0.0000004 Thoughput=46.83 samples/s INFO:gluonnlp:11:34:39 Batch: 7262/7387, Loss=4.0831, lr=0.0000003 Thoughput=41.88 samples/s INFO:gluonnlp:11:34:52 Batch: 7312/7387, Loss=4.0656, lr=0.0000002 Thoughput=44.57 samples/s INFO:gluonnlp:11:35:05 Batch: 7362/7387, Loss=4.0253, lr=0.0000001 Thoughput=45.51 samples/s INFO:gluonnlp:11:35:12 Finish training step: 14773 INFO:gluonnlp:11:35:12 Time cost=3967.29 s, Thoughput=44.68 samples/s INFO:gluonnlp:11:35:15 Loading dev data... INFO:gluonnlp:11:35:15 Number of records in dev data:10570 Done! Transform dataset costs 4.27 seconds. INFO:gluonnlp:11:35:24 The number of examples after preprocessing:10833 Done! Transform dataset costs 4.35 seconds. INFO:gluonnlp:11:35:24 start prediction INFO:gluonnlp:11:36:25 Time cost=60.60 s, Thoughput=178.75 samples/s INFO:gluonnlp:11:36:25 Get prediction results... INFO:gluonnlp:11:36:54 {'exact_match': 5.931882686849574, 'f1': 13.615304984378835}
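Because the log above is one unbroken run of text, the loss trend is easier to see after extracting the per-batch fields. The following is a minimal sketch, not part of the original report: the regular expression and the helper name `parse_progress` are illustrative assumptions, and the three sample entries are copied from the log above. It tolerates line wraps inside an entry and keeps the log's own spelling of "Thoughput".

```python
import re

# One progress entry, following the format seen in the log above.
# \s+ is used between fields because the pasted log wraps mid-entry.
ENTRY = re.compile(
    r"Batch:\s+(\d+)/(\d+),\s+Loss=([\d.]+),\s+lr=([\d.]+)\s+Thoughput=([\d.]+)"
)


def parse_progress(log_text):
    """Return (batch, loss, lr) tuples in the order they appear in the log."""
    return [
        (int(m.group(1)), float(m.group(3)), float(m.group(4)))
        for m in ENTRY.finditer(log_text)
    ]


# Three entries copied from the first epoch of the log above.
sample = (
    "INFO:gluonnlp:10:34:06 Batch: 1149/7387, Loss=2.6492, lr=0.0000234 "
    "Thoughput=44.58 samples/s "
    "INFO:gluonnlp:10:35:42 Batch: 1499/7387, Loss=2.9186, lr=0.0000299 "
    "Thoughput=44.33 samples/s "
    "INFO:gluonnlp:10:46:21 Batch: 3849/7387, Loss=4.1295, lr=0.0000246 "
    "Thoughput=40.82 samples/s"
)

entries = parse_progress(sample)
best = min(entries, key=lambda e: e[1])  # entry with the lowest loss
print(best)
```

Run over the full first epoch, this kind of extraction shows the logged loss reaching its lowest value (2.6492) at batch 1149, then rising back to roughly 4.1 and staying there for the rest of training.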
$ pip list
Package     Version
----------- -------------------
certifi     2019.11.28
chardet     3.0.4
Cython      0.29.16
gluonnlp    0.9.1
graphviz    0.8.4
idna        2.8
mxnet-cu100 1.6.0b20200302
numpy       1.18.1
packaging   20.3
pip         19.3.1
pyparsing   2.4.7
requests    2.22.0
setuptools  44.0.0.post20200106
six         1.14.0
urllib3     1.25.7
wheel       0.33.6
Description
Error Message
(Paste the complete error message, including stack trace.)
To Reproduce
python finetune_squad.py --optimizer adam --batch_size 12 --lr 3e-5 --epochs 2 --gpu
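The `lr` values printed in the log for this command are consistent with a linear warmup followed by a linear decay to zero. The sketch below is an assumption inferred from the logged numbers, not a copy of the script's code: the function name `scheduled_lr` is hypothetical, `total_steps=14773` is taken from the final "Finish training step" line, `warmup_ratio=0.1` from the logged `Namespace`, and the step is assumed to be the logged batch index plus one.

```python
def scheduled_lr(step, base_lr, total_steps, warmup_ratio):
    """Linear warmup to base_lr, then linear decay to zero (assumed shape)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale up proportionally to the step count.
        return base_lr * step / warmup_steps
    # Decay phase: fall linearly so the rate reaches zero at total_steps.
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)


# Steps corresponding to logged batches 49, 1499 and 7349 of the first epoch.
for step in (50, 1500, 7350):
    print(step, f"{scheduled_lr(step, 3e-5, 14773, 0.1):.7f}")
```

Under these assumptions the learning rate peaks near step 1477, which is close to the point in the log where the loss stops falling and starts to climb.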
What have you tried to solve it?
None
Environment