flairNLP / flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)
https://flairnlp.github.io/flair/

The size of tensor a (759) must match the size of tensor b (512) at non-singleton dimension 1 #2081

Closed abeermohamed1 closed 3 years ago

abeermohamed1 commented 3 years ago

Describe the bug

After training finishes, the run fails with this error: The size of tensor a (759) must match the size of tensor b (512) at non-singleton dimension 1

To Reproduce

from flair.data import Corpus
import gensim

from flair.embeddings import FastTextEmbeddings, TokenEmbeddings, CharacterEmbeddings, WordEmbeddings, StackedEmbeddings, PooledFlairEmbeddings, FlairEmbeddings, DocumentPoolEmbeddings, XLNetEmbeddings, TransformerWordEmbeddings
from typing import List
from flair.data import Dictionary
from flair.models import LanguageModel
from flair.trainers.language_model_trainer import LanguageModelTrainer, TextCorpus
from flair.data import Corpus
from flair.datasets import ColumnCorpus
from flair.embeddings import BertEmbeddings
import re
import numpy as np
from nltk import ngrams

# define columns
columns = {0: 'text', 1: 'ner'}

# this is the folder in which train, test and dev files reside
data_folder = '/content/gdrive/My Drive/resources/tasks/conll_03'

# Mohamed Hosny drive
data_folder = '/content/gdrive/My Drive/conll_03'

# init a corpus using column format, data folder and the names of the train, dev and test files
corpus: Corpus = ColumnCorpus(data_folder, columns, train_file='train.txt', test_file='ANERCorp1.txt', dev_file='dev.txt', document_separator_token='.')

#document_separator_token

# 2. what tag do we want to predict?
tag_type = 'ner'

# 3. make the tag dictionary from the corpus
tag_dictionary = corpus.make_tag_dictionary(tag_type=tag_type)

embedding_types: List[TokenEmbeddings] = [
    BertEmbeddings(bert_model_or_path=data_folder),
    WordEmbeddings('ar'),
    # contextual string embeddings, forward
    PooledFlairEmbeddings('ar-forward', pooling='mean'),
    # contextual string embeddings, backward
    PooledFlairEmbeddings('ar-backward', pooling='mean'),
]

embeddings: StackedEmbeddings = StackedEmbeddings(embeddings=embedding_types)

# initialize sequence tagger
from flair.models import SequenceTagger

tagger: SequenceTagger = SequenceTagger(hidden_size=256, embeddings=embeddings, tag_dictionary=tag_dictionary, tag_type=tag_type)

# initialize trainer
from flair.trainers import ModelTrainer
from torch.optim.adam import Adam

trainer: ModelTrainer = ModelTrainer(tagger, corpus)

import pickle

trainer.train('/content/gdrive/My Drive/resources/taggers/example-ner', train_with_dev=True, learning_rate=0.1,
mini_batch_size=16, max_epochs=150, checkpoint=True)
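
A likely reading of the numbers in the error: the BERT model in the printout has position_embeddings of size Embedding(512, 768), so an input of 759 subword tokens cannot be added to its 512 position vectors. The sketch below is one way to locate and split such sentences before embedding them. It is not Flair code: `find_overlong`, `split_words`, and `toy_tokenize` are hypothetical helpers, the two reserved special tokens are an assumption, and `toy_tokenize` is only a stand-in for the model's real subword tokenizer.

```python
from typing import Callable, Iterable, List, Tuple

MAX_POSITIONS = 512   # size of position_embeddings in the model printout
SPECIAL_TOKENS = 2    # assumption: one start and one end token per sequence
DEFAULT_LIMIT = MAX_POSITIONS - SPECIAL_TOKENS

Tokenizer = Callable[[str], List[str]]


def find_overlong(
    sentences: Iterable[List[str]],
    subword_tokenize: Tokenizer,
    limit: int = DEFAULT_LIMIT,
) -> List[Tuple[int, int]]:
    """Return (sentence index, subword count) for sentences above the limit."""
    overlong = []
    for index, words in enumerate(sentences):
        count = sum(len(subword_tokenize(word)) for word in words)
        if count > limit:
            overlong.append((index, count))
    return overlong


def split_words(
    words: List[str],
    subword_tokenize: Tokenizer,
    limit: int = DEFAULT_LIMIT,
) -> List[List[str]]:
    """Greedily split a word list into chunks whose subword count fits the limit.

    A single word that alone exceeds the limit still gets its own chunk.
    """
    chunks: List[List[str]] = []
    current: List[str] = []
    used = 0
    for word in words:
        size = len(subword_tokenize(word))
        if current and used + size > limit:
            chunks.append(current)
            current, used = [], 0
        current.append(word)
        used += size
    if current:
        chunks.append(current)
    return chunks


def toy_tokenize(word: str) -> List[str]:
    """Hypothetical stand-in tokenizer: pieces of up to 3 characters."""
    return [word[i:i + 3] for i in range(0, len(word), 3)] or [word]


if __name__ == "__main__":
    corpus_words = [["abc"] * 10, ["abcdef"] * 300]
    print(find_overlong(corpus_words, toy_tokenize))
    print([len(chunk) for chunk in split_words(corpus_words[1], toy_tokenize)])
```

In practice the callable passed as `subword_tokenize` would be the tokenizer that belongs to the BERT model in use, and the sentences would be the token lists read from the train, dev, and test files. With `document_separator_token='.'` and a test file much larger than the training file, the test set is a plausible place to look for the 759-subword sentence, but that is a guess from the log, not something the issue confirms.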


Additional context

2021-01-25 10:20:06,873 Reading data from /content/gdrive/My Drive/conll_03
2021-01-25 10:20:06,874 Train: /content/gdrive/My Drive/conll_03/train.txt
2021-01-25 10:20:06,875 Dev: /content/gdrive/My Drive/conll_03/dev.txt
2021-01-25 10:20:06,877 Test: /content/gdrive/My Drive/conll_03/ANERCorp1.txt
/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py:51: DeprecationWarning: Call to deprecated method __init__. (Use 'TransformerWordEmbeddings' for all transformer-based word embeddings) -- Deprecated since version 0.4.5.
2021-01-25 10:20:23,186 https://flair.informatik.hu-berlin.de/resources/embeddings/token/ar-wiki-fasttext-300d-1M.vectors.npy not found in cache, downloading to /tmp/tmpczjlv6g1
100%|██████████| 733171328/733171328 [00:39<00:00, 18608720.99B/s]
2021-01-25 10:21:03,151 copying /tmp/tmpczjlv6g1 to cache at /root/.flair/embeddings/ar-wiki-fasttext-300d-1M.vectors.npy
2021-01-25 10:21:04,898 removing temp file /tmp/tmpczjlv6g1
2021-01-25 10:21:05,622 https://flair.informatik.hu-berlin.de/resources/embeddings/token/ar-wiki-fasttext-300d-1M not found in cache, downloading to /tmp/tmp080tb_mh
100%|██████████| 26704903/26704903 [00:02<00:00, 9476531.62B/s]
2021-01-25 10:21:08,904 copying /tmp/tmp080tb_mh to cache at /root/.flair/embeddings/ar-wiki-fasttext-300d-1M
2021-01-25 10:21:08,939 removing temp file /tmp/tmp080tb_mh
2021-01-25 10:21:12,601 https://flair.informatik.hu-berlin.de/resources/embeddings/flair/lm-ar-opus-large-forward-v0.1.pt not found in cache, downloading to /tmp/tmp3m378k3p
100%|██████████| 131796801/131796801 [00:08<00:00, 16369873.12B/s]
2021-01-25 10:21:21,166 copying /tmp/tmp3m378k3p to cache at /root/.flair/embeddings/lm-ar-opus-large-forward-v0.1.pt
2021-01-25 10:21:21,322 removing temp file /tmp/tmp3m378k3p
2021-01-25 10:21:41,045 https://flair.informatik.hu-berlin.de/resources/embeddings/flair/lm-ar-opus-large-backward-v0.1.pt not found in cache, downloading to /tmp/tmplcvwbl5h
100%|██████████| 131796811/131796811 [00:08<00:00, 16242622.50B/s]
2021-01-25 10:21:49,763 copying /tmp/tmplcvwbl5h to cache at /root/.flair/embeddings/lm-ar-opus-large-backward-v0.1.pt

2021-01-25 10:21:49,912 removing temp file /tmp/tmplcvwbl5h 2021-01-25 10:21:56,225 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:21:56,229 Model: "SequenceTagger( (embeddings): StackedEmbeddings( (list_embedding_0): BertEmbeddings( (model): BertModel( (embeddings): BertEmbeddings( (word_embeddings): Embedding(64000, 768, padding_idx=0) (position_embeddings): Embedding(512, 768) (token_type_embeddings): Embedding(2, 768) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) (encoder): BertEncoder( (layer): ModuleList( (0): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (1): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (intermediate): BertIntermediate( (dense): 
Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (2): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (3): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (4): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): 
Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (5): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (6): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (intermediate): BertIntermediate( (dense): 
Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (7): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (8): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (9): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): 
Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (10): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (11): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) (intermediate): BertIntermediate( (dense): 
Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0.1, inplace=False) ) ) ) ) (pooler): BertPooler( (dense): Linear(in_features=768, out_features=768, bias=True) (activation): Tanh() ) ) ) (list_embedding_1): WordEmbeddings('ar') (list_embedding_2): PooledFlairEmbeddings( (context_embeddings): FlairEmbeddings( (lm): LanguageModel( (drop): Dropout(p=0.1, inplace=False) (encoder): Embedding(7125, 100) (rnn): LSTM(100, 2048) (decoder): Linear(in_features=2048, out_features=7125, bias=True) ) ) ) (list_embedding_3): PooledFlairEmbeddings( (context_embeddings): FlairEmbeddings( (lm): LanguageModel( (drop): Dropout(p=0.1, inplace=False) (encoder): Embedding(7125, 100) (rnn): LSTM(100, 2048) (decoder): Linear(in_features=2048, out_features=7125, bias=True) ) ) ) ) (word_dropout): WordDropout(p=0.05) (locked_dropout): LockedDropout(p=0.5) (embedding2nn): Linear(in_features=11564, out_features=11564, bias=True) (rnn): LSTM(11564, 256, batch_first=True, bidirectional=True) (linear): Linear(in_features=512, out_features=31, bias=True) (beta): 1.0 (weights): None (weight_tensor) None )" 2021-01-25 10:21:56,231 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:21:56,232 Corpus: "Corpus: 1148 train + 613 dev + 4686 test sentences" 2021-01-25 10:21:56,233 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:21:56,239 Parameters: 2021-01-25 10:21:56,242 - learning_rate: "0.1" 2021-01-25 10:21:56,243 - mini_batch_size: "16" 2021-01-25 10:21:56,246 - patience: "3" 2021-01-25 10:21:56,248 - anneal_factor: "0.5" 2021-01-25 10:21:56,252 - max_epochs: "150" 2021-01-25 10:21:56,253 - shuffle: "True" 2021-01-25 10:21:56,254 - train_with_dev: "True" 2021-01-25 10:21:56,256 - 
batch_growth_annealing: "False" 2021-01-25 10:21:56,258 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:21:56,260 Model training base path: "/content/gdrive/My Drive/resources/taggers/example-ner" 2021-01-25 10:21:56,262 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:21:56,263 Device: cuda:0 2021-01-25 10:21:56,264 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:21:56,266 Embeddings storage mode: cpu 2021-01-25 10:21:56,569 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 10:22:14,292 epoch 1 - iter 11/111 - loss 54.08059033 - samples/sec: 9.93 - lr: 0.100000 2021-01-25 10:22:30,838 epoch 1 - iter 22/111 - loss 41.13435294 - samples/sec: 10.64 - lr: 0.100000 2021-01-25 10:22:48,765 epoch 1 - iter 33/111 - loss 34.84662836 - samples/sec: 9.82 - lr: 0.100000 2021-01-25 10:23:04,655 epoch 1 - iter 44/111 - loss 30.89554373 - samples/sec: 11.08 - lr: 0.100000 2021-01-25 10:23:22,280 epoch 1 - iter 55/111 - loss 28.54707994 - samples/sec: 9.99 - lr: 0.100000 2021-01-25 10:23:37,139 epoch 1 - iter 66/111 - loss 26.31969578 - samples/sec: 11.85 - lr: 0.100000 2021-01-25 10:23:52,624 epoch 1 - iter 77/111 - loss 24.63814312 - samples/sec: 11.37 - lr: 0.100000 2021-01-25 10:24:08,028 epoch 1 - iter 88/111 - loss 23.34882317 - samples/sec: 11.43 - lr: 0.100000 2021-01-25 10:24:24,155 epoch 1 - iter 99/111 - loss 22.30208344 - samples/sec: 10.91 - lr: 0.100000 2021-01-25 10:24:40,544 epoch 1 - iter 110/111 - loss 21.48262499 - samples/sec: 10.74 - lr: 0.100000 2021-01-25 10:24:40,849 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:24:40,850 EPOCH 1 done: loss 21.3975 - lr 0.1000000 
2021-01-25 10:24:40,851 BAD EPOCHS (no improvement): 0 2021-01-25 10:25:57,186 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 10:26:03,185 epoch 2 - iter 11/111 - loss 10.18805027 - samples/sec: 29.88 - lr: 0.100000 2021-01-25 10:26:09,759 epoch 2 - iter 22/111 - loss 10.30833953 - samples/sec: 26.78 - lr: 0.100000 2021-01-25 10:26:15,238 epoch 2 - iter 33/111 - loss 10.55462659 - samples/sec: 32.14 - lr: 0.100000 2021-01-25 10:26:20,959 epoch 2 - iter 44/111 - loss 10.55391189 - samples/sec: 30.78 - lr: 0.100000 2021-01-25 10:26:28,486 epoch 2 - iter 55/111 - loss 11.07268034 - samples/sec: 23.39 - lr: 0.100000 2021-01-25 10:26:35,605 epoch 2 - iter 66/111 - loss 10.69464526 - samples/sec: 24.73 - lr: 0.100000 2021-01-25 10:26:41,533 epoch 2 - iter 77/111 - loss 10.97407639 - samples/sec: 29.70 - lr: 0.100000 2021-01-25 10:26:48,341 epoch 2 - iter 88/111 - loss 10.84447764 - samples/sec: 25.86 - lr: 0.100000 2021-01-25 10:26:54,238 epoch 2 - iter 99/111 - loss 10.78961229 - samples/sec: 29.86 - lr: 0.100000 2021-01-25 10:27:00,189 epoch 2 - iter 110/111 - loss 10.77094948 - samples/sec: 29.58 - lr: 0.100000 2021-01-25 10:27:00,244 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:27:00,246 EPOCH 2 done: loss 10.6942 - lr 0.1000000 2021-01-25 10:27:00,247 BAD EPOCHS (no improvement): 0 2021-01-25 10:27:32,723 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 10:27:48,364 epoch 3 - iter 11/111 - loss 9.19014961 - samples/sec: 27.02 - lr: 0.100000 2021-01-25 10:27:54,107 epoch 3 - iter 22/111 - loss 10.33909940 - samples/sec: 30.66 - lr: 0.100000 2021-01-25 10:28:00,251 epoch 3 - iter 33/111 - loss 9.59317412 - samples/sec: 28.66 - lr: 0.100000 
2021-01-25 10:28:06,546 epoch 3 - iter 44/111 - loss 9.39935520 - samples/sec: 27.97 - lr: 0.100000 2021-01-25 10:28:12,817 epoch 3 - iter 55/111 - loss 9.47141300 - samples/sec: 28.08 - lr: 0.100000 2021-01-25 10:28:18,610 epoch 3 - iter 66/111 - loss 9.18817192 - samples/sec: 30.39 - lr: 0.100000 2021-01-25 10:28:23,990 epoch 3 - iter 77/111 - loss 8.91675316 - samples/sec: 32.73 - lr: 0.100000 2021-01-25 10:28:29,217 epoch 3 - iter 88/111 - loss 8.72033167 - samples/sec: 33.68 - lr: 0.100000 2021-01-25 10:28:36,200 epoch 3 - iter 99/111 - loss 8.84041737 - samples/sec: 25.21 - lr: 0.100000 2021-01-25 10:28:42,426 epoch 3 - iter 110/111 - loss 8.67852450 - samples/sec: 28.28 - lr: 0.100000 2021-01-25 10:28:42,612 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:28:42,614 EPOCH 3 done: loss 8.6985 - lr 0.1000000 2021-01-25 10:28:42,615 BAD EPOCHS (no improvement): 0 2021-01-25 10:29:23,268 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 10:29:29,888 epoch 4 - iter 11/111 - loss 6.79467323 - samples/sec: 26.94 - lr: 0.100000 2021-01-25 10:29:36,591 epoch 4 - iter 22/111 - loss 6.96387667 - samples/sec: 26.26 - lr: 0.100000 2021-01-25 10:29:42,767 epoch 4 - iter 33/111 - loss 7.02527775 - samples/sec: 28.51 - lr: 0.100000 2021-01-25 10:29:48,930 epoch 4 - iter 44/111 - loss 7.16199169 - samples/sec: 28.57 - lr: 0.100000 2021-01-25 10:29:54,445 epoch 4 - iter 55/111 - loss 6.97783715 - samples/sec: 31.93 - lr: 0.100000 2021-01-25 10:30:00,205 epoch 4 - iter 66/111 - loss 7.18816653 - samples/sec: 30.57 - lr: 0.100000 2021-01-25 10:30:05,510 epoch 4 - iter 77/111 - loss 7.00425652 - samples/sec: 33.19 - lr: 0.100000 2021-01-25 10:30:11,261 epoch 4 - iter 88/111 - loss 6.98279506 - samples/sec: 30.61 - lr: 0.100000 2021-01-25 10:30:16,854 epoch 4 - iter 99/111 - loss 7.10576206 
- samples/sec: 31.48 - lr: 0.100000 2021-01-25 10:30:24,373 epoch 4 - iter 110/111 - loss 7.01915045 - samples/sec: 23.42 - lr: 0.100000 2021-01-25 10:30:24,459 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:30:24,460 EPOCH 4 done: loss 6.9768 - lr 0.1000000 2021-01-25 10:30:24,462 BAD EPOCHS (no improvement): 0 2021-01-25 10:31:06,044 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 10:31:13,480 epoch 5 - iter 11/111 - loss 7.12273786 - samples/sec: 23.88 - lr: 0.100000 2021-01-25 10:31:19,017 epoch 5 - iter 22/111 - loss 6.78446281 - samples/sec: 31.80 - lr: 0.100000 2021-01-25 10:31:26,232 epoch 5 - iter 33/111 - loss 6.65425849 - samples/sec: 24.40 - lr: 0.100000 2021-01-25 10:31:32,082 epoch 5 - iter 44/111 - loss 6.80952236 - samples/sec: 30.10 - lr: 0.100000 2021-01-25 10:31:38,110 epoch 5 - iter 55/111 - loss 7.03379300 - samples/sec: 29.21 - lr: 0.100000 2021-01-25 10:31:44,252 epoch 5 - iter 66/111 - loss 7.01593109 - samples/sec: 28.66 - lr: 0.100000 2021-01-25 10:31:50,285 epoch 5 - iter 77/111 - loss 6.78503151 - samples/sec: 29.19 - lr: 0.100000 2021-01-25 10:31:56,528 epoch 5 - iter 88/111 - loss 6.68158661 - samples/sec: 28.20 - lr: 0.100000 2021-01-25 10:32:02,192 epoch 5 - iter 99/111 - loss 6.57793334 - samples/sec: 31.09 - lr: 0.100000 2021-01-25 10:32:08,272 epoch 5 - iter 110/111 - loss 6.51472998 - samples/sec: 28.96 - lr: 0.100000 2021-01-25 10:32:08,437 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:32:08,439 EPOCH 5 done: loss 6.5798 - lr 0.1000000 2021-01-25 10:32:08,440 BAD EPOCHS (no improvement): 0 2021-01-25 10:32:45,837 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting 
embeddings 2021-01-25 10:32:54,629 epoch 6 - iter 11/111 - loss 5.72094809 - samples/sec: 32.25 - lr: 0.100000 2021-01-25 10:33:00,130 epoch 6 - iter 22/111 - loss 5.98198058 - samples/sec: 32.01 - lr: 0.100000 2021-01-25 10:33:07,003 epoch 6 - iter 33/111 - loss 6.39245874 - samples/sec: 25.62 - lr: 0.100000 2021-01-25 10:33:12,270 epoch 6 - iter 44/111 - loss 5.99628868 - samples/sec: 33.43 - lr: 0.100000 2021-01-25 10:33:18,420 epoch 6 - iter 55/111 - loss 5.95702457 - samples/sec: 28.63 - lr: 0.100000 2021-01-25 10:33:24,818 epoch 6 - iter 66/111 - loss 6.05384102 - samples/sec: 27.52 - lr: 0.100000 2021-01-25 10:33:30,321 epoch 6 - iter 77/111 - loss 5.79583566 - samples/sec: 32.00 - lr: 0.100000 2021-01-25 10:33:37,268 epoch 6 - iter 88/111 - loss 5.77474942 - samples/sec: 25.34 - lr: 0.100000 2021-01-25 10:33:43,563 epoch 6 - iter 99/111 - loss 5.75745162 - samples/sec: 27.97 - lr: 0.100000 2021-01-25 10:33:50,255 epoch 6 - iter 110/111 - loss 5.89956489 - samples/sec: 26.31 - lr: 0.100000 2021-01-25 10:33:50,368 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:33:50,370 EPOCH 6 done: loss 5.8923 - lr 0.1000000 2021-01-25 10:33:50,371 BAD EPOCHS (no improvement): 0 2021-01-25 10:34:21,739 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 10:34:27,150 epoch 7 - iter 11/111 - loss 5.71705688 - samples/sec: 32.98 - lr: 0.100000 2021-01-25 10:34:36,615 epoch 7 - iter 22/111 - loss 6.48266622 - samples/sec: 24.78 - lr: 0.100000 2021-01-25 10:34:43,626 epoch 7 - iter 33/111 - loss 6.74974753 - samples/sec: 25.11 - lr: 0.100000 2021-01-25 10:34:49,860 epoch 7 - iter 44/111 - loss 6.39683523 - samples/sec: 28.24 - lr: 0.100000 2021-01-25 10:34:55,808 epoch 7 - iter 55/111 - loss 6.05628885 - samples/sec: 29.60 - lr: 0.100000 2021-01-25 10:35:01,487 epoch 7 - iter 66/111 - loss 
5.87755394 - samples/sec: 31.01 - lr: 0.100000 2021-01-25 10:35:07,559 epoch 7 - iter 77/111 - loss 5.69736838 - samples/sec: 29.00 - lr: 0.100000 2021-01-25 10:35:14,293 epoch 7 - iter 88/111 - loss 5.60839416 - samples/sec: 26.14 - lr: 0.100000 2021-01-25 10:35:20,886 epoch 7 - iter 99/111 - loss 5.64187315 - samples/sec: 26.70 - lr: 0.100000 2021-01-25 10:35:26,770 epoch 7 - iter 110/111 - loss 5.51890169 - samples/sec: 29.93 - lr: 0.100000 2021-01-25 10:35:26,939 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:35:26,941 EPOCH 7 done: loss 5.4743 - lr 0.1000000 2021-01-25 10:35:26,943 BAD EPOCHS (no improvement): 0 2021-01-25 10:36:04,765 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 10:36:11,157 epoch 8 - iter 11/111 - loss 4.83452242 - samples/sec: 27.81 - lr: 0.100000 2021-01-25 10:36:17,406 epoch 8 - iter 22/111 - loss 4.57567427 - samples/sec: 28.18 - lr: 0.100000 2021-01-25 10:36:24,121 epoch 8 - iter 33/111 - loss 4.59209776 - samples/sec: 26.22 - lr: 0.100000 2021-01-25 10:36:29,858 epoch 8 - iter 44/111 - loss 4.95231758 - samples/sec: 30.69 - lr: 0.100000 2021-01-25 10:36:35,865 epoch 8 - iter 55/111 - loss 4.92243188 - samples/sec: 29.31 - lr: 0.100000 2021-01-25 10:36:42,722 epoch 8 - iter 66/111 - loss 4.97758638 - samples/sec: 25.68 - lr: 0.100000 2021-01-25 10:36:48,853 epoch 8 - iter 77/111 - loss 4.96645152 - samples/sec: 28.72 - lr: 0.100000 2021-01-25 10:36:53,957 epoch 8 - iter 88/111 - loss 4.87177904 - samples/sec: 34.50 - lr: 0.100000 2021-01-25 10:37:00,283 epoch 8 - iter 99/111 - loss 4.99280324 - samples/sec: 27.83 - lr: 0.100000 2021-01-25 10:37:07,721 epoch 8 - iter 110/111 - loss 4.97948530 - samples/sec: 23.67 - lr: 0.100000 2021-01-25 10:37:07,829 
---------------------------------------------------------------------------------------------------- 2021-01-25 10:37:07,830 EPOCH 8 done: loss 4.9616 - lr 0.1000000 2021-01-25 10:37:07,831 BAD EPOCHS (no improvement): 0 2021-01-25 10:37:48,631 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 10:37:54,758 epoch 9 - iter 11/111 - loss 4.03332498 - samples/sec: 28.96 - lr: 0.100000 2021-01-25 10:38:00,884 epoch 9 - iter 22/111 - loss 4.12423093 - samples/sec: 28.75 - lr: 0.100000 2021-01-25 10:38:06,179 epoch 9 - iter 33/111 - loss 3.96145048 - samples/sec: 33.25 - lr: 0.100000 2021-01-25 10:38:12,600 epoch 9 - iter 44/111 - loss 4.27605110 - samples/sec: 27.42 - lr: 0.100000 2021-01-25 10:38:19,267 epoch 9 - iter 55/111 - loss 4.75408689 - samples/sec: 26.41 - lr: 0.100000 2021-01-25 10:38:26,074 epoch 9 - iter 66/111 - loss 5.00143290 - samples/sec: 25.87 - lr: 0.100000 2021-01-25 10:38:32,395 epoch 9 - iter 77/111 - loss 4.87554965 - samples/sec: 27.85 - lr: 0.100000 2021-01-25 10:38:38,356 epoch 9 - iter 88/111 - loss 4.95585848 - samples/sec: 29.54 - lr: 0.100000 2021-01-25 10:38:44,373 epoch 9 - iter 99/111 - loss 4.92190668 - samples/sec: 29.26 - lr: 0.100000 2021-01-25 10:38:50,575 epoch 9 - iter 110/111 - loss 4.84930390 - samples/sec: 28.39 - lr: 0.100000 2021-01-25 10:38:50,674 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:38:50,675 EPOCH 9 done: loss 4.8162 - lr 0.1000000 2021-01-25 10:38:50,677 BAD EPOCHS (no improvement): 0 2021-01-25 10:39:30,572 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 10:39:36,829 epoch 10 - iter 11/111 - loss 4.69822099 - samples/sec: 28.33 - lr: 0.100000 2021-01-25 10:39:43,649 epoch 10 - iter 22/111 - loss 
5.36886237 - samples/sec: 25.82 - lr: 0.100000 2021-01-25 10:39:50,021 epoch 10 - iter 33/111 - loss 4.81391519 - samples/sec: 27.63 - lr: 0.100000 2021-01-25 10:39:56,348 epoch 10 - iter 44/111 - loss 4.79041301 - samples/sec: 27.83 - lr: 0.100000 2021-01-25 10:40:03,125 epoch 10 - iter 55/111 - loss 4.67135966 - samples/sec: 25.98 - lr: 0.100000 2021-01-25 10:40:10,932 epoch 10 - iter 66/111 - loss 4.70814198 - samples/sec: 22.55 - lr: 0.100000 2021-01-25 10:40:16,611 epoch 10 - iter 77/111 - loss 4.64808380 - samples/sec: 31.01 - lr: 0.100000 2021-01-25 10:40:22,158 epoch 10 - iter 88/111 - loss 4.64051295 - samples/sec: 31.74 - lr: 0.100000 2021-01-25 10:40:28,254 epoch 10 - iter 99/111 - loss 4.56246087 - samples/sec: 28.88 - lr: 0.100000 2021-01-25 10:40:33,677 epoch 10 - iter 110/111 - loss 4.61198434 - samples/sec: 32.47 - lr: 0.100000 2021-01-25 10:40:33,770 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:40:33,772 EPOCH 10 done: loss 4.5847 - lr 0.1000000 2021-01-25 10:40:33,774 BAD EPOCHS (no improvement): 0 2021-01-25 10:41:06,168 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 10:41:12,392 epoch 11 - iter 11/111 - loss 4.56382496 - samples/sec: 28.49 - lr: 0.100000 2021-01-25 10:41:21,588 epoch 11 - iter 22/111 - loss 4.40316660 - samples/sec: 27.02 - lr: 0.100000 2021-01-25 10:41:27,786 epoch 11 - iter 33/111 - loss 4.20600127 - samples/sec: 28.41 - lr: 0.100000 2021-01-25 10:41:33,650 epoch 11 - iter 44/111 - loss 4.13510584 - samples/sec: 30.02 - lr: 0.100000 2021-01-25 10:41:40,178 epoch 11 - iter 55/111 - loss 4.18275395 - samples/sec: 26.97 - lr: 0.100000 2021-01-25 10:41:46,411 epoch 11 - iter 66/111 - loss 4.46802516 - samples/sec: 28.25 - lr: 0.100000 2021-01-25 10:41:52,223 epoch 11 - iter 77/111 - loss 4.53560302 - samples/sec: 30.30 - lr: 0.100000 
2021-01-25 10:41:57,900 epoch 11 - iter 88/111 - loss 4.39049167 - samples/sec: 31.02 - lr: 0.100000 2021-01-25 10:42:04,240 epoch 11 - iter 99/111 - loss 4.37021638 - samples/sec: 27.77 - lr: 0.100000 2021-01-25 10:42:10,404 epoch 11 - iter 110/111 - loss 4.53993713 - samples/sec: 28.56 - lr: 0.100000 2021-01-25 10:42:10,496 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:42:10,498 EPOCH 11 done: loss 4.5023 - lr 0.1000000 2021-01-25 10:42:10,499 BAD EPOCHS (no improvement): 0 2021-01-25 10:42:41,977 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 10:42:47,948 epoch 12 - iter 11/111 - loss 5.90833831 - samples/sec: 29.72 - lr: 0.100000 2021-01-25 10:42:54,814 epoch 12 - iter 22/111 - loss 4.99683451 - samples/sec: 33.32 - lr: 0.100000 2021-01-25 10:43:00,703 epoch 12 - iter 33/111 - loss 4.87504964 - samples/sec: 29.90 - lr: 0.100000 2021-01-25 10:43:06,928 epoch 12 - iter 44/111 - loss 4.67687361 - samples/sec: 28.29 - lr: 0.100000 2021-01-25 10:43:13,530 epoch 12 - iter 55/111 - loss 4.54725967 - samples/sec: 26.67 - lr: 0.100000 2021-01-25 10:43:19,080 epoch 12 - iter 66/111 - loss 4.61625748 - samples/sec: 31.72 - lr: 0.100000 2021-01-25 10:43:25,690 epoch 12 - iter 77/111 - loss 4.88363346 - samples/sec: 26.64 - lr: 0.100000 2021-01-25 10:43:32,796 epoch 12 - iter 88/111 - loss 4.97682306 - samples/sec: 24.78 - lr: 0.100000 2021-01-25 10:43:38,314 epoch 12 - iter 99/111 - loss 4.99162579 - samples/sec: 31.91 - lr: 0.100000 2021-01-25 10:43:43,992 epoch 12 - iter 110/111 - loss 4.87017690 - samples/sec: 31.01 - lr: 0.100000 2021-01-25 10:43:44,166 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:43:44,167 EPOCH 12 done: loss 4.9315 - lr 0.1000000 2021-01-25 10:43:44,169 BAD EPOCHS (no improvement): 1 
2021-01-25 10:44:24,274 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 10:44:32,223 epoch 13 - iter 11/111 - loss 4.49722214 - samples/sec: 22.29 - lr: 0.100000 2021-01-25 10:44:39,436 epoch 13 - iter 22/111 - loss 4.24567978 - samples/sec: 24.41 - lr: 0.100000 2021-01-25 10:44:45,070 epoch 13 - iter 33/111 - loss 4.07326443 - samples/sec: 31.25 - lr: 0.100000 2021-01-25 10:44:50,669 epoch 13 - iter 44/111 - loss 4.08721220 - samples/sec: 31.45 - lr: 0.100000 2021-01-25 10:44:56,673 epoch 13 - iter 55/111 - loss 4.13151048 - samples/sec: 29.33 - lr: 0.100000 2021-01-25 10:45:02,068 epoch 13 - iter 66/111 - loss 4.30481188 - samples/sec: 32.64 - lr: 0.100000 2021-01-25 10:45:08,192 epoch 13 - iter 77/111 - loss 4.50275675 - samples/sec: 28.75 - lr: 0.100000 2021-01-25 10:45:14,369 epoch 13 - iter 88/111 - loss 4.42745535 - samples/sec: 28.50 - lr: 0.100000 2021-01-25 10:45:19,504 epoch 13 - iter 99/111 - loss 4.38588194 - samples/sec: 34.29 - lr: 0.100000 2021-01-25 10:45:26,292 epoch 13 - iter 110/111 - loss 4.41341555 - samples/sec: 25.94 - lr: 0.100000 2021-01-25 10:45:26,421 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:45:26,424 EPOCH 13 done: loss 4.4409 - lr 0.1000000 2021-01-25 10:45:26,425 BAD EPOCHS (no improvement): 0 2021-01-25 10:46:03,627 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 10:46:09,385 epoch 14 - iter 11/111 - loss 3.34990287 - samples/sec: 30.81 - lr: 0.100000 2021-01-25 10:46:14,989 epoch 14 - iter 22/111 - loss 3.77477489 - samples/sec: 31.42 - lr: 0.100000 2021-01-25 10:46:21,424 epoch 14 - iter 33/111 - loss 4.09334437 - samples/sec: 27.36 - lr: 0.100000 2021-01-25 10:46:27,488 epoch 14 - iter 44/111 - loss 
3.97580681 - samples/sec: 29.04 - lr: 0.100000 2021-01-25 10:46:33,044 epoch 14 - iter 55/111 - loss 3.98205665 - samples/sec: 31.69 - lr: 0.100000 2021-01-25 10:46:39,999 epoch 14 - iter 66/111 - loss 4.15114326 - samples/sec: 25.31 - lr: 0.100000 2021-01-25 10:46:46,613 epoch 14 - iter 77/111 - loss 4.07458371 - samples/sec: 26.62 - lr: 0.100000 2021-01-25 10:46:53,154 epoch 14 - iter 88/111 - loss 4.09190993 - samples/sec: 26.91 - lr: 0.100000 2021-01-25 10:46:59,415 epoch 14 - iter 99/111 - loss 4.33675907 - samples/sec: 28.12 - lr: 0.100000 2021-01-25 10:47:05,512 epoch 14 - iter 110/111 - loss 4.24677618 - samples/sec: 28.88 - lr: 0.100000 2021-01-25 10:47:05,604 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:47:05,605 EPOCH 14 done: loss 4.2480 - lr 0.1000000 2021-01-25 10:47:05,607 BAD EPOCHS (no improvement): 0 2021-01-25 10:47:39,266 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 10:47:45,746 epoch 15 - iter 11/111 - loss 4.84905434 - samples/sec: 27.40 - lr: 0.100000 2021-01-25 10:47:52,518 epoch 15 - iter 22/111 - loss 4.42149465 - samples/sec: 26.00 - lr: 0.100000 2021-01-25 10:47:58,257 epoch 15 - iter 33/111 - loss 4.35310003 - samples/sec: 30.68 - lr: 0.100000 2021-01-25 10:48:04,382 epoch 15 - iter 44/111 - loss 4.08566290 - samples/sec: 28.75 - lr: 0.100000 2021-01-25 10:48:10,112 epoch 15 - iter 55/111 - loss 4.01557928 - samples/sec: 30.73 - lr: 0.100000 2021-01-25 10:48:16,130 epoch 15 - iter 66/111 - loss 4.15357811 - samples/sec: 29.26 - lr: 0.100000 2021-01-25 10:48:22,464 epoch 15 - iter 77/111 - loss 4.31850308 - samples/sec: 27.80 - lr: 0.100000 2021-01-25 10:48:29,213 epoch 15 - iter 88/111 - loss 4.32283522 - samples/sec: 26.09 - lr: 0.100000 2021-01-25 10:48:34,778 epoch 15 - iter 99/111 - loss 4.25586665 - samples/sec: 31.64 - lr: 0.100000 
2021-01-25 10:48:40,845 epoch 15 - iter 110/111 - loss 4.18545671 - samples/sec: 29.02 - lr: 0.100000 2021-01-25 10:48:40,913 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:48:40,915 EPOCH 15 done: loss 4.1483 - lr 0.1000000 2021-01-25 10:48:40,916 BAD EPOCHS (no improvement): 0 2021-01-25 10:49:23,213 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 10:49:29,388 epoch 16 - iter 11/111 - loss 4.67221058 - samples/sec: 28.76 - lr: 0.100000 2021-01-25 10:49:34,753 epoch 16 - iter 22/111 - loss 4.13016958 - samples/sec: 32.81 - lr: 0.100000 2021-01-25 10:49:41,447 epoch 16 - iter 33/111 - loss 3.81629994 - samples/sec: 26.31 - lr: 0.100000 2021-01-25 10:49:46,595 epoch 16 - iter 44/111 - loss 3.72972661 - samples/sec: 34.20 - lr: 0.100000 2021-01-25 10:49:53,267 epoch 16 - iter 55/111 - loss 3.82915365 - samples/sec: 26.39 - lr: 0.100000 2021-01-25 10:49:59,078 epoch 16 - iter 66/111 - loss 4.01950283 - samples/sec: 30.30 - lr: 0.100000 2021-01-25 10:50:03,771 epoch 16 - iter 77/111 - loss 3.97010807 - samples/sec: 37.53 - lr: 0.100000 2021-01-25 10:50:10,556 epoch 16 - iter 88/111 - loss 4.07479574 - samples/sec: 25.95 - lr: 0.100000 2021-01-25 10:50:17,219 epoch 16 - iter 99/111 - loss 4.07658210 - samples/sec: 26.43 - lr: 0.100000 2021-01-25 10:50:23,340 epoch 16 - iter 110/111 - loss 4.07201452 - samples/sec: 28.76 - lr: 0.100000 2021-01-25 10:50:23,425 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:50:23,427 EPOCH 16 done: loss 4.0881 - lr 0.1000000 2021-01-25 10:50:23,429 BAD EPOCHS (no improvement): 0 2021-01-25 10:50:55,720 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 
10:51:09,863 epoch 17 - iter 11/111 - loss 3.91294293 - samples/sec: 30.55 - lr: 0.100000 2021-01-25 10:51:15,941 epoch 17 - iter 22/111 - loss 3.95785974 - samples/sec: 28.97 - lr: 0.100000 2021-01-25 10:51:21,702 epoch 17 - iter 33/111 - loss 4.18957638 - samples/sec: 30.56 - lr: 0.100000 2021-01-25 10:51:28,507 epoch 17 - iter 44/111 - loss 4.22830299 - samples/sec: 25.87 - lr: 0.100000 2021-01-25 10:51:34,804 epoch 17 - iter 55/111 - loss 4.08225724 - samples/sec: 27.96 - lr: 0.100000 2021-01-25 10:51:40,796 epoch 17 - iter 66/111 - loss 4.08138589 - samples/sec: 29.39 - lr: 0.100000 2021-01-25 10:51:47,087 epoch 17 - iter 77/111 - loss 3.95975426 - samples/sec: 27.99 - lr: 0.100000 2021-01-25 10:51:54,381 epoch 17 - iter 88/111 - loss 4.03049670 - samples/sec: 24.14 - lr: 0.100000 2021-01-25 10:52:00,314 epoch 17 - iter 99/111 - loss 4.04535589 - samples/sec: 29.68 - lr: 0.100000 2021-01-25 10:52:05,885 epoch 17 - iter 110/111 - loss 3.98108125 - samples/sec: 31.61 - lr: 0.100000 2021-01-25 10:52:05,992 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:52:05,993 EPOCH 17 done: loss 3.9852 - lr 0.1000000 2021-01-25 10:52:05,995 BAD EPOCHS (no improvement): 0 2021-01-25 10:52:41,921 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 10:52:47,916 epoch 18 - iter 11/111 - loss 4.47138399 - samples/sec: 29.58 - lr: 0.100000 2021-01-25 10:52:53,928 epoch 18 - iter 22/111 - loss 4.02970250 - samples/sec: 29.29 - lr: 0.100000 2021-01-25 10:52:59,772 epoch 18 - iter 33/111 - loss 3.72331607 - samples/sec: 30.13 - lr: 0.100000 2021-01-25 10:53:05,842 epoch 18 - iter 44/111 - loss 3.81717015 - samples/sec: 29.01 - lr: 0.100000 2021-01-25 10:53:12,043 epoch 18 - iter 55/111 - loss 3.94470713 - samples/sec: 28.40 - lr: 0.100000 2021-01-25 10:53:19,594 epoch 18 - iter 66/111 - loss 
3.94664986 - samples/sec: 23.32 - lr: 0.100000 2021-01-25 10:53:25,908 epoch 18 - iter 77/111 - loss 4.00742454 - samples/sec: 27.89 - lr: 0.100000 2021-01-25 10:53:32,544 epoch 18 - iter 88/111 - loss 4.13232218 - samples/sec: 26.53 - lr: 0.100000 2021-01-25 10:53:39,192 epoch 18 - iter 99/111 - loss 4.02311627 - samples/sec: 26.48 - lr: 0.100000 2021-01-25 10:53:44,978 epoch 18 - iter 110/111 - loss 3.94061089 - samples/sec: 30.43 - lr: 0.100000 2021-01-25 10:53:45,044 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:53:45,046 EPOCH 18 done: loss 3.9053 - lr 0.1000000 2021-01-25 10:53:45,050 BAD EPOCHS (no improvement): 0 2021-01-25 10:54:18,998 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 10:54:25,784 epoch 19 - iter 11/111 - loss 3.36324605 - samples/sec: 26.09 - lr: 0.100000 2021-01-25 10:54:33,028 epoch 19 - iter 22/111 - loss 3.85746611 - samples/sec: 24.30 - lr: 0.100000 2021-01-25 10:54:39,061 epoch 19 - iter 33/111 - loss 3.68025916 - samples/sec: 29.19 - lr: 0.100000 2021-01-25 10:54:44,841 epoch 19 - iter 44/111 - loss 3.57081627 - samples/sec: 30.46 - lr: 0.100000 2021-01-25 10:54:50,816 epoch 19 - iter 55/111 - loss 3.79188262 - samples/sec: 29.47 - lr: 0.100000 2021-01-25 10:54:57,148 epoch 19 - iter 66/111 - loss 3.74162868 - samples/sec: 27.80 - lr: 0.100000 2021-01-25 10:55:03,228 epoch 19 - iter 77/111 - loss 3.80293375 - samples/sec: 28.96 - lr: 0.100000 2021-01-25 10:55:09,269 epoch 19 - iter 88/111 - loss 3.83735181 - samples/sec: 29.15 - lr: 0.100000 2021-01-25 10:55:14,984 epoch 19 - iter 99/111 - loss 3.78604484 - samples/sec: 30.81 - lr: 0.100000 2021-01-25 10:55:21,006 epoch 19 - iter 110/111 - loss 3.78007814 - samples/sec: 29.24 - lr: 0.100000 2021-01-25 10:55:21,349 
---------------------------------------------------------------------------------------------------- 2021-01-25 10:55:21,350 EPOCH 19 done: loss 3.9585 - lr 0.1000000 2021-01-25 10:55:21,352 BAD EPOCHS (no improvement): 1 2021-01-25 10:56:01,430 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 10:56:08,691 epoch 20 - iter 11/111 - loss 3.34550040 - samples/sec: 24.43 - lr: 0.100000 2021-01-25 10:56:15,795 epoch 20 - iter 22/111 - loss 3.64379755 - samples/sec: 24.78 - lr: 0.100000 2021-01-25 10:56:21,410 epoch 20 - iter 33/111 - loss 3.58633875 - samples/sec: 31.36 - lr: 0.100000 2021-01-25 10:56:26,894 epoch 20 - iter 44/111 - loss 3.54973823 - samples/sec: 32.10 - lr: 0.100000 2021-01-25 10:56:33,383 epoch 20 - iter 55/111 - loss 3.67109567 - samples/sec: 27.13 - lr: 0.100000 2021-01-25 10:56:38,796 epoch 20 - iter 66/111 - loss 3.66164948 - samples/sec: 32.53 - lr: 0.100000 2021-01-25 10:56:44,968 epoch 20 - iter 77/111 - loss 3.62982319 - samples/sec: 28.52 - lr: 0.100000 2021-01-25 10:56:51,072 epoch 20 - iter 88/111 - loss 3.64661483 - samples/sec: 28.85 - lr: 0.100000 2021-01-25 10:56:56,235 epoch 20 - iter 99/111 - loss 3.60557880 - samples/sec: 34.11 - lr: 0.100000 2021-01-25 10:57:02,160 epoch 20 - iter 110/111 - loss 3.74337495 - samples/sec: 29.72 - lr: 0.100000 2021-01-25 10:57:02,205 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:57:02,207 EPOCH 20 done: loss 3.7098 - lr 0.1000000 2021-01-25 10:57:02,208 BAD EPOCHS (no improvement): 0 2021-01-25 10:57:40,962 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 10:57:47,285 epoch 21 - iter 11/111 - loss 3.86662451 - samples/sec: 27.97 - lr: 0.100000 2021-01-25 10:57:52,470 epoch 21 - iter 
22/111 - loss 3.40074368 - samples/sec: 33.96 - lr: 0.100000 2021-01-25 10:57:58,417 epoch 21 - iter 33/111 - loss 3.58548474 - samples/sec: 29.60 - lr: 0.100000 2021-01-25 10:58:04,761 epoch 21 - iter 44/111 - loss 3.54917274 - samples/sec: 27.75 - lr: 0.100000 2021-01-25 10:58:10,414 epoch 21 - iter 55/111 - loss 3.48718619 - samples/sec: 31.14 - lr: 0.100000 2021-01-25 10:58:16,269 epoch 21 - iter 66/111 - loss 3.43532407 - samples/sec: 30.07 - lr: 0.100000 2021-01-25 10:58:22,342 epoch 21 - iter 77/111 - loss 3.56432442 - samples/sec: 28.99 - lr: 0.100000 2021-01-25 10:58:28,219 epoch 21 - iter 88/111 - loss 3.45780471 - samples/sec: 29.96 - lr: 0.100000 2021-01-25 10:58:33,373 epoch 21 - iter 99/111 - loss 3.47687646 - samples/sec: 34.16 - lr: 0.100000 2021-01-25 10:58:39,868 epoch 21 - iter 110/111 - loss 3.59270786 - samples/sec: 27.11 - lr: 0.100000 2021-01-25 10:58:39,916 ---------------------------------------------------------------------------------------------------- 2021-01-25 10:58:39,918 EPOCH 21 done: loss 3.5603 - lr 0.1000000 2021-01-25 10:58:39,919 BAD EPOCHS (no improvement): 0 2021-01-25 10:59:17,498 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 10:59:22,994 epoch 22 - iter 11/111 - loss 3.60258175 - samples/sec: 32.21 - lr: 0.100000 2021-01-25 10:59:28,617 epoch 22 - iter 22/111 - loss 3.33457632 - samples/sec: 31.32 - lr: 0.100000 2021-01-25 10:59:34,673 epoch 22 - iter 33/111 - loss 3.52667694 - samples/sec: 29.07 - lr: 0.100000 2021-01-25 10:59:41,549 epoch 22 - iter 44/111 - loss 3.61057627 - samples/sec: 25.60 - lr: 0.100000 2021-01-25 10:59:47,770 epoch 22 - iter 55/111 - loss 3.58373189 - samples/sec: 28.30 - lr: 0.100000 2021-01-25 10:59:54,099 epoch 22 - iter 66/111 - loss 3.81865191 - samples/sec: 27.83 - lr: 0.100000 2021-01-25 10:59:59,891 epoch 22 - iter 77/111 - loss 3.72697158 - samples/sec: 30.40 - 
lr: 0.100000 2021-01-25 11:00:05,045 epoch 22 - iter 88/111 - loss 3.59734723 - samples/sec: 34.16 - lr: 0.100000 2021-01-25 11:00:11,190 epoch 22 - iter 99/111 - loss 3.58061069 - samples/sec: 28.65 - lr: 0.100000 2021-01-25 11:00:16,912 epoch 22 - iter 110/111 - loss 3.63909051 - samples/sec: 30.77 - lr: 0.100000 2021-01-25 11:00:17,020 ---------------------------------------------------------------------------------------------------- 2021-01-25 11:00:17,022 EPOCH 22 done: loss 3.6081 - lr 0.1000000 2021-01-25 11:00:17,022 BAD EPOCHS (no improvement): 1 2021-01-25 11:00:51,480 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 11:00:56,588 epoch 23 - iter 11/111 - loss 3.05639503 - samples/sec: 34.66 - lr: 0.100000 2021-01-25 11:01:02,612 epoch 23 - iter 22/111 - loss 3.19333094 - samples/sec: 29.23 - lr: 0.100000 2021-01-25 11:01:08,102 epoch 23 - iter 33/111 - loss 3.29475918 - samples/sec: 32.07 - lr: 0.100000 2021-01-25 11:01:13,807 epoch 23 - iter 44/111 - loss 3.27113770 - samples/sec: 30.86 - lr: 0.100000 2021-01-25 11:01:19,332 epoch 23 - iter 55/111 - loss 3.23012321 - samples/sec: 31.86 - lr: 0.100000 2021-01-25 11:01:24,790 epoch 23 - iter 66/111 - loss 3.30221744 - samples/sec: 32.26 - lr: 0.100000 2021-01-25 11:01:30,897 epoch 23 - iter 77/111 - loss 3.34762002 - samples/sec: 28.83 - lr: 0.100000 2021-01-25 11:01:36,896 epoch 23 - iter 88/111 - loss 3.35990208 - samples/sec: 29.35 - lr: 0.100000 2021-01-25 11:01:42,995 epoch 23 - iter 99/111 - loss 3.52005647 - samples/sec: 28.87 - lr: 0.100000 2021-01-25 11:01:49,448 epoch 23 - iter 110/111 - loss 3.53163481 - samples/sec: 27.29 - lr: 0.100000 2021-01-25 11:01:49,772 ---------------------------------------------------------------------------------------------------- 2021-01-25 11:01:49,773 EPOCH 23 done: loss 3.6030 - lr 0.1000000 2021-01-25 11:01:49,775 BAD EPOCHS (no 
improvement): 2 2021-01-25 11:02:29,377 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 11:02:35,758 epoch 24 - iter 11/111 - loss 3.31554187 - samples/sec: 27.71 - lr: 0.100000 2021-01-25 11:02:42,291 epoch 24 - iter 22/111 - loss 3.24008038 - samples/sec: 26.95 - lr: 0.100000 2021-01-25 11:02:48,284 epoch 24 - iter 33/111 - loss 3.22429945 - samples/sec: 29.38 - lr: 0.100000 2021-01-25 11:02:54,930 epoch 24 - iter 44/111 - loss 3.17169051 - samples/sec: 26.49 - lr: 0.100000 2021-01-25 11:03:01,096 epoch 24 - iter 55/111 - loss 3.30037483 - samples/sec: 28.55 - lr: 0.100000 2021-01-25 11:03:06,742 epoch 24 - iter 66/111 - loss 3.34624892 - samples/sec: 31.18 - lr: 0.100000 2021-01-25 11:03:12,677 epoch 24 - iter 77/111 - loss 3.50012274 - samples/sec: 29.67 - lr: 0.100000 2021-01-25 11:03:18,594 epoch 24 - iter 88/111 - loss 3.51464304 - samples/sec: 29.75 - lr: 0.100000 2021-01-25 11:03:23,317 epoch 24 - iter 99/111 - loss 3.46995135 - samples/sec: 37.28 - lr: 0.100000 2021-01-25 11:03:29,304 epoch 24 - iter 110/111 - loss 3.43061742 - samples/sec: 29.41 - lr: 0.100000 2021-01-25 11:03:29,384 ---------------------------------------------------------------------------------------------------- 2021-01-25 11:03:29,385 EPOCH 24 done: loss 3.4215 - lr 0.1000000 2021-01-25 11:03:29,387 BAD EPOCHS (no improvement): 0 2021-01-25 11:04:04,860 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 11:04:10,901 epoch 25 - iter 11/111 - loss 2.49696062 - samples/sec: 29.27 - lr: 0.100000 2021-01-25 11:04:15,997 epoch 25 - iter 22/111 - loss 2.89337932 - samples/sec: 34.56 - lr: 0.100000 2021-01-25 11:04:21,767 epoch 25 - iter 33/111 - loss 3.31816685 - samples/sec: 30.51 - lr: 0.100000 2021-01-25 11:04:28,760 epoch 25 - iter 
44/111 - loss 3.49047833 - samples/sec: 25.18 - lr: 0.100000 2021-01-25 11:04:35,541 epoch 25 - iter 55/111 - loss 3.45313112 - samples/sec: 25.96 - lr: 0.100000 2021-01-25 11:04:40,781 epoch 25 - iter 66/111 - loss 3.42519531 - samples/sec: 33.60 - lr: 0.100000 2021-01-25 11:04:46,016 epoch 25 - iter 77/111 - loss 3.33971619 - samples/sec: 33.64 - lr: 0.100000 2021-01-25 11:04:51,549 epoch 25 - iter 88/111 - loss 3.34791419 - samples/sec: 31.82 - lr: 0.100000 2021-01-25 11:04:58,537 epoch 25 - iter 99/111 - loss 3.50522223 - samples/sec: 25.19 - lr: 0.100000 2021-01-25 11:05:04,809 epoch 25 - iter 110/111 - loss 3.47983645 - samples/sec: 28.07 - lr: 0.100000 2021-01-25 11:05:04,891 ---------------------------------------------------------------------------------------------------- 2021-01-25 11:05:04,892 EPOCH 25 done: loss 3.4488 - lr 0.1000000 2021-01-25 11:05:04,893 BAD EPOCHS (no improvement): 1 2021-01-25 11:05:45,636 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 11:05:51,515 epoch 26 - iter 11/111 - loss 3.10320271 - samples/sec: 30.09 - lr: 0.100000 2021-01-25 11:05:57,019 epoch 26 - iter 22/111 - loss 3.01847404 - samples/sec: 31.99 - lr: 0.100000 2021-01-25 11:06:02,514 epoch 26 - iter 33/111 - loss 3.25418358 - samples/sec: 32.04 - lr: 0.100000 2021-01-25 11:06:07,925 epoch 26 - iter 44/111 - loss 3.15338173 - samples/sec: 32.54 - lr: 0.100000 2021-01-25 11:06:13,548 epoch 26 - iter 55/111 - loss 3.22959855 - samples/sec: 31.31 - lr: 0.100000 2021-01-25 11:06:20,368 epoch 26 - iter 66/111 - loss 3.21547249 - samples/sec: 25.82 - lr: 0.100000 2021-01-25 11:06:26,915 epoch 26 - iter 77/111 - loss 3.33247595 - samples/sec: 26.89 - lr: 0.100000 2021-01-25 11:06:33,496 epoch 26 - iter 88/111 - loss 3.27858888 - samples/sec: 26.75 - lr: 0.100000 2021-01-25 11:06:39,607 epoch 26 - iter 99/111 - loss 3.46542293 - samples/sec: 28.81 - 
lr: 0.100000 2021-01-25 11:06:45,602 epoch 26 - iter 110/111 - loss 3.43672350 - samples/sec: 29.37 - lr: 0.100000 2021-01-25 11:06:45,644 ---------------------------------------------------------------------------------------------------- 2021-01-25 11:06:45,645 EPOCH 26 done: loss 3.4065 - lr 0.1000000 2021-01-25 11:06:45,647 BAD EPOCHS (no improvement): 0 2021-01-25 11:07:20,171 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 11:07:26,335 epoch 27 - iter 11/111 - loss 3.98224313 - samples/sec: 28.69 - lr: 0.100000 2021-01-25 11:07:31,866 epoch 27 - iter 22/111 - loss 3.56279709 - samples/sec: 31.83 - lr: 0.100000 2021-01-25 11:07:36,922 epoch 27 - iter 33/111 - loss 3.45221030 - samples/sec: 34.82 - lr: 0.100000 2021-01-25 11:07:42,692 epoch 27 - iter 44/111 - loss 3.36433793 - samples/sec: 30.51 - lr: 0.100000 2021-01-25 11:07:49,062 epoch 27 - iter 55/111 - loss 3.26470423 - samples/sec: 27.64 - lr: 0.100000 2021-01-25 11:07:56,099 epoch 27 - iter 66/111 - loss 3.23170023 - samples/sec: 25.02 - lr: 0.100000 2021-01-25 11:08:02,565 epoch 27 - iter 77/111 - loss 3.20683398 - samples/sec: 27.23 - lr: 0.100000 2021-01-25 11:08:08,627 epoch 27 - iter 88/111 - loss 3.15380178 - samples/sec: 29.05 - lr: 0.100000 2021-01-25 11:08:14,552 epoch 27 - iter 99/111 - loss 3.45005764 - samples/sec: 29.71 - lr: 0.100000 2021-01-25 11:08:19,726 epoch 27 - iter 110/111 - loss 3.46309766 - samples/sec: 34.03 - lr: 0.100000 2021-01-25 11:08:19,866 ---------------------------------------------------------------------------------------------------- 2021-01-25 11:08:19,868 EPOCH 27 done: loss 3.4552 - lr 0.1000000 2021-01-25 11:08:19,869 BAD EPOCHS (no improvement): 1 2021-01-25 11:08:59,806 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 
2021-01-25 11:09:05,855 epoch 28 - iter 11/111 - loss 2.89055541 - samples/sec: 29.24 - lr: 0.100000 2021-01-25 11:09:11,744 epoch 28 - iter 22/111 - loss 2.84430294 - samples/sec: 29.90 - lr: 0.100000 2021-01-25 11:09:17,793 epoch 28 - iter 33/111 - loss 2.93531159 - samples/sec: 29.10 - lr: 0.100000 2021-01-25 11:09:23,659 epoch 28 - iter 44/111 - loss 3.14092332 - samples/sec: 30.01 - lr: 0.100000 2021-01-25 11:09:28,983 epoch 28 - iter 55/111 - loss 3.12229096 - samples/sec: 33.07 - lr: 0.100000 2021-01-25 11:09:35,243 epoch 28 - iter 66/111 - loss 3.18796567 - samples/sec: 28.13 - lr: 0.100000 2021-01-25 11:09:42,254 epoch 28 - iter 77/111 - loss 3.14619401 - samples/sec: 25.11 - lr: 0.100000 2021-01-25 11:09:47,960 epoch 28 - iter 88/111 - loss 3.09235497 - samples/sec: 30.86 - lr: 0.100000 2021-01-25 11:09:53,263 epoch 28 - iter 99/111 - loss 3.08971684 - samples/sec: 33.20 - lr: 0.100000 2021-01-25 11:09:59,032 epoch 28 - iter 110/111 - loss 3.16876111 - samples/sec: 30.52 - lr: 0.100000 2021-01-25 11:09:59,090 ---------------------------------------------------------------------------------------------------- 2021-01-25 11:09:59,091 EPOCH 28 done: loss 3.1404 - lr 0.1000000 2021-01-25 11:09:59,092 BAD EPOCHS (no improvement): 0 2021-01-25 11:10:37,935 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 11:10:44,074 epoch 29 - iter 11/111 - loss 3.22099484 - samples/sec: 28.82 - lr: 0.100000 2021-01-25 11:10:49,804 epoch 29 - iter 22/111 - loss 3.05376521 - samples/sec: 30.73 - lr: 0.100000 2021-01-25 11:10:55,385 epoch 29 - iter 33/111 - loss 2.97911998 - samples/sec: 31.55 - lr: 0.100000 2021-01-25 11:11:00,865 epoch 29 - iter 44/111 - loss 2.89637794 - samples/sec: 32.13 - lr: 0.100000 2021-01-25 11:11:05,908 epoch 29 - iter 55/111 - loss 2.89810633 - samples/sec: 34.91 - lr: 0.100000 2021-01-25 11:11:11,582 epoch 29 - iter 66/111 - 
loss 2.88006197 - samples/sec: 31.03 - lr: 0.100000 2021-01-25 11:11:17,157 epoch 29 - iter 77/111 - loss 3.04843036 - samples/sec: 31.59 - lr: 0.100000 2021-01-25 11:11:23,679 epoch 29 - iter 88/111 - loss 3.12833392 - samples/sec: 26.99 - lr: 0.100000 2021-01-25 11:11:30,000 epoch 29 - iter 99/111 - loss 3.12368597 - samples/sec: 27.85 - lr: 0.100000 2021-01-25 11:11:36,775 epoch 29 - iter 110/111 - loss 3.18379832 - samples/sec: 25.99 - lr: 0.100000 2021-01-25 11:11:36,951 ---------------------------------------------------------------------------------------------------- 2021-01-25 11:11:36,952 EPOCH 29 done: loss 3.1957 - lr 0.1000000 2021-01-25 11:11:36,954 BAD EPOCHS (no improvement): 1 2021-01-25 11:12:17,596 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 11:12:24,477 epoch 30 - iter 11/111 - loss 3.46531834 - samples/sec: 25.68 - lr: 0.100000 2021-01-25 11:12:30,704 epoch 30 - iter 22/111 - loss 3.06817406 - samples/sec: 28.28 - lr: 0.100000 2021-01-25 11:12:37,324 epoch 30 - iter 33/111 - loss 3.18402290 - samples/sec: 26.59 - lr: 0.100000 2021-01-25 11:12:43,955 epoch 30 - iter 44/111 - loss 3.13363074 - samples/sec: 26.55 - lr: 0.100000 2021-01-25 11:12:50,451 epoch 30 - iter 55/111 - loss 3.26960817 - samples/sec: 27.10 - lr: 0.100000 2021-01-25 11:12:55,731 epoch 30 - iter 66/111 - loss 3.23844153 - samples/sec: 33.35 - lr: 0.100000 2021-01-25 11:13:01,505 epoch 30 - iter 77/111 - loss 3.31905947 - samples/sec: 30.49 - lr: 0.100000 2021-01-25 11:13:06,276 epoch 30 - iter 88/111 - loss 3.28804243 - samples/sec: 36.90 - lr: 0.100000 2021-01-25 11:13:10,826 epoch 30 - iter 99/111 - loss 3.17421255 - samples/sec: 38.70 - lr: 0.100000 2021-01-25 11:13:16,399 epoch 30 - iter 110/111 - loss 3.14507711 - samples/sec: 31.60 - lr: 0.100000 2021-01-25 11:13:16,459 
---------------------------------------------------------------------------------------------------- 2021-01-25 11:13:16,460 EPOCH 30 done: loss 3.1245 - lr 0.1000000 2021-01-25 11:13:16,461 BAD EPOCHS (no improvement): 0 2021-01-25 11:13:48,323 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 11:13:54,302 epoch 31 - iter 11/111 - loss 2.79689815 - samples/sec: 29.61 - lr: 0.100000 2021-01-25 11:14:02,827 epoch 31 - iter 22/111 - loss 3.26084276 - samples/sec: 32.70 - lr: 0.100000 2021-01-25 11:14:08,369 epoch 31 - iter 33/111 - loss 3.10742700 - samples/sec: 31.77 - lr: 0.100000 2021-01-25 11:14:14,760 epoch 31 - iter 44/111 - loss 3.27269035 - samples/sec: 27.55 - lr: 0.100000 2021-01-25 11:14:20,358 epoch 31 - iter 55/111 - loss 3.21613452 - samples/sec: 31.45 - lr: 0.100000 2021-01-25 11:14:27,100 epoch 31 - iter 66/111 - loss 3.29908703 - samples/sec: 26.11 - lr: 0.100000 2021-01-25 11:14:32,671 epoch 31 - iter 77/111 - loss 3.16357533 - samples/sec: 31.61 - lr: 0.100000 2021-01-25 11:14:37,746 epoch 31 - iter 88/111 - loss 3.26563977 - samples/sec: 34.69 - lr: 0.100000 2021-01-25 11:14:44,277 epoch 31 - iter 99/111 - loss 3.23845681 - samples/sec: 26.96 - lr: 0.100000 2021-01-25 11:14:49,849 epoch 31 - iter 110/111 - loss 3.24399293 - samples/sec: 31.60 - lr: 0.100000 2021-01-25 11:14:49,914 ---------------------------------------------------------------------------------------------------- 2021-01-25 11:14:49,915 EPOCH 31 done: loss 3.2158 - lr 0.1000000 2021-01-25 11:14:49,917 BAD EPOCHS (no improvement): 1 2021-01-25 11:15:32,153 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 11:15:37,308 epoch 32 - iter 11/111 - loss 3.74671636 - samples/sec: 34.38 - lr: 0.100000 2021-01-25 11:15:42,566 epoch 32 - iter 
22/111 - loss 3.52645666 - samples/sec: 33.48 - lr: 0.100000
2021-01-25 11:15:48,879 epoch 32 - iter 33/111 - loss 3.48688529 - samples/sec: 27.89 - lr: 0.100000
2021-01-25 11:15:56,774 epoch 32 - iter 44/111 - loss 3.64896190 - samples/sec: 22.30 - lr: 0.100000
2021-01-25 11:16:02,925 epoch 32 - iter 55/111 - loss 3.50396811 - samples/sec: 28.63 - lr: 0.100000
2021-01-25 11:16:08,170 epoch 32 - iter 66/111 - loss 3.53260714 - samples/sec: 33.57 - lr: 0.100000
2021-01-25 11:16:13,181 epoch 32 - iter 77/111 - loss 3.36382006 - samples/sec: 35.14 - lr: 0.100000
2021-01-25 11:16:18,659 epoch 32 - iter 88/111 - loss 3.27387298 - samples/sec: 32.14 - lr: 0.100000
2021-01-25 11:16:24,095 epoch 32 - iter 99/111 - loss 3.20687566 - samples/sec: 32.39 - lr: 0.100000
2021-01-25 11:16:29,952 epoch 32 - iter 110/111 - loss 3.18156727 - samples/sec: 30.06 - lr: 0.100000
2021-01-25 11:16:30,028 ----------------------------------------------------------------------------------------------------
2021-01-25 11:16:30,030 EPOCH 32 done: loss 3.1845 - lr 0.1000000
2021-01-25 11:16:30,032 BAD EPOCHS (no improvement): 2
2021-01-25 11:17:02,234 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:17:08,194 epoch 33 - iter 11/111 - loss 3.03868398 - samples/sec: 29.68 - lr: 0.100000
2021-01-25 11:17:13,556 epoch 33 - iter 22/111 - loss 2.92616935 - samples/sec: 32.84 - lr: 0.100000
2021-01-25 11:17:19,336 epoch 33 - iter 33/111 - loss 2.80745126 - samples/sec: 30.46 - lr: 0.100000
2021-01-25 11:17:25,547 epoch 33 - iter 44/111 - loss 2.89847333 - samples/sec: 28.34 - lr: 0.100000
2021-01-25 11:17:31,706 epoch 33 - iter 55/111 - loss 2.86311074 - samples/sec: 28.59 - lr: 0.100000
2021-01-25 11:17:37,953 epoch 33 - iter 66/111 - loss 2.88540350 - samples/sec: 28.18 - lr: 0.100000
2021-01-25 11:17:43,600 epoch 33 - iter 77/111 - loss 2.81552082 - samples/sec: 31.18 - lr: 0.100000
2021-01-25 11:17:49,715 epoch 33 - iter 88/111 - loss 2.95309269 - samples/sec: 28.79 - lr: 0.100000
2021-01-25 11:17:54,904 epoch 33 - iter 99/111 - loss 2.96507429 - samples/sec: 33.93 - lr: 0.100000
2021-01-25 11:18:01,023 epoch 33 - iter 110/111 - loss 2.95499291 - samples/sec: 28.77 - lr: 0.100000
2021-01-25 11:18:01,097 ----------------------------------------------------------------------------------------------------
2021-01-25 11:18:01,098 EPOCH 33 done: loss 2.9287 - lr 0.1000000
2021-01-25 11:18:01,100 BAD EPOCHS (no improvement): 0
2021-01-25 11:18:33,120 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:18:38,560 epoch 34 - iter 11/111 - loss 3.07879266 - samples/sec: 32.54 - lr: 0.100000
2021-01-25 11:18:44,164 epoch 34 - iter 22/111 - loss 2.81848154 - samples/sec: 31.41 - lr: 0.100000
2021-01-25 11:18:50,296 epoch 34 - iter 33/111 - loss 3.11980985 - samples/sec: 28.71 - lr: 0.100000
2021-01-25 11:18:55,684 epoch 34 - iter 44/111 - loss 3.13238976 - samples/sec: 32.69 - lr: 0.100000
2021-01-25 11:19:01,449 epoch 34 - iter 55/111 - loss 3.16149788 - samples/sec: 30.54 - lr: 0.100000
2021-01-25 11:19:08,616 epoch 34 - iter 66/111 - loss 3.13348814 - samples/sec: 24.56 - lr: 0.100000
2021-01-25 11:19:14,294 epoch 34 - iter 77/111 - loss 3.11406816 - samples/sec: 31.01 - lr: 0.100000
2021-01-25 11:19:21,730 epoch 34 - iter 88/111 - loss 3.18820072 - samples/sec: 23.68 - lr: 0.100000
2021-01-25 11:19:26,916 epoch 34 - iter 99/111 - loss 3.10712424 - samples/sec: 33.95 - lr: 0.100000
2021-01-25 11:19:32,893 epoch 34 - iter 110/111 - loss 3.03698263 - samples/sec: 29.46 - lr: 0.100000
2021-01-25 11:19:32,970 ----------------------------------------------------------------------------------------------------
2021-01-25 11:19:32,971 EPOCH 34 done: loss 3.0097 - lr 0.1000000
2021-01-25 11:19:32,972 BAD EPOCHS (no improvement): 1
2021-01-25 11:20:10,650 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:20:17,062 epoch 35 - iter 11/111 - loss 3.51922204 - samples/sec: 27.58 - lr: 0.100000
2021-01-25 11:20:22,803 epoch 35 - iter 22/111 - loss 3.35290692 - samples/sec: 30.67 - lr: 0.100000
2021-01-25 11:20:28,408 epoch 35 - iter 33/111 - loss 3.12284018 - samples/sec: 31.42 - lr: 0.100000
2021-01-25 11:20:34,268 epoch 35 - iter 44/111 - loss 3.06114052 - samples/sec: 30.05 - lr: 0.100000
2021-01-25 11:20:39,727 epoch 35 - iter 55/111 - loss 2.91180301 - samples/sec: 32.25 - lr: 0.100000
2021-01-25 11:20:45,826 epoch 35 - iter 66/111 - loss 2.98474466 - samples/sec: 28.87 - lr: 0.100000
2021-01-25 11:20:52,335 epoch 35 - iter 77/111 - loss 2.96009049 - samples/sec: 27.05 - lr: 0.100000
2021-01-25 11:20:57,955 epoch 35 - iter 88/111 - loss 2.97090394 - samples/sec: 31.33 - lr: 0.100000
2021-01-25 11:21:03,663 epoch 35 - iter 99/111 - loss 2.96750320 - samples/sec: 30.85 - lr: 0.100000
2021-01-25 11:21:09,346 epoch 35 - iter 110/111 - loss 2.93056582 - samples/sec: 30.98 - lr: 0.100000
2021-01-25 11:21:09,429 ----------------------------------------------------------------------------------------------------
2021-01-25 11:21:09,430 EPOCH 35 done: loss 2.9119 - lr 0.1000000
2021-01-25 11:21:09,431 BAD EPOCHS (no improvement): 0
2021-01-25 11:21:49,619 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:21:55,855 epoch 36 - iter 11/111 - loss 2.72645469 - samples/sec: 28.36 - lr: 0.100000
2021-01-25 11:22:01,716 epoch 36 - iter 22/111 - loss 2.69821284 - samples/sec: 30.04 - lr: 0.100000
2021-01-25 11:22:07,199 epoch 36 - iter 33/111 - loss 2.89728564 - samples/sec: 32.11 - lr: 0.100000
2021-01-25 11:22:13,290 epoch 36 - iter 44/111 - loss 2.90264571 - samples/sec: 28.91 - lr: 0.100000
2021-01-25 11:22:19,946 epoch 36 - iter 55/111 - loss 2.78748222 - samples/sec: 26.45 - lr: 0.100000
2021-01-25 11:22:26,139 epoch 36 - iter 66/111 - loss 2.74974000 - samples/sec: 28.43 - lr: 0.100000
2021-01-25 11:22:31,414 epoch 36 - iter 77/111 - loss 2.79338780 - samples/sec: 33.38 - lr: 0.100000
2021-01-25 11:22:36,547 epoch 36 - iter 88/111 - loss 2.83762308 - samples/sec: 34.30 - lr: 0.100000
2021-01-25 11:22:42,755 epoch 36 - iter 99/111 - loss 2.87280162 - samples/sec: 28.36 - lr: 0.100000
2021-01-25 11:22:48,190 epoch 36 - iter 110/111 - loss 2.94247575 - samples/sec: 32.39 - lr: 0.100000
2021-01-25 11:22:48,275 ----------------------------------------------------------------------------------------------------
2021-01-25 11:22:48,276 EPOCH 36 done: loss 2.9494 - lr 0.1000000
2021-01-25 11:22:48,278 BAD EPOCHS (no improvement): 1
2021-01-25 11:23:21,739 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:23:28,153 epoch 37 - iter 11/111 - loss 3.16743422 - samples/sec: 27.57 - lr: 0.100000
2021-01-25 11:23:34,343 epoch 37 - iter 22/111 - loss 2.86954344 - samples/sec: 28.45 - lr: 0.100000
2021-01-25 11:23:40,250 epoch 37 - iter 33/111 - loss 2.88924729 - samples/sec: 29.81 - lr: 0.100000
2021-01-25 11:23:46,142 epoch 37 - iter 44/111 - loss 2.83907682 - samples/sec: 29.88 - lr: 0.100000
2021-01-25 11:23:51,894 epoch 37 - iter 55/111 - loss 2.90263764 - samples/sec: 30.61 - lr: 0.100000
2021-01-25 11:25:47,261 epoch 37 - iter 66/111 - loss 2.91293082 - samples/sec: 30.21 - lr: 0.100000
2021-01-25 11:25:52,068 epoch 37 - iter 77/111 - loss 2.94032987 - samples/sec: 36.63 - lr: 0.100000
2021-01-25 11:25:58,409 epoch 37 - iter 88/111 - loss 3.03329203 - samples/sec: 27.77 - lr: 0.100000
2021-01-25 11:26:04,372 epoch 37 - iter 99/111 - loss 2.92551837 - samples/sec: 29.53 - lr: 0.100000
2021-01-25 11:26:10,677 epoch 37 - iter 110/111 - loss 2.90970864 - samples/sec: 27.92 - lr: 0.100000
2021-01-25 11:26:10,722 ----------------------------------------------------------------------------------------------------
2021-01-25 11:26:10,723 EPOCH 37 done: loss 2.9066 - lr 0.1000000
2021-01-25 11:26:10,724 BAD EPOCHS (no improvement): 0
2021-01-25 11:26:43,034 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:26:57,593 epoch 38 - iter 11/111 - loss 2.89960952 - samples/sec: 27.43 - lr: 0.100000
2021-01-25 11:27:03,764 epoch 38 - iter 22/111 - loss 2.87588685 - samples/sec: 28.53 - lr: 0.100000
2021-01-25 11:27:10,100 epoch 38 - iter 33/111 - loss 2.99533075 - samples/sec: 27.79 - lr: 0.100000
2021-01-25 11:27:15,235 epoch 38 - iter 44/111 - loss 3.05135858 - samples/sec: 34.29 - lr: 0.100000
2021-01-25 11:27:21,292 epoch 38 - iter 55/111 - loss 3.06756002 - samples/sec: 29.07 - lr: 0.100000
2021-01-25 11:27:27,636 epoch 38 - iter 66/111 - loss 3.15773214 - samples/sec: 27.75 - lr: 0.100000
2021-01-25 11:27:33,665 epoch 38 - iter 77/111 - loss 3.07934983 - samples/sec: 29.20 - lr: 0.100000
2021-01-25 11:27:38,549 epoch 38 - iter 88/111 - loss 3.04182896 - samples/sec: 36.05 - lr: 0.100000
2021-01-25 11:27:44,612 epoch 38 - iter 99/111 - loss 2.98516604 - samples/sec: 29.04 - lr: 0.100000
2021-01-25 11:27:49,900 epoch 38 - iter 110/111 - loss 2.92953254 - samples/sec: 33.30 - lr: 0.100000
2021-01-25 11:27:49,963 ----------------------------------------------------------------------------------------------------
2021-01-25 11:27:49,965 EPOCH 38 done: loss 2.9575 - lr 0.1000000
2021-01-25 11:27:49,968 BAD EPOCHS (no improvement): 1
2021-01-25 11:28:24,985 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:28:31,325 epoch 39 - iter 11/111 - loss 2.43373463 - samples/sec: 27.90 - lr: 0.100000
2021-01-25 11:28:36,511 epoch 39 - iter 22/111 - loss 2.33439837 - samples/sec: 33.95 - lr: 0.100000
2021-01-25 11:28:42,453 epoch 39 - iter 33/111 - loss 2.55580492 - samples/sec: 29.63 - lr: 0.100000
2021-01-25 11:28:48,638 epoch 39 - iter 44/111 - loss 2.69147430 - samples/sec: 28.47 - lr: 0.100000
2021-01-25 11:28:54,154 epoch 39 - iter 55/111 - loss 2.66834773 - samples/sec: 31.92 - lr: 0.100000
2021-01-25 11:29:00,444 epoch 39 - iter 66/111 - loss 2.74799052 - samples/sec: 27.99 - lr: 0.100000
2021-01-25 11:29:06,221 epoch 39 - iter 77/111 - loss 2.77623356 - samples/sec: 30.48 - lr: 0.100000
2021-01-25 11:29:12,615 epoch 39 - iter 88/111 - loss 2.76129756 - samples/sec: 27.54 - lr: 0.100000
2021-01-25 11:29:19,051 epoch 39 - iter 99/111 - loss 2.85527288 - samples/sec: 27.35 - lr: 0.100000
2021-01-25 11:29:24,608 epoch 39 - iter 110/111 - loss 2.81097984 - samples/sec: 31.69 - lr: 0.100000
2021-01-25 11:29:24,723 ----------------------------------------------------------------------------------------------------
2021-01-25 11:29:24,724 EPOCH 39 done: loss 2.8238 - lr 0.1000000
2021-01-25 11:29:24,726 BAD EPOCHS (no improvement): 0
2021-01-25 11:30:00,733 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:30:06,753 epoch 40 - iter 11/111 - loss 3.48423505 - samples/sec: 29.38 - lr: 0.100000
2021-01-25 11:30:12,671 epoch 40 - iter 22/111 - loss 3.41717418 - samples/sec: 29.75 - lr: 0.100000
2021-01-25 11:30:17,946 epoch 40 - iter 33/111 - loss 3.23930618 - samples/sec: 33.38 - lr: 0.100000
2021-01-25 11:30:24,660 epoch 40 - iter 44/111 - loss 2.97311551 - samples/sec: 26.22 - lr: 0.100000
2021-01-25 11:30:31,169 epoch 40 - iter 55/111 - loss 3.17618476 - samples/sec: 27.05 - lr: 0.100000
2021-01-25 11:30:36,598 epoch 40 - iter 66/111 - loss 3.07176837 - samples/sec: 32.43 - lr: 0.100000
2021-01-25 11:30:42,171 epoch 40 - iter 77/111 - loss 3.03386935 - samples/sec: 31.60 - lr: 0.100000
2021-01-25 11:30:48,842 epoch 40 - iter 88/111 - loss 2.95677129 - samples/sec: 26.39 - lr: 0.100000
2021-01-25 11:30:54,059 epoch 40 - iter 99/111 - loss 2.93549863 - samples/sec: 33.75 - lr: 0.100000
2021-01-25 11:31:00,560 epoch 40 - iter 110/111 - loss 2.89542588 - samples/sec: 27.08 - lr: 0.100000
2021-01-25 11:31:00,737 ----------------------------------------------------------------------------------------------------
2021-01-25 11:31:00,738 EPOCH 40 done: loss 2.8951 - lr 0.1000000
2021-01-25 11:31:00,740 BAD EPOCHS (no improvement): 1
2021-01-25 11:31:41,609 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:31:47,751 epoch 41 - iter 11/111 - loss 2.90424699 - samples/sec: 28.79 - lr: 0.100000
2021-01-25 11:31:53,675 epoch 41 - iter 22/111 - loss 2.57281015 - samples/sec: 29.72 - lr: 0.100000
2021-01-25 11:31:59,731 epoch 41 - iter 33/111 - loss 2.72727507 - samples/sec: 29.07 - lr: 0.100000
2021-01-25 11:32:05,693 epoch 41 - iter 44/111 - loss 2.71146367 - samples/sec: 29.53 - lr: 0.100000
2021-01-25 11:32:10,952 epoch 41 - iter 55/111 - loss 2.68553722 - samples/sec: 33.48 - lr: 0.100000
2021-01-25 11:32:16,759 epoch 41 - iter 66/111 - loss 2.65014202 - samples/sec: 30.32 - lr: 0.100000
2021-01-25 11:32:22,723 epoch 41 - iter 77/111 - loss 2.65167695 - samples/sec: 29.52 - lr: 0.100000
2021-01-25 11:32:28,376 epoch 41 - iter 88/111 - loss 2.62132972 - samples/sec: 31.15 - lr: 0.100000
2021-01-25 11:32:34,829 epoch 41 - iter 99/111 - loss 2.77568768 - samples/sec: 27.28 - lr: 0.100000
2021-01-25 11:32:40,653 epoch 41 - iter 110/111 - loss 2.77192337 - samples/sec: 30.23 - lr: 0.100000
2021-01-25 11:32:40,698 ----------------------------------------------------------------------------------------------------
2021-01-25 11:32:40,699 EPOCH 41 done: loss 2.7549 - lr 0.1000000
2021-01-25 11:32:40,701 BAD EPOCHS (no improvement): 0
2021-01-25 11:33:21,873 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:33:27,380 epoch 42 - iter 11/111 - loss 3.18380791 - samples/sec: 32.33 - lr: 0.100000
2021-01-25 11:33:33,526 epoch 42 - iter 22/111 - loss 3.05797781 - samples/sec: 28.65 - lr: 0.100000
2021-01-25 11:33:39,224 epoch 42 - iter 33/111 - loss 2.81637545 - samples/sec: 30.90 - lr: 0.100000
2021-01-25 11:33:45,605 epoch 42 - iter 44/111 - loss 2.86018511 - samples/sec: 27.59 - lr: 0.100000
2021-01-25 11:33:51,446 epoch 42 - iter 55/111 - loss 2.74580269 - samples/sec: 30.14 - lr: 0.100000
2021-01-25 11:33:58,656 epoch 42 - iter 66/111 - loss 2.91552790 - samples/sec: 24.42 - lr: 0.100000
2021-01-25 11:34:04,366 epoch 42 - iter 77/111 - loss 2.94514490 - samples/sec: 30.83 - lr: 0.100000
2021-01-25 11:34:09,585 epoch 42 - iter 88/111 - loss 2.88403995 - samples/sec: 33.74 - lr: 0.100000
2021-01-25 11:34:15,681 epoch 42 - iter 99/111 - loss 2.84163353 - samples/sec: 28.88 - lr: 0.100000
2021-01-25 11:34:22,679 epoch 42 - iter 110/111 - loss 2.82954320 - samples/sec: 25.16 - lr: 0.100000
2021-01-25 11:34:22,767 ----------------------------------------------------------------------------------------------------
2021-01-25 11:34:22,769 EPOCH 42 done: loss 2.8226 - lr 0.1000000
2021-01-25 11:34:22,771 BAD EPOCHS (no improvement): 1
2021-01-25 11:35:00,539 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:35:07,219 epoch 43 - iter 11/111 - loss 2.36389663 - samples/sec: 26.47 - lr: 0.100000
2021-01-25 11:35:13,165 epoch 43 - iter 22/111 - loss 2.36934274 - samples/sec: 29.61 - lr: 0.100000
2021-01-25 11:35:19,465 epoch 43 - iter 33/111 - loss 2.49502711 - samples/sec: 27.95 - lr: 0.100000
2021-01-25 11:35:26,508 epoch 43 - iter 44/111 - loss 2.50396190 - samples/sec: 25.00 - lr: 0.100000
2021-01-25 11:35:31,971 epoch 43 - iter 55/111 - loss 2.59746035 - samples/sec: 32.23 - lr: 0.100000
2021-01-25 11:35:38,015 epoch 43 - iter 66/111 - loss 2.54767620 - samples/sec: 29.13 - lr: 0.100000
2021-01-25 11:35:43,363 epoch 43 - iter 77/111 - loss 2.62820696 - samples/sec: 32.92 - lr: 0.100000
2021-01-25 11:35:49,198 epoch 43 - iter 88/111 - loss 2.68061659 - samples/sec: 30.17 - lr: 0.100000
2021-01-25 11:35:55,319 epoch 43 - iter 99/111 - loss 2.67624498 - samples/sec: 28.76 - lr: 0.100000
2021-01-25 11:36:01,492 epoch 43 - iter 110/111 - loss 2.73217012 - samples/sec: 28.52 - lr: 0.100000
2021-01-25 11:36:01,876 ----------------------------------------------------------------------------------------------------
2021-01-25 11:36:01,877 EPOCH 43 done: loss 3.1642 - lr 0.1000000
2021-01-25 11:36:01,878 BAD EPOCHS (no improvement): 2
2021-01-25 11:36:37,325 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:36:42,873 epoch 44 - iter 11/111 - loss 2.01169826 - samples/sec: 31.89 - lr: 0.100000
2021-01-25 11:36:48,819 epoch 44 - iter 22/111 - loss 1.95144150 - samples/sec: 29.61 - lr: 0.100000
2021-01-25 11:36:54,726 epoch 44 - iter 33/111 - loss 2.61340152 - samples/sec: 29.80 - lr: 0.100000
2021-01-25 11:37:01,965 epoch 44 - iter 44/111 - loss 2.59197275 - samples/sec: 24.32 - lr: 0.100000
2021-01-25 11:37:08,116 epoch 44 - iter 55/111 - loss 2.56804570 - samples/sec: 28.62 - lr: 0.100000
2021-01-25 11:37:14,626 epoch 44 - iter 66/111 - loss 2.60264895 - samples/sec: 27.04 - lr: 0.100000
2021-01-25 11:37:20,683 epoch 44 - iter 77/111 - loss 2.59730850 - samples/sec: 29.07 - lr: 0.100000
2021-01-25 11:37:27,367 epoch 44 - iter 88/111 - loss 2.64072504 - samples/sec: 26.34 - lr: 0.100000
2021-01-25 11:37:32,954 epoch 44 - iter 99/111 - loss 2.68413012 - samples/sec: 31.51 - lr: 0.100000
2021-01-25 11:37:38,032 epoch 44 - iter 110/111 - loss 2.63417917 - samples/sec: 34.67 - lr: 0.100000
2021-01-25 11:37:38,143 ----------------------------------------------------------------------------------------------------
2021-01-25 11:37:38,144 EPOCH 44 done: loss 2.6151 - lr 0.1000000
2021-01-25 11:37:38,145 BAD EPOCHS (no improvement): 0
2021-01-25 11:38:11,983 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:38:18,829 epoch 45 - iter 11/111 - loss 2.21903487 - samples/sec: 25.82 - lr: 0.100000
2021-01-25 11:38:24,881 epoch 45 - iter 22/111 - loss 2.40982877 - samples/sec: 29.09 - lr: 0.100000
2021-01-25 11:38:30,841 epoch 45 - iter 33/111 - loss 2.28160072 - samples/sec: 29.54 - lr: 0.100000
2021-01-25 11:38:36,413 epoch 45 - iter 44/111 - loss 2.28662980 - samples/sec: 31.60 - lr: 0.100000
2021-01-25 11:38:41,462 epoch 45 - iter 55/111 - loss 2.28525278 - samples/sec: 34.87 - lr: 0.100000
2021-01-25 11:38:47,995 epoch 45 - iter 66/111 - loss 2.45093957 - samples/sec: 26.95 - lr: 0.100000
2021-01-25 11:38:54,588 epoch 45 - iter 77/111 - loss 2.48274891 - samples/sec: 26.70 - lr: 0.100000
2021-01-25 11:39:01,052 epoch 45 - iter 88/111 - loss 2.53001037 - samples/sec: 27.24 - lr: 0.100000
2021-01-25 11:39:06,848 epoch 45 - iter 99/111 - loss 2.54270450 - samples/sec: 30.38 - lr: 0.100000
2021-01-25 11:39:12,350 epoch 45 - iter 110/111 - loss 2.51931607 - samples/sec: 32.00 - lr: 0.100000
2021-01-25 11:39:12,398 ----------------------------------------------------------------------------------------------------
2021-01-25 11:39:12,399 EPOCH 45 done: loss 2.4972 - lr 0.1000000
2021-01-25 11:39:12,400 BAD EPOCHS (no improvement): 0
2021-01-25 11:39:52,226 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:39:57,872 epoch 46 - iter 11/111 - loss 2.21434944 - samples/sec: 31.33 - lr: 0.100000
2021-01-25 11:40:03,704 epoch 46 - iter 22/111 - loss 2.02364354 - samples/sec: 30.18 - lr: 0.100000
2021-01-25 11:40:10,083 epoch 46 - iter 33/111 - loss 2.15216145 - samples/sec: 27.61 - lr: 0.100000
2021-01-25 11:40:15,635 epoch 46 - iter 44/111 - loss 2.29634910 - samples/sec: 31.71 - lr: 0.100000
2021-01-25 11:40:21,181 epoch 46 - iter 55/111 - loss 2.44234299 - samples/sec: 31.75 - lr: 0.100000
2021-01-25 11:40:27,187 epoch 46 - iter 66/111 - loss 2.52661931 - samples/sec: 29.31 - lr: 0.100000
2021-01-25 11:40:32,445 epoch 46 - iter 77/111 - loss 2.46825149 - samples/sec: 33.49 - lr: 0.100000
2021-01-25 11:40:38,538 epoch 46 - iter 88/111 - loss 2.53915692 - samples/sec: 28.90 - lr: 0.100000
2021-01-25 11:40:45,186 epoch 46 - iter 99/111 - loss 2.51694309 - samples/sec: 26.49 - lr: 0.100000
2021-01-25 11:40:52,149 epoch 46 - iter 110/111 - loss 2.57741266 - samples/sec: 25.28 - lr: 0.100000
2021-01-25 11:40:52,245 ----------------------------------------------------------------------------------------------------
2021-01-25 11:40:52,247 EPOCH 46 done: loss 2.5804 - lr 0.1000000
2021-01-25 11:40:52,249 BAD EPOCHS (no improvement): 1
2021-01-25 11:41:32,745 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:41:39,120 epoch 47 - iter 11/111 - loss 2.72100385 - samples/sec: 28.39 - lr: 0.100000
2021-01-25 11:41:45,105 epoch 47 - iter 22/111 - loss 2.84815154 - samples/sec: 29.42 - lr: 0.100000
2021-01-25 11:41:51,760 epoch 47 - iter 33/111 - loss 2.61069580 - samples/sec: 26.46 - lr: 0.100000
2021-01-25 11:41:58,406 epoch 47 - iter 44/111 - loss 2.73920726 - samples/sec: 26.49 - lr: 0.100000
2021-01-25 11:42:04,976 epoch 47 - iter 55/111 - loss 2.62572265 - samples/sec: 26.80 - lr: 0.100000
2021-01-25 11:42:10,981 epoch 47 - iter 66/111 - loss 2.64314763 - samples/sec: 29.32 - lr: 0.100000
2021-01-25 11:42:15,623 epoch 47 - iter 77/111 - loss 2.76011765 - samples/sec: 37.93 - lr: 0.100000
2021-01-25 11:42:21,263 epoch 47 - iter 88/111 - loss 2.71644599 - samples/sec: 31.22 - lr: 0.100000
2021-01-25 11:42:26,595 epoch 47 - iter 99/111 - loss 2.65025528 - samples/sec: 33.03 - lr: 0.100000
2021-01-25 11:42:32,684 epoch 47 - iter 110/111 - loss 2.64420135 - samples/sec: 28.91 - lr: 0.100000
2021-01-25 11:42:32,795 ----------------------------------------------------------------------------------------------------
2021-01-25 11:42:32,796 EPOCH 47 done: loss 2.6228 - lr 0.1000000
2021-01-25 11:42:32,798 BAD EPOCHS (no improvement): 2
2021-01-25 11:43:14,266 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:43:21,426 epoch 48 - iter 11/111 - loss 2.73667067 - samples/sec: 24.68 - lr: 0.100000
2021-01-25 11:43:27,000 epoch 48 - iter 22/111 - loss 2.64127392 - samples/sec: 31.59 - lr: 0.100000
2021-01-25 11:43:32,704 epoch 48 - iter 33/111 - loss 2.43381392 - samples/sec: 30.87 - lr: 0.100000
2021-01-25 11:43:38,117 epoch 48 - iter 44/111 - loss 2.33068804 - samples/sec: 32.52 - lr: 0.100000
2021-01-25 11:43:43,535 epoch 48 - iter 55/111 - loss 2.63984348 - samples/sec: 32.50 - lr: 0.100000
2021-01-25 11:43:49,464 epoch 48 - iter 66/111 - loss 2.68483300 - samples/sec: 29.70 - lr: 0.100000
2021-01-25 11:43:55,970 epoch 48 - iter 77/111 - loss 2.61571536 - samples/sec: 27.06 - lr: 0.100000
2021-01-25 11:44:01,892 epoch 48 - iter 88/111 - loss 2.54865319 - samples/sec: 29.73 - lr: 0.100000
2021-01-25 11:44:07,633 epoch 48 - iter 99/111 - loss 2.54997973 - samples/sec: 30.67 - lr: 0.100000
2021-01-25 11:44:14,094 epoch 48 - iter 110/111 - loss 2.52854875 - samples/sec: 27.25 - lr: 0.100000
2021-01-25 11:44:14,167 ----------------------------------------------------------------------------------------------------
2021-01-25 11:44:14,168 EPOCH 48 done: loss 2.5071 - lr 0.1000000
2021-01-25 11:44:14,170 BAD EPOCHS (no improvement): 3
2021-01-25 11:44:50,131 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:44:56,307 epoch 49 - iter 11/111 - loss 2.26444705 - samples/sec: 28.64 - lr: 0.100000
2021-01-25 11:45:02,231 epoch 49 - iter 22/111 - loss 2.18552075 - samples/sec: 29.72 - lr: 0.100000
2021-01-25 11:45:07,913 epoch 49 - iter 33/111 - loss 2.28902945 - samples/sec: 30.99 - lr: 0.100000
2021-01-25 11:45:13,749 epoch 49 - iter 44/111 - loss 2.59935950 - samples/sec: 30.17 - lr: 0.100000
2021-01-25 11:45:18,784 epoch 49 - iter 55/111 - loss 2.68127865 - samples/sec: 34.97 - lr: 0.100000
2021-01-25 11:45:25,460 epoch 49 - iter 66/111 - loss 2.67531886 - samples/sec: 26.37 - lr: 0.100000
2021-01-25 11:45:31,614 epoch 49 - iter 77/111 - loss 2.66535183 - samples/sec: 28.61 - lr: 0.100000
2021-01-25 11:45:37,499 epoch 49 - iter 88/111 - loss 2.63391306 - samples/sec: 29.92 - lr: 0.100000
2021-01-25 11:45:44,152 epoch 49 - iter 99/111 - loss 2.59099578 - samples/sec: 26.47 - lr: 0.100000
2021-01-25 11:45:50,581 epoch 49 - iter 110/111 - loss 2.63866038 - samples/sec: 27.38 - lr: 0.100000
2021-01-25 11:45:50,657 ----------------------------------------------------------------------------------------------------
2021-01-25 11:45:50,659 EPOCH 49 done: loss 2.6151 - lr 0.1000000
Epoch 49: reducing learning rate of group 0 to 5.0000e-02.
2021-01-25 11:45:50,667 BAD EPOCHS (no improvement): 4
2021-01-25 11:46:28,847 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:46:35,034 epoch 50 - iter 11/111 - loss 2.35713735 - samples/sec: 28.58 - lr: 0.050000
2021-01-25 11:46:42,218 epoch 50 - iter 22/111 - loss 2.45717296 - samples/sec: 24.51 - lr: 0.050000
2021-01-25 11:46:47,706 epoch 50 - iter 33/111 - loss 2.63586355 - samples/sec: 32.08 - lr: 0.050000
2021-01-25 11:46:53,127 epoch 50 - iter 44/111 - loss 2.54825116 - samples/sec: 32.48 - lr: 0.050000
2021-01-25 11:46:58,617 epoch 50 - iter 55/111 - loss 2.43891905 - samples/sec: 32.07 - lr: 0.050000
2021-01-25 11:47:05,514 epoch 50 - iter 66/111 - loss 2.33190236 - samples/sec: 25.53 - lr: 0.050000
2021-01-25 11:47:11,049 epoch 50 - iter 77/111 - loss 2.31553948 - samples/sec: 31.81 - lr: 0.050000
2021-01-25 11:47:16,575 epoch 50 - iter 88/111 - loss 2.25683910 - samples/sec: 31.86 - lr: 0.050000
2021-01-25 11:47:22,507 epoch 50 - iter 99/111 - loss 2.23318195 - samples/sec: 29.68 - lr: 0.050000
2021-01-25 11:47:28,443 epoch 50 - iter 110/111 - loss 2.23143409 - samples/sec: 29.66 - lr: 0.050000
2021-01-25 11:47:28,762 ----------------------------------------------------------------------------------------------------
2021-01-25 11:47:28,763 EPOCH 50 done: loss 2.2292 - lr 0.0500000
2021-01-25 11:47:28,765 BAD EPOCHS (no improvement): 0
2021-01-25 11:48:06,867 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:48:13,783 epoch 51 - iter 11/111 - loss 2.18304784 - samples/sec: 25.55 - lr: 0.050000
2021-01-25 11:48:20,225 epoch 51 - iter 22/111 - loss 2.43048250 - samples/sec: 27.33 - lr: 0.050000
2021-01-25 11:48:25,460 epoch 51 - iter 33/111 - loss 2.37367669 - samples/sec: 33.63 - lr: 0.050000
2021-01-25 11:48:31,618 epoch 51 - iter 44/111 - loss 2.31780951 - samples/sec: 28.59 - lr: 0.050000
2021-01-25 11:48:36,946 epoch 51 - iter 55/111 - loss 2.23305076 - samples/sec: 33.05 - lr: 0.050000
2021-01-25 11:48:42,359 epoch 51 - iter 66/111 - loss 2.17786572 - samples/sec: 32.52 - lr: 0.050000
2021-01-25 11:48:49,045 epoch 51 - iter 77/111 - loss 2.13933476 - samples/sec: 26.34 - lr: 0.050000
2021-01-25 11:48:54,750 epoch 51 - iter 88/111 - loss 2.09521602 - samples/sec: 30.86 - lr: 0.050000
2021-01-25 11:49:00,726 epoch 51 - iter 99/111 - loss 2.12616370 - samples/sec: 29.46 - lr: 0.050000
2021-01-25 11:49:06,797 epoch 51 - iter 110/111 - loss 2.18481968 - samples/sec: 29.00 - lr: 0.050000
2021-01-25 11:49:07,374 ----------------------------------------------------------------------------------------------------
2021-01-25 11:49:07,376 EPOCH 51 done: loss 2.1909 - lr 0.0500000
2021-01-25 11:49:07,378 BAD EPOCHS (no improvement): 0
2021-01-25 11:49:39,592 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:49:45,456 epoch 52 - iter 11/111 - loss 1.73091649 - samples/sec: 30.16 - lr: 0.050000
2021-01-25 11:49:51,422 epoch 52 - iter 22/111 - loss 1.99832044 - samples/sec: 29.51 - lr: 0.050000
2021-01-25 11:49:57,467 epoch 52 - iter 33/111 - loss 2.19519986 - samples/sec: 29.13 - lr: 0.050000
2021-01-25 11:50:03,072 epoch 52 - iter 44/111 - loss 2.15640568 - samples/sec: 31.42 - lr: 0.050000
2021-01-25 11:50:08,757 epoch 52 - iter 55/111 - loss 2.10974158 - samples/sec: 30.97 - lr: 0.050000
2021-01-25 11:50:14,934 epoch 52 - iter 66/111 - loss 2.09556403 - samples/sec: 28.50 - lr: 0.050000
2021-01-25 11:50:21,124 epoch 52 - iter 77/111 - loss 2.08037419 - samples/sec: 28.44 - lr: 0.050000
2021-01-25 11:50:27,898 epoch 52 - iter 88/111 - loss 2.10373959 - samples/sec: 25.99 - lr: 0.050000
2021-01-25 11:50:33,706 epoch 52 - iter 99/111 - loss 2.10709381 - samples/sec: 30.31 - lr: 0.050000
2021-01-25 11:50:38,604 epoch 52 - iter 110/111 - loss 2.08090206 - samples/sec: 35.95 - lr: 0.050000
2021-01-25 11:50:38,735 ----------------------------------------------------------------------------------------------------
2021-01-25 11:50:38,736 EPOCH 52 done: loss 2.0625 - lr 0.0500000
2021-01-25 11:50:38,738 BAD EPOCHS (no improvement): 0
2021-01-25 11:51:18,504 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:51:25,288 epoch 53 - iter 11/111 - loss 2.01705449 - samples/sec: 26.04 - lr: 0.050000
2021-01-25 11:51:31,730 epoch 53 - iter 22/111 - loss 1.90370986 - samples/sec: 27.33 - lr: 0.050000
2021-01-25 11:51:38,513 epoch 53 - iter 33/111 - loss 2.09853537 - samples/sec: 25.96 - lr: 0.050000
2021-01-25 11:51:44,080 epoch 53 - iter 44/111 - loss 1.96282133 - samples/sec: 31.63 - lr: 0.050000
2021-01-25 11:51:50,174 epoch 53 - iter 55/111 - loss 2.00787822 - samples/sec: 28.89 - lr: 0.050000
2021-01-25 11:51:55,978 epoch 53 - iter 66/111 - loss 2.12322649 - samples/sec: 30.34 - lr: 0.050000
2021-01-25 11:52:02,389 epoch 53 - iter 77/111 - loss 2.10005822 - samples/sec: 27.47 - lr: 0.050000
2021-01-25 11:52:07,104 epoch 53 - iter 88/111 - loss 2.04746139 - samples/sec: 37.35 - lr: 0.050000
2021-01-25 11:52:13,366 epoch 53 - iter 99/111 - loss 2.01917097 - samples/sec: 28.12 - lr: 0.050000
2021-01-25 11:52:18,845 epoch 53 - iter 110/111 - loss 2.00830920 - samples/sec: 32.13 - lr: 0.050000
2021-01-25 11:52:18,901 ----------------------------------------------------------------------------------------------------
2021-01-25 11:52:18,902 EPOCH 53 done: loss 1.9910 - lr 0.0500000
2021-01-25 11:52:18,904 BAD EPOCHS (no improvement): 0
2021-01-25 11:52:59,196 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:53:05,183 epoch 54 - iter 11/111 - loss 2.14837797 - samples/sec: 29.54 - lr: 0.050000
2021-01-25 11:53:11,485 epoch 54 - iter 22/111 - loss 2.19486931 - samples/sec: 27.94 - lr: 0.050000
2021-01-25 11:53:17,407 epoch 54 - iter 33/111 - loss 1.91290445 - samples/sec: 29.73 - lr: 0.050000
2021-01-25 11:53:23,905 epoch 54 - iter 44/111 - loss 1.91239043 - samples/sec: 27.10 - lr: 0.050000
2021-01-25 11:53:30,214 epoch 54 - iter 55/111 - loss 1.88720394 - samples/sec: 27.90 - lr: 0.050000
2021-01-25 11:53:35,415 epoch 54 - iter 66/111 - loss 1.93678142 - samples/sec: 33.86 - lr: 0.050000
2021-01-25 11:53:40,723 epoch 54 - iter 77/111 - loss 1.92156433 - samples/sec: 33.17 - lr: 0.050000
2021-01-25 11:53:47,687 epoch 54 - iter 88/111 - loss 1.92170125 - samples/sec: 25.28 - lr: 0.050000
2021-01-25 11:53:53,592 epoch 54 - iter 99/111 - loss 1.87971373 - samples/sec: 29.82 - lr: 0.050000
2021-01-25 11:53:59,032 epoch 54 - iter 110/111 - loss 1.86843346 - samples/sec: 32.37 - lr: 0.050000
2021-01-25 11:53:59,103 ----------------------------------------------------------------------------------------------------
2021-01-25 11:53:59,105 EPOCH 54 done: loss 1.8519 - lr 0.0500000
2021-01-25 11:53:59,106 BAD EPOCHS (no improvement): 0
2021-01-25 11:54:36,300 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:54:42,957 epoch 55 - iter 11/111 - loss 1.82710967 - samples/sec: 26.55 - lr: 0.050000
2021-01-25 11:54:48,517 epoch 55 - iter 22/111 - loss 1.88735848 - samples/sec: 31.67 - lr: 0.050000
2021-01-25 11:54:54,262 epoch 55 - iter 33/111 - loss 1.85520092 - samples/sec: 30.65 - lr: 0.050000
2021-01-25 11:54:59,177 epoch 55 - iter 44/111 - loss 1.80710835 - samples/sec: 35.82 - lr: 0.050000
2021-01-25 11:55:04,885 epoch 55 - iter 55/111 - loss 1.93934924 - samples/sec: 30.85 - lr: 0.050000
2021-01-25 11:55:10,584 epoch 55 - iter 66/111 - loss 1.92196495 - samples/sec: 30.90 - lr: 0.050000
2021-01-25 11:55:17,184 epoch 55 - iter 77/111 - loss 1.90982055 - samples/sec: 26.67 - lr: 0.050000
2021-01-25 11:55:23,540 epoch 55 - iter 88/111 - loss 1.89618112 - samples/sec: 27.70 - lr: 0.050000
2021-01-25 11:55:29,347 epoch 55 - iter 99/111 - loss 1.89795932 - samples/sec: 30.32 - lr: 0.050000
2021-01-25 11:55:35,140 epoch 55 - iter 110/111 - loss 1.89739936 - samples/sec: 30.39 - lr: 0.050000
2021-01-25 11:55:35,211 ----------------------------------------------------------------------------------------------------
2021-01-25 11:55:35,212 EPOCH 55 done: loss 1.8805 - lr 0.0500000
2021-01-25 11:55:35,214 BAD EPOCHS (no improvement): 1
2021-01-25 11:56:08,747 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:56:14,068 epoch 56 - iter 11/111 - loss 2.28413005 - samples/sec: 33.26 - lr: 0.050000
2021-01-25 11:56:19,678 epoch 56 - iter 22/111 - loss 2.30415965 - samples/sec: 31.38 - lr: 0.050000
2021-01-25 11:56:24,717 epoch 56 - iter 33/111 - loss 2.04732648 - samples/sec: 34.94 - lr: 0.050000
2021-01-25 11:56:30,054 epoch 56 - iter 44/111 - loss 2.00044952 - samples/sec: 32.99 - lr: 0.050000
2021-01-25 11:56:36,168 epoch 56 - iter 55/111 - loss 1.90858837 - samples/sec: 28.80 - lr: 0.050000
2021-01-25 11:56:43,120 epoch 56 - iter 66/111 - loss 1.88605417 - samples/sec: 25.33 - lr: 0.050000
2021-01-25 11:56:48,158 epoch 56 - iter 77/111 - loss 1.83538555 - samples/sec: 34.95 - lr: 0.050000
2021-01-25 11:56:55,341 epoch 56 - iter 88/111 - loss 1.88052460 - samples/sec: 24.51 - lr: 0.050000
2021-01-25 11:57:01,438 epoch 56 - iter 99/111 - loss 1.87884715 - samples/sec: 28.88 - lr: 0.050000
2021-01-25 11:57:07,653 epoch 56 - iter 110/111 - loss 1.86558872 - samples/sec: 28.33 - lr: 0.050000
2021-01-25 11:57:07,770 ----------------------------------------------------------------------------------------------------
2021-01-25 11:57:07,771 EPOCH 56 done: loss 1.8633 - lr 0.0500000
2021-01-25 11:57:07,772 BAD EPOCHS (no improvement): 2
2021-01-25 11:57:39,474 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:57:53,492 epoch 57 - iter 11/111 - loss 1.77860214 - samples/sec: 32.45 - lr: 0.050000
2021-01-25 11:57:59,516 epoch 57 - iter 22/111 - loss 1.83906370 - samples/sec: 29.23 - lr: 0.050000
2021-01-25 11:58:06,147 epoch 57 - iter 33/111 - loss 1.94231358 - samples/sec: 26.55 - lr: 0.050000
2021-01-25 11:58:12,925 epoch 57 - iter 44/111 - loss 1.94184118 - samples/sec: 25.98 - lr: 0.050000
2021-01-25 11:58:18,339 epoch 57 - iter 55/111 - loss 1.91480884 - samples/sec: 32.52 - lr: 0.050000
2021-01-25 11:58:24,086 epoch 57 - iter 66/111 - loss 1.92275561 - samples/sec: 30.64 - lr: 0.050000
2021-01-25 11:58:30,082 epoch 57 - iter 77/111 - loss 1.90475914 - samples/sec: 29.36 - lr: 0.050000
2021-01-25 11:58:35,788 epoch 57 - iter 88/111 - loss 1.85559301 - samples/sec: 30.86 - lr: 0.050000
2021-01-25 11:58:42,359 epoch 57 - iter 99/111 - loss 1.85587502 - samples/sec: 26.79 - lr: 0.050000
2021-01-25 11:58:47,262 epoch 57 - iter 110/111 - loss 1.92152234 - samples/sec: 35.92 - lr: 0.050000
2021-01-25 11:58:47,447 ----------------------------------------------------------------------------------------------------
2021-01-25 11:58:47,448 EPOCH 57 done: loss 1.9050 - lr 0.0500000
2021-01-25 11:58:47,449 BAD EPOCHS (no improvement): 3
2021-01-25 11:59:22,452 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 11:59:28,190 epoch 58 - iter 11/111 - loss 1.79517363 - samples/sec: 30.83 - lr: 0.050000
2021-01-25 11:59:34,551 epoch 58 - iter 22/111 - loss 2.03854591 - samples/sec: 27.68 - lr: 0.050000
2021-01-25 11:59:40,414 epoch 58 - iter 33/111 - loss 1.89547655 - samples/sec: 30.03 - lr: 0.050000
2021-01-25 11:59:47,036 epoch 58 - iter 44/111 - loss 1.85825240 - samples/sec: 26.59 - lr: 0.050000
2021-01-25 11:59:53,117 epoch 58 - iter 55/111 - loss 1.79456089 - samples/sec: 28.95 - lr: 0.050000
2021-01-25 12:00:00,126 epoch 58 - iter 66/111 - loss 1.77742711 - samples/sec: 25.12 - lr: 0.050000
2021-01-25 12:00:05,420 epoch 58 - iter 77/111 - loss 1.72937530 - samples/sec: 33.26 - lr: 0.050000
2021-01-25 12:00:11,634 epoch 58 - iter 88/111 - loss 1.82790911 - samples/sec: 28.33 - lr: 0.050000
2021-01-25 12:00:17,666 epoch 58 - iter 99/111 - loss 1.82581005 - samples/sec: 29.19 - lr: 0.050000
2021-01-25 12:00:22,809 epoch 58 - iter 110/111 - loss 1.78818853 - samples/sec: 34.24 - lr: 0.050000
2021-01-25 12:00:22,867 ----------------------------------------------------------------------------------------------------
2021-01-25 12:00:22,868 EPOCH 58 done: loss 1.7723 - lr 0.0500000
2021-01-25 12:00:22,870 BAD EPOCHS (no improvement): 0
2021-01-25 12:01:03,210 ----------------------------------------------------------------------------------------------------
train mode resetting embeddings
train mode resetting embeddings
2021-01-25 12:01:09,614 epoch 59 - iter 11/111 - loss 1.78469042 - samples/sec: 27.61 - lr: 0.050000
2021-01-25 12:01:15,034 epoch 59 - iter 22/111 - loss 1.59042391 - samples/sec: 32.49 - lr: 0.050000
2021-01-25 12:01:21,084 epoch 59 - iter 33/111 - loss 1.69231301 - samples/sec: 29.10 - lr: 0.050000
2021-01-25 12:01:27,045 epoch 59 - iter 44/111 - loss 1.61682651 - samples/sec: 29.54 - lr: 0.050000
2021-01-25 12:01:34,204 epoch 59 - iter 55/111 - loss 1.61869046 - samples/sec: 24.59 - lr: 0.050000
2021-01-25 12:01:40,826 epoch 59 - iter 66/111 - loss 1.68842452 - samples/sec: 26.59 - lr: 0.050000
2021-01-25 12:01:45,881 epoch 59 - iter 77/111 - loss 1.71753615 - samples/sec: 34.84 -
lr: 0.050000 2021-01-25 12:01:51,219 epoch 59 - iter 88/111 - loss 1.69533439 - samples/sec: 32.99 - lr: 0.050000 2021-01-25 12:01:56,906 epoch 59 - iter 99/111 - loss 1.71060796 - samples/sec: 30.96 - lr: 0.050000 2021-01-25 12:02:03,719 epoch 59 - iter 110/111 - loss 1.73168795 - samples/sec: 25.84 - lr: 0.050000 2021-01-25 12:02:03,903 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:02:03,904 EPOCH 59 done: loss 1.7166 - lr 0.0500000 2021-01-25 12:02:03,905 BAD EPOCHS (no improvement): 0 2021-01-25 12:02:36,087 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:02:42,328 epoch 60 - iter 11/111 - loss 1.95376955 - samples/sec: 28.33 - lr: 0.050000 2021-01-25 12:02:48,526 epoch 60 - iter 22/111 - loss 1.92651694 - samples/sec: 28.40 - lr: 0.050000 2021-01-25 12:02:53,946 epoch 60 - iter 33/111 - loss 1.79232361 - samples/sec: 32.49 - lr: 0.050000 2021-01-25 12:02:59,579 epoch 60 - iter 44/111 - loss 1.65399889 - samples/sec: 31.26 - lr: 0.050000 2021-01-25 12:03:04,989 epoch 60 - iter 55/111 - loss 1.61804408 - samples/sec: 32.55 - lr: 0.050000 2021-01-25 12:03:11,066 epoch 60 - iter 66/111 - loss 1.62668523 - samples/sec: 28.97 - lr: 0.050000 2021-01-25 12:03:16,375 epoch 60 - iter 77/111 - loss 1.60931349 - samples/sec: 33.17 - lr: 0.050000 2021-01-25 12:03:22,805 epoch 60 - iter 88/111 - loss 1.58886976 - samples/sec: 27.38 - lr: 0.050000 2021-01-25 12:03:29,315 epoch 60 - iter 99/111 - loss 1.62626700 - samples/sec: 27.05 - lr: 0.050000 2021-01-25 12:03:35,411 epoch 60 - iter 110/111 - loss 1.68461498 - samples/sec: 28.88 - lr: 0.050000 2021-01-25 12:03:35,482 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:03:35,483 EPOCH 60 done: loss 1.6695 - lr 0.0500000 2021-01-25 12:03:35,485 BAD EPOCHS (no 
improvement): 0 2021-01-25 12:04:15,089 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:04:22,464 epoch 61 - iter 11/111 - loss 1.90850128 - samples/sec: 23.96 - lr: 0.050000 2021-01-25 12:04:28,645 epoch 61 - iter 22/111 - loss 1.72968649 - samples/sec: 28.48 - lr: 0.050000 2021-01-25 12:04:34,772 epoch 61 - iter 33/111 - loss 1.75811358 - samples/sec: 28.74 - lr: 0.050000 2021-01-25 12:04:39,643 epoch 61 - iter 44/111 - loss 1.69857002 - samples/sec: 36.15 - lr: 0.050000 2021-01-25 12:04:45,914 epoch 61 - iter 55/111 - loss 1.67391535 - samples/sec: 28.07 - lr: 0.050000 2021-01-25 12:04:52,660 epoch 61 - iter 66/111 - loss 1.75760016 - samples/sec: 26.10 - lr: 0.050000 2021-01-25 12:04:58,253 epoch 61 - iter 77/111 - loss 1.75459774 - samples/sec: 31.48 - lr: 0.050000 2021-01-25 12:05:04,462 epoch 61 - iter 88/111 - loss 1.73564435 - samples/sec: 28.36 - lr: 0.050000 2021-01-25 12:05:09,581 epoch 61 - iter 99/111 - loss 1.72145211 - samples/sec: 34.40 - lr: 0.050000 2021-01-25 12:05:15,532 epoch 61 - iter 110/111 - loss 1.70552521 - samples/sec: 29.58 - lr: 0.050000 2021-01-25 12:05:15,585 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:05:15,587 EPOCH 61 done: loss 1.6931 - lr 0.0500000 2021-01-25 12:05:15,588 BAD EPOCHS (no improvement): 1 2021-01-25 12:05:49,501 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:05:55,127 epoch 62 - iter 11/111 - loss 1.47604708 - samples/sec: 31.44 - lr: 0.050000 2021-01-25 12:06:01,665 epoch 62 - iter 22/111 - loss 1.68279212 - samples/sec: 26.93 - lr: 0.050000 2021-01-25 12:06:08,163 epoch 62 - iter 33/111 - loss 1.59796740 - samples/sec: 27.10 - lr: 0.050000 2021-01-25 12:06:13,994 epoch 62 - iter 
44/111 - loss 1.60600110 - samples/sec: 30.20 - lr: 0.050000 2021-01-25 12:06:18,996 epoch 62 - iter 55/111 - loss 1.54157640 - samples/sec: 35.20 - lr: 0.050000 2021-01-25 12:06:24,611 epoch 62 - iter 66/111 - loss 1.53799889 - samples/sec: 31.36 - lr: 0.050000 2021-01-25 12:06:30,406 epoch 62 - iter 77/111 - loss 1.60738051 - samples/sec: 30.38 - lr: 0.050000 2021-01-25 12:06:36,796 epoch 62 - iter 88/111 - loss 1.59562062 - samples/sec: 27.55 - lr: 0.050000 2021-01-25 12:06:42,410 epoch 62 - iter 99/111 - loss 1.62996258 - samples/sec: 31.36 - lr: 0.050000 2021-01-25 12:06:48,487 epoch 62 - iter 110/111 - loss 1.60414507 - samples/sec: 28.97 - lr: 0.050000 2021-01-25 12:06:48,551 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:06:48,553 EPOCH 62 done: loss 1.5897 - lr 0.0500000 2021-01-25 12:06:48,553 BAD EPOCHS (no improvement): 0 2021-01-25 12:07:25,633 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:07:30,821 epoch 63 - iter 11/111 - loss 1.41043627 - samples/sec: 34.11 - lr: 0.050000 2021-01-25 12:07:37,602 epoch 63 - iter 22/111 - loss 1.61582136 - samples/sec: 25.96 - lr: 0.050000 2021-01-25 12:07:43,844 epoch 63 - iter 33/111 - loss 1.61856750 - samples/sec: 28.21 - lr: 0.050000 2021-01-25 12:07:49,849 epoch 63 - iter 44/111 - loss 1.65706825 - samples/sec: 29.32 - lr: 0.050000 2021-01-25 12:07:55,849 epoch 63 - iter 55/111 - loss 1.62137602 - samples/sec: 29.34 - lr: 0.050000 2021-01-25 12:08:02,089 epoch 63 - iter 66/111 - loss 1.61913779 - samples/sec: 28.22 - lr: 0.050000 2021-01-25 12:08:08,142 epoch 63 - iter 77/111 - loss 1.64572622 - samples/sec: 29.09 - lr: 0.050000 2021-01-25 12:08:12,926 epoch 63 - iter 88/111 - loss 1.61507814 - samples/sec: 36.80 - lr: 0.050000 2021-01-25 12:08:18,906 epoch 63 - iter 99/111 - loss 1.56996080 - samples/sec: 29.44 - 
lr: 0.050000 2021-01-25 12:08:24,616 epoch 63 - iter 110/111 - loss 1.56164694 - samples/sec: 30.84 - lr: 0.050000 2021-01-25 12:08:24,676 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:08:24,678 EPOCH 63 done: loss 1.5517 - lr 0.0500000 2021-01-25 12:08:24,679 BAD EPOCHS (no improvement): 0 2021-01-25 12:09:05,938 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:09:11,978 epoch 64 - iter 11/111 - loss 1.24807806 - samples/sec: 29.27 - lr: 0.050000 2021-01-25 12:09:18,190 epoch 64 - iter 22/111 - loss 1.44314513 - samples/sec: 28.34 - lr: 0.050000 2021-01-25 12:09:24,071 epoch 64 - iter 33/111 - loss 1.44731416 - samples/sec: 29.94 - lr: 0.050000 2021-01-25 12:09:29,935 epoch 64 - iter 44/111 - loss 1.46721514 - samples/sec: 30.03 - lr: 0.050000 2021-01-25 12:09:35,783 epoch 64 - iter 55/111 - loss 1.50602719 - samples/sec: 30.11 - lr: 0.050000 2021-01-25 12:09:42,125 epoch 64 - iter 66/111 - loss 1.54318994 - samples/sec: 27.76 - lr: 0.050000 2021-01-25 12:09:48,256 epoch 64 - iter 77/111 - loss 1.53517920 - samples/sec: 28.72 - lr: 0.050000 2021-01-25 12:09:53,515 epoch 64 - iter 88/111 - loss 1.51558204 - samples/sec: 33.48 - lr: 0.050000 2021-01-25 12:09:59,810 epoch 64 - iter 99/111 - loss 1.49825442 - samples/sec: 27.97 - lr: 0.050000 2021-01-25 12:10:06,023 epoch 64 - iter 110/111 - loss 1.50593701 - samples/sec: 28.34 - lr: 0.050000 2021-01-25 12:10:06,124 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:10:06,127 EPOCH 64 done: loss 1.5262 - lr 0.0500000 2021-01-25 12:10:06,129 BAD EPOCHS (no improvement): 0 2021-01-25 12:10:44,644 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 
2021-01-25 12:10:50,695 epoch 65 - iter 11/111 - loss 1.73282667 - samples/sec: 29.22 - lr: 0.050000 2021-01-25 12:10:55,916 epoch 65 - iter 22/111 - loss 1.44079784 - samples/sec: 33.73 - lr: 0.050000 2021-01-25 12:11:01,300 epoch 65 - iter 33/111 - loss 1.40507281 - samples/sec: 32.70 - lr: 0.050000 2021-01-25 12:11:06,907 epoch 65 - iter 44/111 - loss 1.49278208 - samples/sec: 31.40 - lr: 0.050000 2021-01-25 12:11:13,041 epoch 65 - iter 55/111 - loss 1.52757028 - samples/sec: 28.70 - lr: 0.050000 2021-01-25 12:11:18,389 epoch 65 - iter 66/111 - loss 1.54195133 - samples/sec: 32.92 - lr: 0.050000 2021-01-25 12:11:24,374 epoch 65 - iter 77/111 - loss 1.58311635 - samples/sec: 29.42 - lr: 0.050000 2021-01-25 12:11:30,985 epoch 65 - iter 88/111 - loss 1.59606863 - samples/sec: 26.64 - lr: 0.050000 2021-01-25 12:11:36,844 epoch 65 - iter 99/111 - loss 1.61974826 - samples/sec: 30.05 - lr: 0.050000 2021-01-25 12:11:42,969 epoch 65 - iter 110/111 - loss 1.59046479 - samples/sec: 28.75 - lr: 0.050000 2021-01-25 12:11:43,042 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:11:43,044 EPOCH 65 done: loss 1.5793 - lr 0.0500000 2021-01-25 12:11:43,045 BAD EPOCHS (no improvement): 1 2021-01-25 12:12:15,978 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:12:22,464 epoch 66 - iter 11/111 - loss 1.79513772 - samples/sec: 27.26 - lr: 0.050000 2021-01-25 12:12:28,543 epoch 66 - iter 22/111 - loss 1.90367814 - samples/sec: 28.96 - lr: 0.050000 2021-01-25 12:12:34,907 epoch 66 - iter 33/111 - loss 1.72589379 - samples/sec: 27.67 - lr: 0.050000 2021-01-25 12:12:40,988 epoch 66 - iter 44/111 - loss 1.69216063 - samples/sec: 28.96 - lr: 0.050000 2021-01-25 12:12:46,911 epoch 66 - iter 55/111 - loss 1.65075056 - samples/sec: 29.73 - lr: 0.050000 2021-01-25 12:12:53,221 epoch 66 - iter 66/111 - 
loss 1.61192947 - samples/sec: 27.90 - lr: 0.050000 2021-01-25 12:12:59,781 epoch 66 - iter 77/111 - loss 1.55927261 - samples/sec: 26.84 - lr: 0.050000 2021-01-25 12:13:04,976 epoch 66 - iter 88/111 - loss 1.50604288 - samples/sec: 33.89 - lr: 0.050000 2021-01-25 12:13:11,319 epoch 66 - iter 99/111 - loss 1.52530791 - samples/sec: 27.76 - lr: 0.050000 2021-01-25 12:13:17,317 epoch 66 - iter 110/111 - loss 1.55038137 - samples/sec: 29.36 - lr: 0.050000 2021-01-25 12:13:17,366 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:13:17,367 EPOCH 66 done: loss 1.5365 - lr 0.0500000 2021-01-25 12:13:17,369 BAD EPOCHS (no improvement): 2 2021-01-25 12:13:57,777 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:14:04,088 epoch 67 - iter 11/111 - loss 1.51214162 - samples/sec: 28.01 - lr: 0.050000 2021-01-25 12:14:10,544 epoch 67 - iter 22/111 - loss 1.52961288 - samples/sec: 27.27 - lr: 0.050000 2021-01-25 12:14:16,437 epoch 67 - iter 33/111 - loss 1.55562648 - samples/sec: 29.88 - lr: 0.050000 2021-01-25 12:14:21,526 epoch 67 - iter 44/111 - loss 1.51856806 - samples/sec: 34.60 - lr: 0.050000 2021-01-25 12:14:27,248 epoch 67 - iter 55/111 - loss 1.54293376 - samples/sec: 30.77 - lr: 0.050000 2021-01-25 12:14:33,281 epoch 67 - iter 66/111 - loss 1.50277544 - samples/sec: 29.18 - lr: 0.050000 2021-01-25 12:14:39,342 epoch 67 - iter 77/111 - loss 1.50794357 - samples/sec: 29.05 - lr: 0.050000 2021-01-25 12:14:44,751 epoch 67 - iter 88/111 - loss 1.49891747 - samples/sec: 32.55 - lr: 0.050000 2021-01-25 12:14:50,269 epoch 67 - iter 99/111 - loss 1.47982434 - samples/sec: 31.91 - lr: 0.050000 2021-01-25 12:14:56,320 epoch 67 - iter 110/111 - loss 1.51493590 - samples/sec: 29.10 - lr: 0.050000 2021-01-25 12:14:56,449 
---------------------------------------------------------------------------------------------------- 2021-01-25 12:14:56,451 EPOCH 67 done: loss 1.5194 - lr 0.0500000 2021-01-25 12:14:56,452 BAD EPOCHS (no improvement): 0 2021-01-25 12:15:36,459 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:15:43,678 epoch 68 - iter 11/111 - loss 1.43217570 - samples/sec: 24.47 - lr: 0.050000 2021-01-25 12:15:50,337 epoch 68 - iter 22/111 - loss 1.49155569 - samples/sec: 26.44 - lr: 0.050000 2021-01-25 12:15:54,938 epoch 68 - iter 33/111 - loss 1.42927919 - samples/sec: 38.27 - lr: 0.050000 2021-01-25 12:16:00,816 epoch 68 - iter 44/111 - loss 1.45040369 - samples/sec: 29.96 - lr: 0.050000 2021-01-25 12:16:06,456 epoch 68 - iter 55/111 - loss 1.49330547 - samples/sec: 31.22 - lr: 0.050000 2021-01-25 12:16:12,306 epoch 68 - iter 66/111 - loss 1.51564600 - samples/sec: 30.10 - lr: 0.050000 2021-01-25 12:16:17,722 epoch 68 - iter 77/111 - loss 1.50499360 - samples/sec: 32.51 - lr: 0.050000 2021-01-25 12:16:23,306 epoch 68 - iter 88/111 - loss 1.52004683 - samples/sec: 31.53 - lr: 0.050000 2021-01-25 12:16:30,038 epoch 68 - iter 99/111 - loss 1.51634471 - samples/sec: 26.15 - lr: 0.050000 2021-01-25 12:16:35,717 epoch 68 - iter 110/111 - loss 1.51642235 - samples/sec: 31.00 - lr: 0.050000 2021-01-25 12:16:35,807 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:16:35,809 EPOCH 68 done: loss 1.5038 - lr 0.0500000 2021-01-25 12:16:35,810 BAD EPOCHS (no improvement): 0 2021-01-25 12:17:08,298 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:17:23,819 epoch 69 - iter 11/111 - loss 1.46120588 - samples/sec: 25.60 - lr: 0.050000 2021-01-25 12:17:29,891 epoch 69 - iter 
22/111 - loss 1.44839512 - samples/sec: 28.99 - lr: 0.050000 2021-01-25 12:17:34,765 epoch 69 - iter 33/111 - loss 1.38297283 - samples/sec: 36.12 - lr: 0.050000 2021-01-25 12:17:40,116 epoch 69 - iter 44/111 - loss 1.41795017 - samples/sec: 32.91 - lr: 0.050000 2021-01-25 12:17:45,776 epoch 69 - iter 55/111 - loss 1.37325392 - samples/sec: 31.11 - lr: 0.050000 2021-01-25 12:17:51,485 epoch 69 - iter 66/111 - loss 1.38036001 - samples/sec: 30.84 - lr: 0.050000 2021-01-25 12:17:57,869 epoch 69 - iter 77/111 - loss 1.38968347 - samples/sec: 27.58 - lr: 0.050000 2021-01-25 12:19:52,411 epoch 69 - iter 88/111 - loss 1.38374551 - samples/sec: 26.41 - lr: 0.050000 2021-01-25 12:19:58,278 epoch 69 - iter 99/111 - loss 1.41925592 - samples/sec: 30.02 - lr: 0.050000 2021-01-25 12:20:04,925 epoch 69 - iter 110/111 - loss 1.43710668 - samples/sec: 26.49 - lr: 0.050000 2021-01-25 12:20:05,007 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:20:05,008 EPOCH 69 done: loss 1.4243 - lr 0.0500000 2021-01-25 12:20:05,010 BAD EPOCHS (no improvement): 0 2021-01-25 12:20:45,483 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:20:52,255 epoch 70 - iter 11/111 - loss 1.44516954 - samples/sec: 26.16 - lr: 0.050000 2021-01-25 12:20:57,079 epoch 70 - iter 22/111 - loss 1.38778281 - samples/sec: 36.50 - lr: 0.050000 2021-01-25 12:21:03,712 epoch 70 - iter 33/111 - loss 1.52580733 - samples/sec: 26.55 - lr: 0.050000 2021-01-25 12:21:08,702 epoch 70 - iter 44/111 - loss 1.40874570 - samples/sec: 35.28 - lr: 0.050000 2021-01-25 12:21:14,550 epoch 70 - iter 55/111 - loss 1.37405204 - samples/sec: 30.11 - lr: 0.050000 2021-01-25 12:21:20,932 epoch 70 - iter 66/111 - loss 1.40803984 - samples/sec: 27.59 - lr: 0.050000 2021-01-25 12:21:26,944 epoch 70 - iter 77/111 - loss 1.42147764 - samples/sec: 29.28 - 
lr: 0.050000 2021-01-25 12:21:32,824 epoch 70 - iter 88/111 - loss 1.42631083 - samples/sec: 29.94 - lr: 0.050000 2021-01-25 12:21:39,197 epoch 70 - iter 99/111 - loss 1.43104338 - samples/sec: 27.63 - lr: 0.050000 2021-01-25 12:21:44,725 epoch 70 - iter 110/111 - loss 1.43146756 - samples/sec: 31.85 - lr: 0.050000 2021-01-25 12:21:44,914 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:21:44,915 EPOCH 70 done: loss 1.4211 - lr 0.0500000 2021-01-25 12:21:44,916 BAD EPOCHS (no improvement): 0 2021-01-25 12:22:17,445 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:22:22,976 epoch 71 - iter 11/111 - loss 2.00260350 - samples/sec: 31.98 - lr: 0.050000 2021-01-25 12:22:29,761 epoch 71 - iter 22/111 - loss 1.70493918 - samples/sec: 25.95 - lr: 0.050000 2021-01-25 12:22:35,731 epoch 71 - iter 33/111 - loss 1.59428979 - samples/sec: 29.49 - lr: 0.050000 2021-01-25 12:22:42,001 epoch 71 - iter 44/111 - loss 1.59676667 - samples/sec: 28.08 - lr: 0.050000 2021-01-25 12:22:47,396 epoch 71 - iter 55/111 - loss 1.49150653 - samples/sec: 32.64 - lr: 0.050000 2021-01-25 12:22:53,190 epoch 71 - iter 66/111 - loss 1.47964654 - samples/sec: 30.38 - lr: 0.050000 2021-01-25 12:22:59,851 epoch 71 - iter 77/111 - loss 1.48795519 - samples/sec: 26.43 - lr: 0.050000 2021-01-25 12:23:05,477 epoch 71 - iter 88/111 - loss 1.47494126 - samples/sec: 31.30 - lr: 0.050000 2021-01-25 12:23:12,100 epoch 71 - iter 99/111 - loss 1.45548201 - samples/sec: 26.58 - lr: 0.050000 2021-01-25 12:23:18,307 epoch 71 - iter 110/111 - loss 1.42907549 - samples/sec: 28.36 - lr: 0.050000 2021-01-25 12:23:18,550 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:23:18,551 EPOCH 71 done: loss 1.4177 - lr 0.0500000 2021-01-25 12:23:18,552 BAD EPOCHS (no 
improvement): 0 2021-01-25 12:23:59,318 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:24:04,480 epoch 72 - iter 11/111 - loss 1.22978234 - samples/sec: 34.29 - lr: 0.050000 2021-01-25 12:24:11,145 epoch 72 - iter 22/111 - loss 1.39237714 - samples/sec: 26.41 - lr: 0.050000 2021-01-25 12:24:16,552 epoch 72 - iter 33/111 - loss 1.43349531 - samples/sec: 32.56 - lr: 0.050000 2021-01-25 12:24:23,077 epoch 72 - iter 44/111 - loss 1.41425940 - samples/sec: 26.98 - lr: 0.050000 2021-01-25 12:24:28,777 epoch 72 - iter 55/111 - loss 1.44107326 - samples/sec: 30.89 - lr: 0.050000 2021-01-25 12:24:34,775 epoch 72 - iter 66/111 - loss 1.42940595 - samples/sec: 29.35 - lr: 0.050000 2021-01-25 12:24:40,329 epoch 72 - iter 77/111 - loss 1.43958457 - samples/sec: 31.70 - lr: 0.050000 2021-01-25 12:24:46,854 epoch 72 - iter 88/111 - loss 1.45229586 - samples/sec: 26.98 - lr: 0.050000 2021-01-25 12:24:52,907 epoch 72 - iter 99/111 - loss 1.44572394 - samples/sec: 29.09 - lr: 0.050000 2021-01-25 12:24:58,648 epoch 72 - iter 110/111 - loss 1.44074938 - samples/sec: 30.67 - lr: 0.050000 2021-01-25 12:24:58,690 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:24:58,692 EPOCH 72 done: loss 1.4387 - lr 0.0500000 2021-01-25 12:24:58,693 BAD EPOCHS (no improvement): 1 2021-01-25 12:25:39,614 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:25:45,237 epoch 73 - iter 11/111 - loss 1.29489860 - samples/sec: 31.47 - lr: 0.050000 2021-01-25 12:25:51,043 epoch 73 - iter 22/111 - loss 1.38107881 - samples/sec: 30.32 - lr: 0.050000 2021-01-25 12:25:57,015 epoch 73 - iter 33/111 - loss 1.45272314 - samples/sec: 29.48 - lr: 0.050000 2021-01-25 12:26:03,598 epoch 73 - iter 
44/111 - loss 1.45398960 - samples/sec: 26.75 - lr: 0.050000 2021-01-25 12:26:08,903 epoch 73 - iter 55/111 - loss 1.37994436 - samples/sec: 33.19 - lr: 0.050000 2021-01-25 12:26:14,550 epoch 73 - iter 66/111 - loss 1.45808953 - samples/sec: 31.18 - lr: 0.050000 2021-01-25 12:26:21,631 epoch 73 - iter 77/111 - loss 1.53841408 - samples/sec: 24.86 - lr: 0.050000 2021-01-25 12:26:27,564 epoch 73 - iter 88/111 - loss 1.52056252 - samples/sec: 29.68 - lr: 0.050000 2021-01-25 12:26:33,935 epoch 73 - iter 99/111 - loss 1.49743945 - samples/sec: 27.63 - lr: 0.050000 2021-01-25 12:26:39,597 epoch 73 - iter 110/111 - loss 1.49375933 - samples/sec: 31.10 - lr: 0.050000 2021-01-25 12:26:39,644 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:26:39,645 EPOCH 73 done: loss 1.4804 - lr 0.0500000 2021-01-25 12:26:39,647 BAD EPOCHS (no improvement): 2 2021-01-25 12:27:19,117 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:27:24,603 epoch 74 - iter 11/111 - loss 1.18496536 - samples/sec: 32.24 - lr: 0.050000 2021-01-25 12:27:30,495 epoch 74 - iter 22/111 - loss 1.30861269 - samples/sec: 29.88 - lr: 0.050000 2021-01-25 12:27:36,357 epoch 74 - iter 33/111 - loss 1.35666224 - samples/sec: 30.03 - lr: 0.050000 2021-01-25 12:27:41,963 epoch 74 - iter 44/111 - loss 1.43150187 - samples/sec: 31.41 - lr: 0.050000 2021-01-25 12:27:48,820 epoch 74 - iter 55/111 - loss 1.41136729 - samples/sec: 25.67 - lr: 0.050000 2021-01-25 12:27:55,538 epoch 74 - iter 66/111 - loss 1.40032356 - samples/sec: 26.21 - lr: 0.050000 2021-01-25 12:28:01,951 epoch 74 - iter 77/111 - loss 1.37194948 - samples/sec: 27.45 - lr: 0.050000 2021-01-25 12:28:07,992 epoch 74 - iter 88/111 - loss 1.39479629 - samples/sec: 29.14 - lr: 0.050000 2021-01-25 12:28:13,299 epoch 74 - iter 99/111 - loss 1.46021123 - samples/sec: 33.18 - 
lr: 0.050000 2021-01-25 12:28:20,201 epoch 74 - iter 110/111 - loss 1.49201296 - samples/sec: 25.51 - lr: 0.050000 2021-01-25 12:28:20,368 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:28:20,369 EPOCH 74 done: loss 1.4923 - lr 0.0500000 2021-01-25 12:28:20,371 BAD EPOCHS (no improvement): 3 2021-01-25 12:28:59,402 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:29:05,235 epoch 75 - iter 11/111 - loss 1.30246717 - samples/sec: 30.33 - lr: 0.050000 2021-01-25 12:29:11,617 epoch 75 - iter 22/111 - loss 1.54428654 - samples/sec: 27.59 - lr: 0.050000 2021-01-25 12:29:17,551 epoch 75 - iter 33/111 - loss 1.36524455 - samples/sec: 29.67 - lr: 0.050000 2021-01-25 12:29:23,373 epoch 75 - iter 44/111 - loss 1.37502711 - samples/sec: 30.24 - lr: 0.050000 2021-01-25 12:29:28,999 epoch 75 - iter 55/111 - loss 1.29860503 - samples/sec: 31.30 - lr: 0.050000 2021-01-25 12:29:35,953 epoch 75 - iter 66/111 - loss 1.30761962 - samples/sec: 25.32 - lr: 0.050000 2021-01-25 12:29:41,391 epoch 75 - iter 77/111 - loss 1.33167293 - samples/sec: 32.38 - lr: 0.050000 2021-01-25 12:29:46,665 epoch 75 - iter 88/111 - loss 1.33132098 - samples/sec: 33.38 - lr: 0.050000 2021-01-25 12:29:52,907 epoch 75 - iter 99/111 - loss 1.32298015 - samples/sec: 28.20 - lr: 0.050000 2021-01-25 12:29:59,054 epoch 75 - iter 110/111 - loss 1.31384700 - samples/sec: 28.64 - lr: 0.050000 2021-01-25 12:29:59,260 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:29:59,261 EPOCH 75 done: loss 1.3076 - lr 0.0500000 2021-01-25 12:29:59,263 BAD EPOCHS (no improvement): 0 2021-01-25 12:30:36,810 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 
2021-01-25 12:30:42,805 epoch 76 - iter 11/111 - loss 1.33615736 - samples/sec: 29.50 - lr: 0.050000 2021-01-25 12:30:48,327 epoch 76 - iter 22/111 - loss 1.26157622 - samples/sec: 31.89 - lr: 0.050000 2021-01-25 12:30:53,809 epoch 76 - iter 33/111 - loss 1.28425501 - samples/sec: 32.11 - lr: 0.050000 2021-01-25 12:31:00,729 epoch 76 - iter 44/111 - loss 1.25888383 - samples/sec: 25.44 - lr: 0.050000 2021-01-25 12:31:06,217 epoch 76 - iter 55/111 - loss 1.26368131 - samples/sec: 32.08 - lr: 0.050000 2021-01-25 12:31:12,685 epoch 76 - iter 66/111 - loss 1.28373529 - samples/sec: 27.22 - lr: 0.050000 2021-01-25 12:31:18,270 epoch 76 - iter 77/111 - loss 1.28456146 - samples/sec: 31.52 - lr: 0.050000 2021-01-25 12:31:24,420 epoch 76 - iter 88/111 - loss 1.28954828 - samples/sec: 28.63 - lr: 0.050000 2021-01-25 12:31:31,033 epoch 76 - iter 99/111 - loss 1.29948332 - samples/sec: 26.62 - lr: 0.050000 2021-01-25 12:31:37,305 epoch 76 - iter 110/111 - loss 1.33390515 - samples/sec: 28.07 - lr: 0.050000 2021-01-25 12:31:37,399 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:31:37,401 EPOCH 76 done: loss 1.4266 - lr 0.0500000 2021-01-25 12:31:37,403 BAD EPOCHS (no improvement): 1 2021-01-25 12:32:12,559 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:32:17,704 epoch 77 - iter 11/111 - loss 0.98876150 - samples/sec: 34.41 - lr: 0.050000 2021-01-25 12:32:24,636 epoch 77 - iter 22/111 - loss 1.17457396 - samples/sec: 25.40 - lr: 0.050000 2021-01-25 12:32:30,628 epoch 77 - iter 33/111 - loss 1.16359689 - samples/sec: 29.38 - lr: 0.050000 2021-01-25 12:32:36,169 epoch 77 - iter 44/111 - loss 1.29733602 - samples/sec: 31.77 - lr: 0.050000 2021-01-25 12:32:42,034 epoch 77 - iter 55/111 - loss 1.29096426 - samples/sec: 30.02 - lr: 0.050000 2021-01-25 12:32:48,172 epoch 77 - iter 66/111 - 
loss 1.31050428 - samples/sec: 28.69 - lr: 0.050000 2021-01-25 12:32:54,089 epoch 77 - iter 77/111 - loss 1.30478336 - samples/sec: 29.75 - lr: 0.050000 2021-01-25 12:32:59,821 epoch 77 - iter 88/111 - loss 1.30426656 - samples/sec: 30.72 - lr: 0.050000 2021-01-25 12:33:06,093 epoch 77 - iter 99/111 - loss 1.30662050 - samples/sec: 28.07 - lr: 0.050000 2021-01-25 12:33:12,647 epoch 77 - iter 110/111 - loss 1.32664387 - samples/sec: 26.86 - lr: 0.050000 2021-01-25 12:33:12,709 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:33:12,710 EPOCH 77 done: loss 1.3474 - lr 0.0500000 2021-01-25 12:33:12,711 BAD EPOCHS (no improvement): 2 2021-01-25 12:33:50,722 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:33:57,284 epoch 78 - iter 11/111 - loss 1.29088527 - samples/sec: 26.94 - lr: 0.050000 2021-01-25 12:34:02,952 epoch 78 - iter 22/111 - loss 1.31935661 - samples/sec: 31.07 - lr: 0.050000 2021-01-25 12:34:09,106 epoch 78 - iter 33/111 - loss 1.41321889 - samples/sec: 28.61 - lr: 0.050000 2021-01-25 12:34:14,008 epoch 78 - iter 44/111 - loss 1.39564167 - samples/sec: 35.92 - lr: 0.050000 2021-01-25 12:34:19,420 epoch 78 - iter 55/111 - loss 1.34922395 - samples/sec: 32.53 - lr: 0.050000 2021-01-25 12:34:25,401 epoch 78 - iter 66/111 - loss 1.32606339 - samples/sec: 29.44 - lr: 0.050000 2021-01-25 12:34:30,962 epoch 78 - iter 77/111 - loss 1.28154114 - samples/sec: 31.66 - lr: 0.050000 2021-01-25 12:34:37,444 epoch 78 - iter 88/111 - loss 1.34347282 - samples/sec: 27.16 - lr: 0.050000 2021-01-25 12:34:44,194 epoch 78 - iter 99/111 - loss 1.38436094 - samples/sec: 26.08 - lr: 0.050000 2021-01-25 12:34:49,965 epoch 78 - iter 110/111 - loss 1.37429725 - samples/sec: 30.51 - lr: 0.050000 2021-01-25 12:34:50,108 
---------------------------------------------------------------------------------------------------- 2021-01-25 12:34:50,109 EPOCH 78 done: loss 1.3837 - lr 0.0500000 2021-01-25 12:34:50,111 BAD EPOCHS (no improvement): 3 2021-01-25 12:35:23,788 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:35:29,899 epoch 79 - iter 11/111 - loss 1.10409809 - samples/sec: 28.94 - lr: 0.050000 2021-01-25 12:35:36,740 epoch 79 - iter 22/111 - loss 1.17709787 - samples/sec: 25.74 - lr: 0.050000 2021-01-25 12:35:42,238 epoch 79 - iter 33/111 - loss 1.19678932 - samples/sec: 32.03 - lr: 0.050000 2021-01-25 12:35:47,630 epoch 79 - iter 44/111 - loss 1.22716568 - samples/sec: 32.66 - lr: 0.050000 2021-01-25 12:35:54,193 epoch 79 - iter 55/111 - loss 1.27254297 - samples/sec: 26.83 - lr: 0.050000 2021-01-25 12:35:59,749 epoch 79 - iter 66/111 - loss 1.28610185 - samples/sec: 31.69 - lr: 0.050000 2021-01-25 12:36:05,686 epoch 79 - iter 77/111 - loss 1.32574023 - samples/sec: 29.66 - lr: 0.050000 2021-01-25 12:36:11,627 epoch 79 - iter 88/111 - loss 1.34977880 - samples/sec: 29.64 - lr: 0.050000 2021-01-25 12:36:17,715 epoch 79 - iter 99/111 - loss 1.35951786 - samples/sec: 28.92 - lr: 0.050000 2021-01-25 12:36:24,854 epoch 79 - iter 110/111 - loss 1.36119310 - samples/sec: 24.66 - lr: 0.050000 2021-01-25 12:36:24,992 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:36:24,993 EPOCH 79 done: loss 1.3646 - lr 0.0500000 Epoch 79: reducing learning rate of group 0 to 2.5000e-02. 
2021-01-25 12:36:24,995 BAD EPOCHS (no improvement): 4 2021-01-25 12:36:59,260 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:37:04,238 epoch 80 - iter 11/111 - loss 1.24194489 - samples/sec: 35.58 - lr: 0.025000 2021-01-25 12:37:12,399 epoch 80 - iter 22/111 - loss 1.32671146 - samples/sec: 30.09 - lr: 0.025000 2021-01-25 12:37:18,386 epoch 80 - iter 33/111 - loss 1.30666058 - samples/sec: 29.41 - lr: 0.025000 2021-01-25 12:37:24,855 epoch 80 - iter 44/111 - loss 1.27289238 - samples/sec: 27.22 - lr: 0.025000 2021-01-25 12:37:31,314 epoch 80 - iter 55/111 - loss 1.27664626 - samples/sec: 27.26 - lr: 0.025000 2021-01-25 12:37:37,028 epoch 80 - iter 66/111 - loss 1.27360224 - samples/sec: 30.82 - lr: 0.025000 2021-01-25 12:37:43,728 epoch 80 - iter 77/111 - loss 1.30810210 - samples/sec: 26.28 - lr: 0.025000 2021-01-25 12:37:48,715 epoch 80 - iter 88/111 - loss 1.28021682 - samples/sec: 35.32 - lr: 0.025000 2021-01-25 12:37:54,785 epoch 80 - iter 99/111 - loss 1.24763311 - samples/sec: 29.00 - lr: 0.025000 2021-01-25 12:38:00,486 epoch 80 - iter 110/111 - loss 1.23794550 - samples/sec: 30.89 - lr: 0.025000 2021-01-25 12:38:00,539 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:38:00,540 EPOCH 80 done: loss 1.2270 - lr 0.0250000 2021-01-25 12:38:00,541 BAD EPOCHS (no improvement): 0 2021-01-25 12:38:35,050 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:38:42,032 epoch 81 - iter 11/111 - loss 1.33562588 - samples/sec: 25.31 - lr: 0.025000 2021-01-25 12:38:48,816 epoch 81 - iter 22/111 - loss 1.46655772 - samples/sec: 25.95 - lr: 0.025000 2021-01-25 12:38:55,174 epoch 81 - iter 33/111 - loss 1.30066396 - samples/sec: 27.69 - lr: 0.025000 
2021-01-25 12:39:00,930 epoch 81 - iter 44/111 - loss 1.23842647 - samples/sec: 30.59 - lr: 0.025000 2021-01-25 12:39:06,494 epoch 81 - iter 55/111 - loss 1.22651872 - samples/sec: 31.65 - lr: 0.025000 2021-01-25 12:39:12,684 epoch 81 - iter 66/111 - loss 1.19976663 - samples/sec: 28.44 - lr: 0.025000 2021-01-25 12:39:18,015 epoch 81 - iter 77/111 - loss 1.18203226 - samples/sec: 33.03 - lr: 0.025000 2021-01-25 12:39:24,241 epoch 81 - iter 88/111 - loss 1.19300186 - samples/sec: 28.28 - lr: 0.025000 2021-01-25 12:39:29,658 epoch 81 - iter 99/111 - loss 1.20260931 - samples/sec: 32.50 - lr: 0.025000 2021-01-25 12:39:35,189 epoch 81 - iter 110/111 - loss 1.18722751 - samples/sec: 31.83 - lr: 0.025000 2021-01-25 12:39:35,261 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:39:35,262 EPOCH 81 done: loss 1.1765 - lr 0.0250000 2021-01-25 12:39:35,264 BAD EPOCHS (no improvement): 0 2021-01-25 12:40:12,964 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:40:19,513 epoch 82 - iter 11/111 - loss 1.05421389 - samples/sec: 26.99 - lr: 0.025000 2021-01-25 12:40:25,330 epoch 82 - iter 22/111 - loss 1.00502741 - samples/sec: 30.27 - lr: 0.025000 2021-01-25 12:40:31,044 epoch 82 - iter 33/111 - loss 1.04035564 - samples/sec: 30.81 - lr: 0.025000 2021-01-25 12:40:37,956 epoch 82 - iter 44/111 - loss 1.08949315 - samples/sec: 25.47 - lr: 0.025000 2021-01-25 12:40:44,555 epoch 82 - iter 55/111 - loss 1.12894638 - samples/sec: 26.68 - lr: 0.025000 2021-01-25 12:40:51,918 epoch 82 - iter 66/111 - loss 1.13150503 - samples/sec: 23.91 - lr: 0.025000 2021-01-25 12:40:57,240 epoch 82 - iter 77/111 - loss 1.13121013 - samples/sec: 33.08 - lr: 0.025000 2021-01-25 12:41:02,235 epoch 82 - iter 88/111 - loss 1.08271775 - samples/sec: 35.25 - lr: 0.025000 2021-01-25 12:41:08,495 epoch 82 - iter 99/111 - 
loss 1.08136557 - samples/sec: 28.12 - lr: 0.025000 2021-01-25 12:41:14,669 epoch 82 - iter 110/111 - loss 1.08425220 - samples/sec: 28.52 - lr: 0.025000 2021-01-25 12:41:14,785 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:41:14,786 EPOCH 82 done: loss 1.0751 - lr 0.0250000 2021-01-25 12:41:14,787 BAD EPOCHS (no improvement): 0 2021-01-25 12:41:48,715 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:41:55,554 epoch 83 - iter 11/111 - loss 0.94434114 - samples/sec: 25.84 - lr: 0.025000 2021-01-25 12:42:01,527 epoch 83 - iter 22/111 - loss 0.90104175 - samples/sec: 29.48 - lr: 0.025000 2021-01-25 12:42:09,255 epoch 83 - iter 33/111 - loss 0.93821691 - samples/sec: 22.78 - lr: 0.025000 2021-01-25 12:42:15,150 epoch 83 - iter 44/111 - loss 0.96491798 - samples/sec: 29.87 - lr: 0.025000 2021-01-25 12:42:21,144 epoch 83 - iter 55/111 - loss 0.96957205 - samples/sec: 29.38 - lr: 0.025000 2021-01-25 12:42:28,518 epoch 83 - iter 66/111 - loss 1.03643710 - samples/sec: 23.88 - lr: 0.025000 2021-01-25 12:42:35,475 epoch 83 - iter 77/111 - loss 1.04713983 - samples/sec: 25.31 - lr: 0.025000 2021-01-25 12:42:41,878 epoch 83 - iter 88/111 - loss 1.07007285 - samples/sec: 27.50 - lr: 0.025000 2021-01-25 12:42:48,915 epoch 83 - iter 99/111 - loss 1.07134652 - samples/sec: 25.02 - lr: 0.025000 2021-01-25 12:42:55,515 epoch 83 - iter 110/111 - loss 1.08769437 - samples/sec: 26.67 - lr: 0.025000 2021-01-25 12:42:55,698 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:42:55,700 EPOCH 83 done: loss 1.0788 - lr 0.0250000 2021-01-25 12:42:55,701 BAD EPOCHS (no improvement): 1 2021-01-25 12:43:27,271 ---------------------------------------------------------------------------------------------------- train mode resetting 
embeddings train mode resetting embeddings 2021-01-25 12:43:41,555 epoch 84 - iter 11/111 - loss 0.89515281 - samples/sec: 26.34 - lr: 0.025000 2021-01-25 12:43:48,188 epoch 84 - iter 22/111 - loss 1.02019271 - samples/sec: 26.54 - lr: 0.025000 2021-01-25 12:43:55,041 epoch 84 - iter 33/111 - loss 1.05338433 - samples/sec: 25.69 - lr: 0.025000 2021-01-25 12:44:01,054 epoch 84 - iter 44/111 - loss 0.97330595 - samples/sec: 29.29 - lr: 0.025000 2021-01-25 12:44:07,977 epoch 84 - iter 55/111 - loss 1.00204318 - samples/sec: 25.43 - lr: 0.025000 2021-01-25 12:44:14,988 epoch 84 - iter 66/111 - loss 1.00526839 - samples/sec: 25.11 - lr: 0.025000 2021-01-25 12:44:20,739 epoch 84 - iter 77/111 - loss 1.01961758 - samples/sec: 30.63 - lr: 0.025000 2021-01-25 12:44:27,577 epoch 84 - iter 88/111 - loss 1.02706470 - samples/sec: 25.75 - lr: 0.025000 2021-01-25 12:44:34,018 epoch 84 - iter 99/111 - loss 1.02225553 - samples/sec: 27.34 - lr: 0.025000 2021-01-25 12:44:40,057 epoch 84 - iter 110/111 - loss 1.01489627 - samples/sec: 29.16 - lr: 0.025000 2021-01-25 12:44:40,118 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:44:40,120 EPOCH 84 done: loss 1.0322 - lr 0.0250000 2021-01-25 12:44:40,121 BAD EPOCHS (no improvement): 0 2021-01-25 12:45:21,911 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:45:27,848 epoch 85 - iter 11/111 - loss 1.20281910 - samples/sec: 29.80 - lr: 0.025000 2021-01-25 12:45:34,615 epoch 85 - iter 22/111 - loss 1.07895691 - samples/sec: 26.02 - lr: 0.025000 2021-01-25 12:45:41,436 epoch 85 - iter 33/111 - loss 1.05872281 - samples/sec: 25.81 - lr: 0.025000 2021-01-25 12:45:47,412 epoch 85 - iter 44/111 - loss 0.98374504 - samples/sec: 29.47 - lr: 0.025000 2021-01-25 12:45:53,993 epoch 85 - iter 55/111 - loss 1.00330201 - samples/sec: 26.75 - lr: 0.025000 
2021-01-25 12:46:00,108 epoch 85 - iter 66/111 - loss 1.00110223 - samples/sec: 28.79 - lr: 0.025000 2021-01-25 12:46:07,952 epoch 85 - iter 77/111 - loss 0.99436903 - samples/sec: 22.45 - lr: 0.025000 2021-01-25 12:46:14,462 epoch 85 - iter 88/111 - loss 1.01154200 - samples/sec: 27.04 - lr: 0.025000 2021-01-25 12:46:20,363 epoch 85 - iter 99/111 - loss 1.01468465 - samples/sec: 29.84 - lr: 0.025000 2021-01-25 12:46:27,104 epoch 85 - iter 110/111 - loss 1.04007535 - samples/sec: 26.11 - lr: 0.025000 2021-01-25 12:46:27,256 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:46:27,258 EPOCH 85 done: loss 1.0516 - lr 0.0250000 2021-01-25 12:46:27,262 BAD EPOCHS (no improvement): 1 2021-01-25 12:47:06,676 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:47:13,059 epoch 86 - iter 11/111 - loss 0.94930763 - samples/sec: 27.71 - lr: 0.025000 2021-01-25 12:47:19,805 epoch 86 - iter 22/111 - loss 0.97583853 - samples/sec: 26.10 - lr: 0.025000 2021-01-25 12:47:26,847 epoch 86 - iter 33/111 - loss 0.96932490 - samples/sec: 25.00 - lr: 0.025000 2021-01-25 12:47:32,771 epoch 86 - iter 44/111 - loss 0.93531939 - samples/sec: 29.73 - lr: 0.025000 2021-01-25 12:47:38,945 epoch 86 - iter 55/111 - loss 0.92774620 - samples/sec: 28.52 - lr: 0.025000 2021-01-25 12:47:46,133 epoch 86 - iter 66/111 - loss 0.94981243 - samples/sec: 24.49 - lr: 0.025000 2021-01-25 12:47:53,488 epoch 86 - iter 77/111 - loss 0.99393307 - samples/sec: 23.94 - lr: 0.025000 2021-01-25 12:48:00,003 epoch 86 - iter 88/111 - loss 1.00942624 - samples/sec: 27.02 - lr: 0.025000 2021-01-25 12:48:06,088 epoch 86 - iter 99/111 - loss 1.06013848 - samples/sec: 28.93 - lr: 0.025000 2021-01-25 12:48:11,186 epoch 86 - iter 110/111 - loss 1.04540570 - samples/sec: 34.54 - lr: 0.025000 2021-01-25 12:48:11,257 
---------------------------------------------------------------------------------------------------- 2021-01-25 12:48:11,258 EPOCH 86 done: loss 1.0363 - lr 0.0250000 2021-01-25 12:48:11,259 BAD EPOCHS (no improvement): 2 2021-01-25 12:48:52,354 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:48:58,412 epoch 87 - iter 11/111 - loss 1.04858258 - samples/sec: 29.19 - lr: 0.025000 2021-01-25 12:49:04,272 epoch 87 - iter 22/111 - loss 1.16057113 - samples/sec: 30.05 - lr: 0.025000 2021-01-25 12:49:11,947 epoch 87 - iter 33/111 - loss 1.21746160 - samples/sec: 22.94 - lr: 0.025000 2021-01-25 12:49:17,837 epoch 87 - iter 44/111 - loss 1.11503506 - samples/sec: 29.89 - lr: 0.025000 2021-01-25 12:49:24,377 epoch 87 - iter 55/111 - loss 1.07564666 - samples/sec: 26.92 - lr: 0.025000 2021-01-25 12:49:31,481 epoch 87 - iter 66/111 - loss 1.08644686 - samples/sec: 24.78 - lr: 0.025000 2021-01-25 12:49:39,195 epoch 87 - iter 77/111 - loss 1.05397260 - samples/sec: 22.82 - lr: 0.025000 2021-01-25 12:49:45,421 epoch 87 - iter 88/111 - loss 1.03656549 - samples/sec: 28.28 - lr: 0.025000 2021-01-25 12:49:52,656 epoch 87 - iter 99/111 - loss 1.03412277 - samples/sec: 24.33 - lr: 0.025000 2021-01-25 12:49:58,463 epoch 87 - iter 110/111 - loss 1.01925475 - samples/sec: 30.32 - lr: 0.025000 2021-01-25 12:49:58,562 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:49:58,564 EPOCH 87 done: loss 1.0181 - lr 0.0250000 2021-01-25 12:49:58,565 BAD EPOCHS (no improvement): 0 2021-01-25 12:50:36,664 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:50:43,288 epoch 88 - iter 11/111 - loss 0.84496240 - samples/sec: 26.71 - lr: 0.025000 2021-01-25 12:50:49,520 epoch 88 - iter 
22/111 - loss 1.03849998 - samples/sec: 28.25 - lr: 0.025000 2021-01-25 12:50:54,645 epoch 88 - iter 33/111 - loss 0.98924814 - samples/sec: 34.38 - lr: 0.025000 2021-01-25 12:51:02,978 epoch 88 - iter 44/111 - loss 1.02678045 - samples/sec: 21.13 - lr: 0.025000 2021-01-25 12:51:08,941 epoch 88 - iter 55/111 - loss 1.01480722 - samples/sec: 29.53 - lr: 0.025000 2021-01-25 12:51:16,221 epoch 88 - iter 66/111 - loss 1.02007116 - samples/sec: 24.18 - lr: 0.025000 2021-01-25 12:51:23,112 epoch 88 - iter 77/111 - loss 0.99199804 - samples/sec: 25.55 - lr: 0.025000 2021-01-25 12:51:30,354 epoch 88 - iter 88/111 - loss 1.01447212 - samples/sec: 24.31 - lr: 0.025000 2021-01-25 12:51:36,458 epoch 88 - iter 99/111 - loss 1.00881708 - samples/sec: 28.84 - lr: 0.025000 2021-01-25 12:51:42,730 epoch 88 - iter 110/111 - loss 1.00967457 - samples/sec: 28.07 - lr: 0.025000 2021-01-25 12:51:42,876 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:51:42,877 EPOCH 88 done: loss 1.0161 - lr 0.0250000 2021-01-25 12:51:42,879 BAD EPOCHS (no improvement): 0 2021-01-25 12:52:16,948 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:52:22,652 epoch 89 - iter 11/111 - loss 1.01192850 - samples/sec: 31.02 - lr: 0.025000 2021-01-25 12:52:29,833 epoch 89 - iter 22/111 - loss 0.97130948 - samples/sec: 24.52 - lr: 0.025000 2021-01-25 12:52:36,670 epoch 89 - iter 33/111 - loss 1.03705973 - samples/sec: 25.76 - lr: 0.025000 2021-01-25 12:52:43,108 epoch 89 - iter 44/111 - loss 0.99423095 - samples/sec: 27.35 - lr: 0.025000 2021-01-25 12:52:50,388 epoch 89 - iter 55/111 - loss 1.00434195 - samples/sec: 24.18 - lr: 0.025000 2021-01-25 12:52:57,068 epoch 89 - iter 66/111 - loss 0.98344049 - samples/sec: 26.36 - lr: 0.025000 2021-01-25 12:53:04,125 epoch 89 - iter 77/111 - loss 0.97784265 - samples/sec: 24.95 - 
lr: 0.025000 2021-01-25 12:53:10,688 epoch 89 - iter 88/111 - loss 0.93191730 - samples/sec: 26.83 - lr: 0.025000 2021-01-25 12:53:15,940 epoch 89 - iter 99/111 - loss 0.94655101 - samples/sec: 33.53 - lr: 0.025000 2021-01-25 12:53:21,888 epoch 89 - iter 110/111 - loss 0.96525188 - samples/sec: 29.60 - lr: 0.025000 2021-01-25 12:53:22,161 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:53:22,162 EPOCH 89 done: loss 0.9703 - lr 0.0250000 2021-01-25 12:53:22,163 BAD EPOCHS (no improvement): 0 2021-01-25 12:54:02,255 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:54:07,327 epoch 90 - iter 11/111 - loss 0.89396156 - samples/sec: 34.91 - lr: 0.025000 2021-01-25 12:54:12,901 epoch 90 - iter 22/111 - loss 0.94856649 - samples/sec: 31.58 - lr: 0.025000 2021-01-25 12:54:19,321 epoch 90 - iter 33/111 - loss 0.85900531 - samples/sec: 27.43 - lr: 0.025000 2021-01-25 12:54:26,733 epoch 90 - iter 44/111 - loss 0.91310688 - samples/sec: 23.75 - lr: 0.025000 2021-01-25 12:54:33,402 epoch 90 - iter 55/111 - loss 0.91476873 - samples/sec: 26.40 - lr: 0.025000 2021-01-25 12:54:41,020 epoch 90 - iter 66/111 - loss 0.94598620 - samples/sec: 23.11 - lr: 0.025000 2021-01-25 12:54:47,872 epoch 90 - iter 77/111 - loss 0.95036292 - samples/sec: 25.70 - lr: 0.025000 2021-01-25 12:54:54,259 epoch 90 - iter 88/111 - loss 0.98239688 - samples/sec: 27.57 - lr: 0.025000 2021-01-25 12:55:00,875 epoch 90 - iter 99/111 - loss 0.97952184 - samples/sec: 26.61 - lr: 0.025000 2021-01-25 12:55:07,433 epoch 90 - iter 110/111 - loss 0.97337086 - samples/sec: 26.85 - lr: 0.025000 2021-01-25 12:55:07,512 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:55:07,514 EPOCH 90 done: loss 0.9669 - lr 0.0250000 2021-01-25 12:55:07,514 BAD EPOCHS (no 
improvement): 0 2021-01-25 12:55:48,026 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:55:53,469 epoch 91 - iter 11/111 - loss 1.12259874 - samples/sec: 32.51 - lr: 0.025000 2021-01-25 12:55:59,764 epoch 91 - iter 22/111 - loss 1.01117295 - samples/sec: 27.97 - lr: 0.025000 2021-01-25 12:56:06,343 epoch 91 - iter 33/111 - loss 0.94316753 - samples/sec: 26.76 - lr: 0.025000 2021-01-25 12:56:11,776 epoch 91 - iter 44/111 - loss 0.91113899 - samples/sec: 32.41 - lr: 0.025000 2021-01-25 12:56:19,114 epoch 91 - iter 55/111 - loss 0.90855847 - samples/sec: 23.99 - lr: 0.025000 2021-01-25 12:56:26,490 epoch 91 - iter 66/111 - loss 0.87885021 - samples/sec: 23.87 - lr: 0.025000 2021-01-25 12:56:32,911 epoch 91 - iter 77/111 - loss 0.89054928 - samples/sec: 27.42 - lr: 0.025000 2021-01-25 12:56:40,159 epoch 91 - iter 88/111 - loss 0.91477145 - samples/sec: 24.29 - lr: 0.025000 2021-01-25 12:56:46,126 epoch 91 - iter 99/111 - loss 0.91054353 - samples/sec: 29.50 - lr: 0.025000 2021-01-25 12:56:52,046 epoch 91 - iter 110/111 - loss 0.89999675 - samples/sec: 29.74 - lr: 0.025000 2021-01-25 12:56:52,107 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:56:52,108 EPOCH 91 done: loss 0.8933 - lr 0.0250000 2021-01-25 12:56:52,109 BAD EPOCHS (no improvement): 0 2021-01-25 12:57:24,370 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:57:31,155 epoch 92 - iter 11/111 - loss 0.91195044 - samples/sec: 26.05 - lr: 0.025000 2021-01-25 12:57:39,276 epoch 92 - iter 22/111 - loss 0.92761843 - samples/sec: 28.09 - lr: 0.025000 2021-01-25 12:57:46,538 epoch 92 - iter 33/111 - loss 1.01224054 - samples/sec: 24.25 - lr: 0.025000 2021-01-25 12:57:53,075 epoch 92 - iter 
44/111 - loss 0.93886627 - samples/sec: 26.93 - lr: 0.025000 2021-01-25 12:57:59,335 epoch 92 - iter 55/111 - loss 0.96555115 - samples/sec: 28.13 - lr: 0.025000 2021-01-25 12:58:05,990 epoch 92 - iter 66/111 - loss 0.93257322 - samples/sec: 26.45 - lr: 0.025000 2021-01-25 12:58:13,810 epoch 92 - iter 77/111 - loss 0.96046350 - samples/sec: 22.51 - lr: 0.025000 2021-01-25 12:58:19,682 epoch 92 - iter 88/111 - loss 0.97479221 - samples/sec: 29.99 - lr: 0.025000 2021-01-25 12:58:25,353 epoch 92 - iter 99/111 - loss 0.95190705 - samples/sec: 31.05 - lr: 0.025000 2021-01-25 12:58:31,006 epoch 92 - iter 110/111 - loss 0.94280147 - samples/sec: 31.14 - lr: 0.025000 2021-01-25 12:58:31,203 ---------------------------------------------------------------------------------------------------- 2021-01-25 12:58:31,204 EPOCH 92 done: loss 0.9345 - lr 0.0250000 2021-01-25 12:58:31,206 BAD EPOCHS (no improvement): 1 2021-01-25 12:59:04,472 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 12:59:10,611 epoch 93 - iter 11/111 - loss 0.76319311 - samples/sec: 28.91 - lr: 0.025000 2021-01-25 12:59:21,507 epoch 93 - iter 22/111 - loss 0.80824860 - samples/sec: 21.30 - lr: 0.025000 2021-01-25 12:59:28,277 epoch 93 - iter 33/111 - loss 0.87222931 - samples/sec: 26.01 - lr: 0.025000 2021-01-25 12:59:36,146 epoch 93 - iter 44/111 - loss 0.93845968 - samples/sec: 22.37 - lr: 0.025000 2021-01-25 12:59:43,110 epoch 93 - iter 55/111 - loss 0.97216012 - samples/sec: 25.28 - lr: 0.025000 2021-01-25 12:59:49,391 epoch 93 - iter 66/111 - loss 0.94475628 - samples/sec: 28.03 - lr: 0.025000 2021-01-25 12:59:55,729 epoch 93 - iter 77/111 - loss 0.93116077 - samples/sec: 27.78 - lr: 0.025000 2021-01-25 13:00:00,918 epoch 93 - iter 88/111 - loss 0.92936302 - samples/sec: 33.94 - lr: 0.025000 2021-01-25 13:00:06,547 epoch 93 - iter 99/111 - loss 0.94155475 - samples/sec: 31.28 - 
lr: 0.025000 2021-01-25 13:00:12,429 epoch 93 - iter 110/111 - loss 0.94386096 - samples/sec: 29.93 - lr: 0.025000 2021-01-25 13:00:12,475 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:00:12,476 EPOCH 93 done: loss 0.9385 - lr 0.0250000 2021-01-25 13:00:12,478 BAD EPOCHS (no improvement): 2 2021-01-25 13:00:53,789 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:01:00,272 epoch 94 - iter 11/111 - loss 0.80060596 - samples/sec: 27.33 - lr: 0.025000 2021-01-25 13:01:05,867 epoch 94 - iter 22/111 - loss 0.76706426 - samples/sec: 31.47 - lr: 0.025000 2021-01-25 13:01:11,599 epoch 94 - iter 33/111 - loss 0.81543036 - samples/sec: 30.72 - lr: 0.025000 2021-01-25 13:01:18,936 epoch 94 - iter 44/111 - loss 0.88398113 - samples/sec: 24.00 - lr: 0.025000 2021-01-25 13:01:25,885 epoch 94 - iter 55/111 - loss 0.87736669 - samples/sec: 25.34 - lr: 0.025000 2021-01-25 13:01:33,393 epoch 94 - iter 66/111 - loss 0.84505906 - samples/sec: 23.45 - lr: 0.025000 2021-01-25 13:01:40,200 epoch 94 - iter 77/111 - loss 0.84360749 - samples/sec: 25.86 - lr: 0.025000 2021-01-25 13:01:45,652 epoch 94 - iter 88/111 - loss 0.86948546 - samples/sec: 32.30 - lr: 0.025000 2021-01-25 13:01:51,536 epoch 94 - iter 99/111 - loss 0.87685072 - samples/sec: 29.92 - lr: 0.025000 2021-01-25 13:01:57,924 epoch 94 - iter 110/111 - loss 0.86960360 - samples/sec: 27.56 - lr: 0.025000 2021-01-25 13:01:57,988 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:01:57,989 EPOCH 94 done: loss 0.8618 - lr 0.0250000 2021-01-25 13:01:57,990 BAD EPOCHS (no improvement): 0 2021-01-25 13:02:36,161 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 
2021-01-25 13:02:42,494 epoch 95 - iter 11/111 - loss 1.05845071 - samples/sec: 27.92 - lr: 0.025000 2021-01-25 13:02:48,386 epoch 95 - iter 22/111 - loss 0.92435220 - samples/sec: 29.88 - lr: 0.025000 2021-01-25 13:02:55,476 epoch 95 - iter 33/111 - loss 1.01854410 - samples/sec: 24.84 - lr: 0.025000 2021-01-25 13:03:02,522 epoch 95 - iter 44/111 - loss 0.95836734 - samples/sec: 24.99 - lr: 0.025000 2021-01-25 13:03:10,668 epoch 95 - iter 55/111 - loss 0.92500736 - samples/sec: 21.61 - lr: 0.025000 2021-01-25 13:03:17,699 epoch 95 - iter 66/111 - loss 0.89741766 - samples/sec: 25.04 - lr: 0.025000 2021-01-25 13:03:23,281 epoch 95 - iter 77/111 - loss 0.92225909 - samples/sec: 31.55 - lr: 0.025000 2021-01-25 13:03:29,258 epoch 95 - iter 88/111 - loss 0.93587268 - samples/sec: 29.46 - lr: 0.025000 2021-01-25 13:03:34,565 epoch 95 - iter 99/111 - loss 0.90957047 - samples/sec: 33.18 - lr: 0.025000 2021-01-25 13:03:40,620 epoch 95 - iter 110/111 - loss 0.89800110 - samples/sec: 29.08 - lr: 0.025000 2021-01-25 13:03:40,720 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:03:40,722 EPOCH 95 done: loss 0.9710 - lr 0.0250000 2021-01-25 13:03:40,723 BAD EPOCHS (no improvement): 1 2021-01-25 13:04:21,198 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:04:27,384 epoch 96 - iter 11/111 - loss 0.80467089 - samples/sec: 28.58 - lr: 0.025000 2021-01-25 13:04:33,387 epoch 96 - iter 22/111 - loss 0.86729766 - samples/sec: 29.33 - lr: 0.025000 2021-01-25 13:04:40,351 epoch 96 - iter 33/111 - loss 0.88488515 - samples/sec: 25.28 - lr: 0.025000 2021-01-25 13:04:46,698 epoch 96 - iter 44/111 - loss 0.88125065 - samples/sec: 27.74 - lr: 0.025000 2021-01-25 13:04:53,369 epoch 96 - iter 55/111 - loss 0.86685654 - samples/sec: 26.39 - lr: 0.025000 2021-01-25 13:04:59,839 epoch 96 - iter 66/111 - 
loss 0.82779145 - samples/sec: 27.21 - lr: 0.025000 2021-01-25 13:05:05,934 epoch 96 - iter 77/111 - loss 0.83271972 - samples/sec: 28.88 - lr: 0.025000 2021-01-25 13:05:12,058 epoch 96 - iter 88/111 - loss 0.84438797 - samples/sec: 28.75 - lr: 0.025000 2021-01-25 13:05:18,159 epoch 96 - iter 99/111 - loss 0.83008090 - samples/sec: 28.86 - lr: 0.025000 2021-01-25 13:05:25,296 epoch 96 - iter 110/111 - loss 0.84527026 - samples/sec: 24.67 - lr: 0.025000 2021-01-25 13:05:25,418 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:05:25,419 EPOCH 96 done: loss 0.8442 - lr 0.0250000 2021-01-25 13:05:25,420 BAD EPOCHS (no improvement): 0 2021-01-25 13:06:04,455 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:06:10,219 epoch 97 - iter 11/111 - loss 0.83351330 - samples/sec: 30.70 - lr: 0.025000 2021-01-25 13:06:15,321 epoch 97 - iter 22/111 - loss 0.83220994 - samples/sec: 34.51 - lr: 0.025000 2021-01-25 13:06:23,308 epoch 97 - iter 33/111 - loss 0.94219524 - samples/sec: 22.04 - lr: 0.025000 2021-01-25 13:06:29,764 epoch 97 - iter 44/111 - loss 0.90053729 - samples/sec: 27.27 - lr: 0.025000 2021-01-25 13:06:36,276 epoch 97 - iter 55/111 - loss 0.87480188 - samples/sec: 27.04 - lr: 0.025000 2021-01-25 13:06:42,959 epoch 97 - iter 66/111 - loss 0.88492908 - samples/sec: 26.34 - lr: 0.025000 2021-01-25 13:06:49,641 epoch 97 - iter 77/111 - loss 0.88339231 - samples/sec: 26.35 - lr: 0.025000 2021-01-25 13:06:56,023 epoch 97 - iter 88/111 - loss 0.86654037 - samples/sec: 27.59 - lr: 0.025000 2021-01-25 13:07:03,367 epoch 97 - iter 99/111 - loss 0.87654721 - samples/sec: 23.97 - lr: 0.025000 2021-01-25 13:07:09,728 epoch 97 - iter 110/111 - loss 0.85163859 - samples/sec: 27.68 - lr: 0.025000 2021-01-25 13:07:09,844 
---------------------------------------------------------------------------------------------------- 2021-01-25 13:07:09,846 EPOCH 97 done: loss 0.8613 - lr 0.0250000 2021-01-25 13:07:09,847 BAD EPOCHS (no improvement): 1 2021-01-25 13:07:50,879 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:07:58,252 epoch 98 - iter 11/111 - loss 0.82806001 - samples/sec: 23.98 - lr: 0.025000 2021-01-25 13:08:05,020 epoch 98 - iter 22/111 - loss 0.86607333 - samples/sec: 26.02 - lr: 0.025000 2021-01-25 13:08:11,580 epoch 98 - iter 33/111 - loss 0.94919064 - samples/sec: 26.84 - lr: 0.025000 2021-01-25 13:08:19,165 epoch 98 - iter 44/111 - loss 0.92271694 - samples/sec: 23.22 - lr: 0.025000 2021-01-25 13:08:25,986 epoch 98 - iter 55/111 - loss 0.89980591 - samples/sec: 25.81 - lr: 0.025000 2021-01-25 13:08:32,172 epoch 98 - iter 66/111 - loss 0.88184847 - samples/sec: 28.46 - lr: 0.025000 2021-01-25 13:08:38,899 epoch 98 - iter 77/111 - loss 0.87890078 - samples/sec: 26.17 - lr: 0.025000 2021-01-25 13:08:46,184 epoch 98 - iter 88/111 - loss 0.87076851 - samples/sec: 24.17 - lr: 0.025000 2021-01-25 13:08:51,861 epoch 98 - iter 99/111 - loss 0.87326085 - samples/sec: 31.02 - lr: 0.025000 2021-01-25 13:08:57,961 epoch 98 - iter 110/111 - loss 0.86002073 - samples/sec: 28.86 - lr: 0.025000 2021-01-25 13:08:58,063 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:08:58,064 EPOCH 98 done: loss 0.8523 - lr 0.0250000 2021-01-25 13:08:58,066 BAD EPOCHS (no improvement): 2 2021-01-25 13:09:34,666 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:09:41,057 epoch 99 - iter 11/111 - loss 0.80923878 - samples/sec: 27.66 - lr: 0.025000 2021-01-25 13:09:47,900 epoch 99 - iter 
22/111 - loss 0.82019573 - samples/sec: 25.73 - lr: 0.025000 2021-01-25 13:09:55,133 epoch 99 - iter 33/111 - loss 0.87697485 - samples/sec: 24.34 - lr: 0.025000 2021-01-25 13:10:01,103 epoch 99 - iter 44/111 - loss 0.84858296 - samples/sec: 29.49 - lr: 0.025000 2021-01-25 13:10:08,094 epoch 99 - iter 55/111 - loss 0.84927820 - samples/sec: 25.18 - lr: 0.025000 2021-01-25 13:10:13,638 epoch 99 - iter 66/111 - loss 0.83019196 - samples/sec: 31.76 - lr: 0.025000 2021-01-25 13:10:19,677 epoch 99 - iter 77/111 - loss 0.83874743 - samples/sec: 29.16 - lr: 0.025000 2021-01-25 13:10:26,329 epoch 99 - iter 88/111 - loss 0.82511558 - samples/sec: 26.47 - lr: 0.025000 2021-01-25 13:10:33,761 epoch 99 - iter 99/111 - loss 0.83035890 - samples/sec: 23.69 - lr: 0.025000 2021-01-25 13:10:39,933 epoch 99 - iter 110/111 - loss 0.81092254 - samples/sec: 28.53 - lr: 0.025000 2021-01-25 13:10:40,088 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:10:40,089 EPOCH 99 done: loss 0.8109 - lr 0.0250000 2021-01-25 13:10:40,091 BAD EPOCHS (no improvement): 0 2021-01-25 13:11:18,404 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:11:24,829 epoch 100 - iter 11/111 - loss 1.23233911 - samples/sec: 27.53 - lr: 0.025000 2021-01-25 13:11:30,887 epoch 100 - iter 22/111 - loss 0.95968962 - samples/sec: 29.06 - lr: 0.025000 2021-01-25 13:11:37,798 epoch 100 - iter 33/111 - loss 0.90293363 - samples/sec: 25.48 - lr: 0.025000 2021-01-25 13:11:44,710 epoch 100 - iter 44/111 - loss 0.87274729 - samples/sec: 25.47 - lr: 0.025000 2021-01-25 13:11:51,196 epoch 100 - iter 55/111 - loss 0.91466900 - samples/sec: 27.15 - lr: 0.025000 2021-01-25 13:11:57,685 epoch 100 - iter 66/111 - loss 0.92499208 - samples/sec: 27.13 - lr: 0.025000 2021-01-25 13:12:04,797 epoch 100 - iter 77/111 - loss 0.91667200 - samples/sec: 
24.76 - lr: 0.025000 2021-01-25 13:12:12,584 epoch 100 - iter 88/111 - loss 0.93384858 - samples/sec: 22.61 - lr: 0.025000 2021-01-25 13:12:18,552 epoch 100 - iter 99/111 - loss 0.91917092 - samples/sec: 29.50 - lr: 0.025000 2021-01-25 13:12:24,661 epoch 100 - iter 110/111 - loss 0.90454569 - samples/sec: 28.82 - lr: 0.025000 2021-01-25 13:12:24,751 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:12:24,753 EPOCH 100 done: loss 0.8978 - lr 0.0250000 2021-01-25 13:12:24,754 BAD EPOCHS (no improvement): 1 2021-01-25 13:13:01,082 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:13:06,431 epoch 101 - iter 11/111 - loss 0.72068848 - samples/sec: 33.08 - lr: 0.025000 2021-01-25 13:13:13,113 epoch 101 - iter 22/111 - loss 0.76602584 - samples/sec: 26.36 - lr: 0.025000 2021-01-25 13:13:20,337 epoch 101 - iter 33/111 - loss 0.78204447 - samples/sec: 24.37 - lr: 0.025000 2021-01-25 13:13:27,132 epoch 101 - iter 44/111 - loss 0.84667853 - samples/sec: 25.91 - lr: 0.025000 2021-01-25 13:13:35,455 epoch 101 - iter 55/111 - loss 0.87525084 - samples/sec: 21.15 - lr: 0.025000 2021-01-25 13:13:42,088 epoch 101 - iter 66/111 - loss 0.92288906 - samples/sec: 26.55 - lr: 0.025000 2021-01-25 13:13:49,006 epoch 101 - iter 77/111 - loss 0.87778378 - samples/sec: 25.45 - lr: 0.025000 2021-01-25 13:13:55,163 epoch 101 - iter 88/111 - loss 0.87100984 - samples/sec: 28.60 - lr: 0.025000 2021-01-25 13:14:00,122 epoch 101 - iter 99/111 - loss 0.84793395 - samples/sec: 35.51 - lr: 0.025000 2021-01-25 13:14:05,896 epoch 101 - iter 110/111 - loss 0.83776233 - samples/sec: 30.49 - lr: 0.025000 2021-01-25 13:14:06,030 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:14:06,032 EPOCH 101 done: loss 0.8584 - lr 0.0250000 2021-01-25 
13:14:06,034 BAD EPOCHS (no improvement): 2 2021-01-25 13:14:41,521 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:14:47,136 epoch 102 - iter 11/111 - loss 0.73878605 - samples/sec: 31.50 - lr: 0.025000 2021-01-25 13:14:52,978 epoch 102 - iter 22/111 - loss 0.79044485 - samples/sec: 30.14 - lr: 0.025000 2021-01-25 13:14:59,353 epoch 102 - iter 33/111 - loss 0.76262681 - samples/sec: 27.62 - lr: 0.025000 2021-01-25 13:15:06,989 epoch 102 - iter 44/111 - loss 0.79057171 - samples/sec: 23.06 - lr: 0.025000 2021-01-25 13:15:13,764 epoch 102 - iter 55/111 - loss 0.77894565 - samples/sec: 25.98 - lr: 0.025000 2021-01-25 13:15:19,236 epoch 102 - iter 66/111 - loss 0.78738895 - samples/sec: 32.18 - lr: 0.025000 2021-01-25 13:15:26,887 epoch 102 - iter 77/111 - loss 0.78415570 - samples/sec: 23.01 - lr: 0.025000 2021-01-25 13:15:34,970 epoch 102 - iter 88/111 - loss 0.77433868 - samples/sec: 21.78 - lr: 0.025000 2021-01-25 13:15:41,058 epoch 102 - iter 99/111 - loss 0.79454992 - samples/sec: 28.94 - lr: 0.025000 2021-01-25 13:15:46,522 epoch 102 - iter 110/111 - loss 0.78881507 - samples/sec: 32.22 - lr: 0.025000 2021-01-25 13:15:46,820 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:15:46,821 EPOCH 102 done: loss 0.8152 - lr 0.0250000 2021-01-25 13:15:46,823 BAD EPOCHS (no improvement): 3 2021-01-25 13:16:19,467 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:16:25,278 epoch 103 - iter 11/111 - loss 0.74123833 - samples/sec: 30.44 - lr: 0.025000 2021-01-25 13:16:32,044 epoch 103 - iter 22/111 - loss 0.69190160 - samples/sec: 26.02 - lr: 0.025000 2021-01-25 13:16:38,365 epoch 103 - iter 33/111 - loss 0.72119454 - samples/sec: 27.85 - lr: 0.025000 
2021-01-25 13:16:45,331 epoch 103 - iter 44/111 - loss 0.72245607 - samples/sec: 25.28 - lr: 0.025000 2021-01-25 13:16:51,169 epoch 103 - iter 55/111 - loss 0.74065522 - samples/sec: 30.16 - lr: 0.025000 2021-01-25 13:16:59,692 epoch 103 - iter 66/111 - loss 0.76567466 - samples/sec: 20.66 - lr: 0.025000 2021-01-25 13:17:06,089 epoch 103 - iter 77/111 - loss 0.77390303 - samples/sec: 27.52 - lr: 0.025000 2021-01-25 13:17:13,786 epoch 103 - iter 88/111 - loss 0.78259146 - samples/sec: 22.87 - lr: 0.025000 2021-01-25 13:17:18,932 epoch 103 - iter 99/111 - loss 0.77794933 - samples/sec: 34.22 - lr: 0.025000 2021-01-25 13:17:25,464 epoch 103 - iter 110/111 - loss 0.78266095 - samples/sec: 26.96 - lr: 0.025000 2021-01-25 13:17:25,570 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:17:25,571 EPOCH 103 done: loss 0.7756 - lr 0.0250000 2021-01-25 13:17:25,573 BAD EPOCHS (no improvement): 0 2021-01-25 13:18:05,856 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:18:12,089 epoch 104 - iter 11/111 - loss 0.84889564 - samples/sec: 28.37 - lr: 0.025000 2021-01-25 13:18:18,905 epoch 104 - iter 22/111 - loss 0.88635410 - samples/sec: 25.83 - lr: 0.025000 2021-01-25 13:18:24,719 epoch 104 - iter 33/111 - loss 0.91701587 - samples/sec: 30.28 - lr: 0.025000 2021-01-25 13:18:32,156 epoch 104 - iter 44/111 - loss 0.87236986 - samples/sec: 23.67 - lr: 0.025000 2021-01-25 13:18:38,553 epoch 104 - iter 55/111 - loss 0.84002334 - samples/sec: 27.52 - lr: 0.025000 2021-01-25 13:18:46,313 epoch 104 - iter 66/111 - loss 0.83812284 - samples/sec: 22.69 - lr: 0.025000 2021-01-25 13:18:53,078 epoch 104 - iter 77/111 - loss 0.82564598 - samples/sec: 26.03 - lr: 0.025000 2021-01-25 13:19:00,228 epoch 104 - iter 88/111 - loss 0.81723271 - samples/sec: 24.62 - lr: 0.025000 2021-01-25 13:19:05,094 epoch 
104 - iter 99/111 - loss 0.80680975 - samples/sec: 36.19 - lr: 0.025000 2021-01-25 13:19:10,903 epoch 104 - iter 110/111 - loss 0.80274126 - samples/sec: 30.31 - lr: 0.025000 2021-01-25 13:19:10,982 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:19:10,984 EPOCH 104 done: loss 0.8121 - lr 0.0250000 2021-01-25 13:19:10,985 BAD EPOCHS (no improvement): 1 2021-01-25 13:19:46,582 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:19:52,522 epoch 105 - iter 11/111 - loss 0.75238445 - samples/sec: 29.78 - lr: 0.025000 2021-01-25 13:19:58,366 epoch 105 - iter 22/111 - loss 0.71207552 - samples/sec: 30.13 - lr: 0.025000 2021-01-25 13:20:04,656 epoch 105 - iter 33/111 - loss 0.68897040 - samples/sec: 27.99 - lr: 0.025000 2021-01-25 13:20:12,444 epoch 105 - iter 44/111 - loss 0.70233877 - samples/sec: 22.61 - lr: 0.025000 2021-01-25 13:20:18,361 epoch 105 - iter 55/111 - loss 0.70961836 - samples/sec: 29.76 - lr: 0.025000 2021-01-25 13:20:24,905 epoch 105 - iter 66/111 - loss 0.70972574 - samples/sec: 26.91 - lr: 0.025000 2021-01-25 13:20:31,995 epoch 105 - iter 77/111 - loss 0.72542387 - samples/sec: 24.83 - lr: 0.025000 2021-01-25 13:20:39,467 epoch 105 - iter 88/111 - loss 0.76028742 - samples/sec: 23.56 - lr: 0.025000 2021-01-25 13:20:45,538 epoch 105 - iter 99/111 - loss 0.75797118 - samples/sec: 29.00 - lr: 0.025000 2021-01-25 13:20:51,481 epoch 105 - iter 110/111 - loss 0.74994436 - samples/sec: 29.63 - lr: 0.025000 2021-01-25 13:20:51,626 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:20:51,627 EPOCH 105 done: loss 0.7444 - lr 0.0250000 2021-01-25 13:20:51,628 BAD EPOCHS (no improvement): 0 2021-01-25 13:21:27,334 
---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:21:32,837 epoch 106 - iter 11/111 - loss 0.64356654 - samples/sec: 32.16 - lr: 0.025000 2021-01-25 13:21:38,720 epoch 106 - iter 22/111 - loss 0.68415704 - samples/sec: 29.93 - lr: 0.025000 2021-01-25 13:21:45,922 epoch 106 - iter 33/111 - loss 0.69880616 - samples/sec: 24.45 - lr: 0.025000 2021-01-25 13:21:52,509 epoch 106 - iter 44/111 - loss 0.70173935 - samples/sec: 26.74 - lr: 0.025000 2021-01-25 13:22:00,333 epoch 106 - iter 55/111 - loss 0.70095117 - samples/sec: 22.50 - lr: 0.025000 2021-01-25 13:22:07,637 epoch 106 - iter 66/111 - loss 0.72124311 - samples/sec: 24.10 - lr: 0.025000 2021-01-25 13:22:14,853 epoch 106 - iter 77/111 - loss 0.72070282 - samples/sec: 24.40 - lr: 0.025000 2021-01-25 13:22:20,696 epoch 106 - iter 88/111 - loss 0.71518310 - samples/sec: 30.13 - lr: 0.025000 2021-01-25 13:22:26,923 epoch 106 - iter 99/111 - loss 0.73347111 - samples/sec: 28.27 - lr: 0.025000 2021-01-25 13:22:32,228 epoch 106 - iter 110/111 - loss 0.72210925 - samples/sec: 33.19 - lr: 0.025000 2021-01-25 13:22:32,290 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:22:32,291 EPOCH 106 done: loss 0.7157 - lr 0.0250000 2021-01-25 13:22:32,294 BAD EPOCHS (no improvement): 0 2021-01-25 13:23:11,705 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:23:16,568 epoch 107 - iter 11/111 - loss 0.78385908 - samples/sec: 36.41 - lr: 0.025000 2021-01-25 13:23:23,654 epoch 107 - iter 22/111 - loss 0.80658507 - samples/sec: 24.85 - lr: 0.025000 2021-01-25 13:23:30,244 epoch 107 - iter 33/111 - loss 0.75227328 - samples/sec: 26.72 - lr: 0.025000 2021-01-25 13:23:37,904 epoch 107 - iter 44/111 - loss 0.73590938 - 
samples/sec: 22.99 - lr: 0.025000 2021-01-25 13:23:44,051 epoch 107 - iter 55/111 - loss 0.70672992 - samples/sec: 28.64 - lr: 0.025000 2021-01-25 13:23:51,633 epoch 107 - iter 66/111 - loss 0.71791957 - samples/sec: 23.22 - lr: 0.025000 2021-01-25 13:23:57,650 epoch 107 - iter 77/111 - loss 0.72108388 - samples/sec: 29.26 - lr: 0.025000 2021-01-25 13:24:03,052 epoch 107 - iter 88/111 - loss 0.73094791 - samples/sec: 32.59 - lr: 0.025000 2021-01-25 13:24:08,745 epoch 107 - iter 99/111 - loss 0.76851634 - samples/sec: 30.93 - lr: 0.025000 2021-01-25 13:24:14,615 epoch 107 - iter 110/111 - loss 0.78710466 - samples/sec: 30.00 - lr: 0.025000 2021-01-25 13:24:14,702 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:24:14,703 EPOCH 107 done: loss 0.7800 - lr 0.0250000 2021-01-25 13:24:14,704 BAD EPOCHS (no improvement): 1 2021-01-25 13:24:46,476 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:24:52,133 epoch 108 - iter 11/111 - loss 0.80362939 - samples/sec: 31.28 - lr: 0.025000 2021-01-25 13:24:58,875 epoch 108 - iter 22/111 - loss 0.80106070 - samples/sec: 26.11 - lr: 0.025000 2021-01-25 13:25:06,231 epoch 108 - iter 33/111 - loss 0.74569149 - samples/sec: 23.94 - lr: 0.025000 2021-01-25 13:25:12,633 epoch 108 - iter 44/111 - loss 0.75380532 - samples/sec: 27.50 - lr: 0.025000 2021-01-25 13:25:19,871 epoch 108 - iter 55/111 - loss 0.76033305 - samples/sec: 24.33 - lr: 0.025000 2021-01-25 13:25:25,753 epoch 108 - iter 66/111 - loss 0.75026805 - samples/sec: 29.93 - lr: 0.025000 2021-01-25 13:25:32,584 epoch 108 - iter 77/111 - loss 0.74224959 - samples/sec: 25.77 - lr: 0.025000 2021-01-25 13:25:39,388 epoch 108 - iter 88/111 - loss 0.75863961 - samples/sec: 25.88 - lr: 0.025000 2021-01-25 13:25:45,013 epoch 108 - iter 99/111 - loss 0.75311675 - samples/sec: 31.30 - lr: 
0.025000 2021-01-25 13:25:51,156 epoch 108 - iter 110/111 - loss 0.76198733 - samples/sec: 28.66 - lr: 0.025000 2021-01-25 13:25:51,227 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:25:51,228 EPOCH 108 done: loss 0.7556 - lr 0.0250000 2021-01-25 13:25:51,230 BAD EPOCHS (no improvement): 2 2021-01-25 13:26:24,772 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:26:30,196 epoch 109 - iter 11/111 - loss 0.67946928 - samples/sec: 32.63 - lr: 0.025000 2021-01-25 13:26:36,569 epoch 109 - iter 22/111 - loss 0.63537086 - samples/sec: 27.62 - lr: 0.025000 2021-01-25 13:26:44,345 epoch 109 - iter 33/111 - loss 0.67198067 - samples/sec: 22.64 - lr: 0.025000 2021-01-25 13:26:50,602 epoch 109 - iter 44/111 - loss 0.66798389 - samples/sec: 28.14 - lr: 0.025000 2021-01-25 13:26:57,754 epoch 109 - iter 55/111 - loss 0.66822040 - samples/sec: 24.62 - lr: 0.025000 2021-01-25 13:27:04,891 epoch 109 - iter 66/111 - loss 0.70470827 - samples/sec: 24.67 - lr: 0.025000 2021-01-25 13:27:11,590 epoch 109 - iter 77/111 - loss 0.70594568 - samples/sec: 26.29 - lr: 0.025000 2021-01-25 13:27:16,862 epoch 109 - iter 88/111 - loss 0.71501861 - samples/sec: 33.40 - lr: 0.025000 2021-01-25 13:27:23,079 epoch 109 - iter 99/111 - loss 0.70206023 - samples/sec: 28.32 - lr: 0.025000 2021-01-25 13:27:28,714 epoch 109 - iter 110/111 - loss 0.70352085 - samples/sec: 31.25 - lr: 0.025000 2021-01-25 13:27:28,904 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:27:28,906 EPOCH 109 done: loss 0.7028 - lr 0.0250000 2021-01-25 13:27:28,907 BAD EPOCHS (no improvement): 0 2021-01-25 13:28:00,866 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting 
embeddings 2021-01-25 13:28:06,656 epoch 110 - iter 11/111 - loss 0.71369973 - samples/sec: 30.57 - lr: 0.025000 2021-01-25 13:28:13,156 epoch 110 - iter 22/111 - loss 0.66351167 - samples/sec: 27.08 - lr: 0.025000 2021-01-25 13:28:20,549 epoch 110 - iter 33/111 - loss 0.81643700 - samples/sec: 23.82 - lr: 0.025000 2021-01-25 13:28:27,384 epoch 110 - iter 44/111 - loss 0.74415305 - samples/sec: 25.77 - lr: 0.025000 2021-01-25 13:28:33,510 epoch 110 - iter 55/111 - loss 0.67818336 - samples/sec: 28.74 - lr: 0.025000 2021-01-25 13:28:39,740 epoch 110 - iter 66/111 - loss 0.66982269 - samples/sec: 28.26 - lr: 0.025000 2021-01-25 13:28:47,270 epoch 110 - iter 77/111 - loss 0.68938970 - samples/sec: 23.38 - lr: 0.025000 2021-01-25 13:28:53,242 epoch 110 - iter 88/111 - loss 0.69433806 - samples/sec: 29.49 - lr: 0.025000 2021-01-25 13:28:59,377 epoch 110 - iter 99/111 - loss 0.70094008 - samples/sec: 28.70 - lr: 0.025000 2021-01-25 13:29:04,788 epoch 110 - iter 110/111 - loss 0.69715567 - samples/sec: 32.54 - lr: 0.025000 2021-01-25 13:29:04,865 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:29:04,867 EPOCH 110 done: loss 0.6911 - lr 0.0250000 2021-01-25 13:29:04,868 BAD EPOCHS (no improvement): 0 2021-01-25 13:29:45,483 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:29:52,653 epoch 111 - iter 11/111 - loss 0.63257407 - samples/sec: 24.64 - lr: 0.025000 2021-01-25 13:29:58,844 epoch 111 - iter 22/111 - loss 0.65866832 - samples/sec: 28.44 - lr: 0.025000 2021-01-25 13:30:05,551 epoch 111 - iter 33/111 - loss 0.69570449 - samples/sec: 26.25 - lr: 0.025000 2021-01-25 13:30:12,712 epoch 111 - iter 44/111 - loss 0.70284502 - samples/sec: 24.58 - lr: 0.025000 2021-01-25 13:30:18,503 epoch 111 - iter 55/111 - loss 0.67599891 - samples/sec: 30.40 - lr: 0.025000 2021-01-25 
13:30:25,430 epoch 111 - iter 66/111 - loss 0.70085383 - samples/sec: 25.42 - lr: 0.025000 2021-01-25 13:30:33,174 epoch 111 - iter 77/111 - loss 0.69126879 - samples/sec: 22.74 - lr: 0.025000 2021-01-25 13:30:38,533 epoch 111 - iter 88/111 - loss 0.71777350 - samples/sec: 32.86 - lr: 0.025000 2021-01-25 13:30:44,432 epoch 111 - iter 99/111 - loss 0.71716955 - samples/sec: 29.84 - lr: 0.025000 2021-01-25 13:30:50,481 epoch 111 - iter 110/111 - loss 0.71869382 - samples/sec: 29.11 - lr: 0.025000 2021-01-25 13:30:50,570 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:30:50,572 EPOCH 111 done: loss 0.7126 - lr 0.0250000 2021-01-25 13:30:50,573 BAD EPOCHS (no improvement): 1 2021-01-25 13:31:29,114 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:31:36,137 epoch 112 - iter 11/111 - loss 0.75990707 - samples/sec: 25.17 - lr: 0.025000 2021-01-25 13:31:42,555 epoch 112 - iter 22/111 - loss 0.65124447 - samples/sec: 27.43 - lr: 0.025000 2021-01-25 13:31:48,907 epoch 112 - iter 33/111 - loss 0.67350851 - samples/sec: 27.72 - lr: 0.025000 2021-01-25 13:31:55,671 epoch 112 - iter 44/111 - loss 0.72417201 - samples/sec: 26.03 - lr: 0.025000 2021-01-25 13:32:01,546 epoch 112 - iter 55/111 - loss 0.70097128 - samples/sec: 29.97 - lr: 0.025000 2021-01-25 13:32:08,411 epoch 112 - iter 66/111 - loss 0.71806321 - samples/sec: 25.65 - lr: 0.025000 2021-01-25 13:32:15,532 epoch 112 - iter 77/111 - loss 0.72194873 - samples/sec: 24.73 - lr: 0.025000 2021-01-25 13:32:21,571 epoch 112 - iter 88/111 - loss 0.73156702 - samples/sec: 29.16 - lr: 0.025000 2021-01-25 13:32:26,789 epoch 112 - iter 99/111 - loss 0.70580896 - samples/sec: 33.74 - lr: 0.025000 2021-01-25 13:32:33,077 epoch 112 - iter 110/111 - loss 0.72110457 - samples/sec: 28.00 - lr: 0.025000 2021-01-25 13:32:33,176 
---------------------------------------------------------------------------------------------------- 2021-01-25 13:32:33,177 EPOCH 112 done: loss 0.7150 - lr 0.0250000 2021-01-25 13:32:33,178 BAD EPOCHS (no improvement): 2 2021-01-25 13:33:05,430 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:33:11,898 epoch 113 - iter 11/111 - loss 0.82508569 - samples/sec: 27.33 - lr: 0.025000 2021-01-25 13:33:17,996 epoch 113 - iter 22/111 - loss 0.77450722 - samples/sec: 28.88 - lr: 0.025000 2021-01-25 13:33:24,907 epoch 113 - iter 33/111 - loss 0.73547285 - samples/sec: 25.47 - lr: 0.025000 2021-01-25 13:33:30,943 epoch 113 - iter 44/111 - loss 0.69540778 - samples/sec: 29.17 - lr: 0.025000 2021-01-25 13:33:37,668 epoch 113 - iter 55/111 - loss 0.71267423 - samples/sec: 26.19 - lr: 0.025000 2021-01-25 13:33:43,521 epoch 113 - iter 66/111 - loss 0.70831058 - samples/sec: 30.09 - lr: 0.025000 2021-01-25 13:33:50,909 epoch 113 - iter 77/111 - loss 0.71425138 - samples/sec: 23.83 - lr: 0.025000 2021-01-25 13:33:57,071 epoch 113 - iter 88/111 - loss 0.70778101 - samples/sec: 28.57 - lr: 0.025000 2021-01-25 13:34:03,304 epoch 113 - iter 99/111 - loss 0.72269965 - samples/sec: 28.25 - lr: 0.025000 2021-01-25 13:34:08,905 epoch 113 - iter 110/111 - loss 0.72732831 - samples/sec: 31.43 - lr: 0.025000 2021-01-25 13:34:09,027 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:34:09,028 EPOCH 113 done: loss 0.7226 - lr 0.0250000 2021-01-25 13:34:09,030 BAD EPOCHS (no improvement): 3 2021-01-25 13:34:41,280 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:34:47,436 epoch 114 - iter 11/111 - loss 0.56088674 - samples/sec: 28.73 - lr: 0.025000 2021-01-25 13:34:55,244 
epoch 114 - iter 22/111 - loss 0.60757464 - samples/sec: 22.55 - lr: 0.025000 2021-01-25 13:35:02,043 epoch 114 - iter 33/111 - loss 0.68038769 - samples/sec: 25.90 - lr: 0.025000 2021-01-25 13:35:08,683 epoch 114 - iter 44/111 - loss 0.63854400 - samples/sec: 26.52 - lr: 0.025000 2021-01-25 13:35:14,493 epoch 114 - iter 55/111 - loss 0.64930110 - samples/sec: 30.30 - lr: 0.025000 2021-01-25 13:35:20,725 epoch 114 - iter 66/111 - loss 0.66459776 - samples/sec: 28.25 - lr: 0.025000 2021-01-25 13:35:27,633 epoch 114 - iter 77/111 - loss 0.66347560 - samples/sec: 25.49 - lr: 0.025000 2021-01-25 13:35:34,361 epoch 114 - iter 88/111 - loss 0.64584362 - samples/sec: 26.17 - lr: 0.025000 2021-01-25 13:35:40,596 epoch 114 - iter 99/111 - loss 0.64234639 - samples/sec: 28.24 - lr: 0.025000 2021-01-25 13:35:46,886 epoch 114 - iter 110/111 - loss 0.66038458 - samples/sec: 27.99 - lr: 0.025000 2021-01-25 13:35:46,954 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:35:46,955 EPOCH 114 done: loss 0.6551 - lr 0.0250000 2021-01-25 13:35:46,957 BAD EPOCHS (no improvement): 0 2021-01-25 13:36:21,227 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:36:26,718 epoch 115 - iter 11/111 - loss 0.56298585 - samples/sec: 32.22 - lr: 0.025000 2021-01-25 13:36:32,987 epoch 115 - iter 22/111 - loss 0.64012238 - samples/sec: 28.09 - lr: 0.025000 2021-01-25 13:36:40,541 epoch 115 - iter 33/111 - loss 0.61518407 - samples/sec: 23.31 - lr: 0.025000 2021-01-25 13:36:47,604 epoch 115 - iter 44/111 - loss 0.63969870 - samples/sec: 24.93 - lr: 0.025000 2021-01-25 13:36:54,755 epoch 115 - iter 55/111 - loss 0.65050368 - samples/sec: 24.62 - lr: 0.025000 2021-01-25 13:37:01,124 epoch 115 - iter 66/111 - loss 0.64726682 - samples/sec: 27.64 - lr: 0.025000 2021-01-25 13:37:07,380 epoch 115 - iter 77/111 - loss 
0.66146871 - samples/sec: 28.15 - lr: 0.025000 2021-01-25 13:37:13,862 epoch 115 - iter 88/111 - loss 0.68050893 - samples/sec: 27.16 - lr: 0.025000 2021-01-25 13:37:19,189 epoch 115 - iter 99/111 - loss 0.68166747 - samples/sec: 33.06 - lr: 0.025000 2021-01-25 13:37:24,692 epoch 115 - iter 110/111 - loss 0.68001638 - samples/sec: 32.00 - lr: 0.025000 2021-01-25 13:37:24,858 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:37:24,859 EPOCH 115 done: loss 0.6765 - lr 0.0250000 2021-01-25 13:37:24,860 BAD EPOCHS (no improvement): 1 2021-01-25 13:38:03,000 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:38:09,511 epoch 116 - iter 11/111 - loss 0.72998188 - samples/sec: 27.15 - lr: 0.025000 2021-01-25 13:38:16,079 epoch 116 - iter 22/111 - loss 0.72190877 - samples/sec: 26.81 - lr: 0.025000 2021-01-25 13:38:22,810 epoch 116 - iter 33/111 - loss 0.68275763 - samples/sec: 26.17 - lr: 0.025000 2021-01-25 13:38:29,623 epoch 116 - iter 44/111 - loss 0.69687710 - samples/sec: 25.84 - lr: 0.025000 2021-01-25 13:38:36,056 epoch 116 - iter 55/111 - loss 0.71078484 - samples/sec: 27.37 - lr: 0.025000 2021-01-25 13:38:42,246 epoch 116 - iter 66/111 - loss 0.70068730 - samples/sec: 28.45 - lr: 0.025000 2021-01-25 13:38:48,910 epoch 116 - iter 77/111 - loss 0.71536316 - samples/sec: 26.42 - lr: 0.025000 2021-01-25 13:38:54,853 epoch 116 - iter 88/111 - loss 0.71234254 - samples/sec: 29.62 - lr: 0.025000 2021-01-25 13:39:01,133 epoch 116 - iter 99/111 - loss 0.70125230 - samples/sec: 28.04 - lr: 0.025000 2021-01-25 13:39:06,795 epoch 116 - iter 110/111 - loss 0.70607526 - samples/sec: 31.10 - lr: 0.025000 2021-01-25 13:39:06,954 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:39:06,955 EPOCH 116 done: loss 0.7094 - lr 
0.0250000 2021-01-25 13:39:06,957 BAD EPOCHS (no improvement): 2 2021-01-25 13:39:47,840 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:39:53,677 epoch 117 - iter 11/111 - loss 0.85482086 - samples/sec: 30.30 - lr: 0.025000 2021-01-25 13:40:00,004 epoch 117 - iter 22/111 - loss 0.70978049 - samples/sec: 27.83 - lr: 0.025000 2021-01-25 13:40:06,202 epoch 117 - iter 33/111 - loss 0.72454160 - samples/sec: 28.41 - lr: 0.025000 2021-01-25 13:40:12,307 epoch 117 - iter 44/111 - loss 0.70538759 - samples/sec: 28.85 - lr: 0.025000 2021-01-25 13:40:18,742 epoch 117 - iter 55/111 - loss 0.68637878 - samples/sec: 27.36 - lr: 0.025000 2021-01-25 13:40:26,484 epoch 117 - iter 66/111 - loss 0.68752330 - samples/sec: 22.74 - lr: 0.025000 2021-01-25 13:40:33,338 epoch 117 - iter 77/111 - loss 0.68788088 - samples/sec: 25.69 - lr: 0.025000 2021-01-25 13:40:39,034 epoch 117 - iter 88/111 - loss 0.68027635 - samples/sec: 30.91 - lr: 0.025000 2021-01-25 13:40:45,570 epoch 117 - iter 99/111 - loss 0.66959048 - samples/sec: 26.94 - lr: 0.025000 2021-01-25 13:40:51,293 epoch 117 - iter 110/111 - loss 0.67828720 - samples/sec: 30.77 - lr: 0.025000 2021-01-25 13:40:51,415 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:40:51,416 EPOCH 117 done: loss 0.6735 - lr 0.0250000 2021-01-25 13:40:51,418 BAD EPOCHS (no improvement): 3 2021-01-25 13:41:21,981 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:41:28,035 epoch 118 - iter 11/111 - loss 0.61396825 - samples/sec: 29.21 - lr: 0.025000 2021-01-25 13:41:38,319 epoch 118 - iter 22/111 - loss 0.56253364 - samples/sec: 27.68 - lr: 0.025000 2021-01-25 13:41:44,866 epoch 118 - iter 33/111 - loss 0.62176757 - samples/sec: 
26.90 - lr: 0.025000 2021-01-25 13:41:51,473 epoch 118 - iter 44/111 - loss 0.60675451 - samples/sec: 26.65 - lr: 0.025000 2021-01-25 13:41:58,528 epoch 118 - iter 55/111 - loss 0.64548492 - samples/sec: 24.95 - lr: 0.025000 2021-01-25 13:42:04,943 epoch 118 - iter 66/111 - loss 0.64572965 - samples/sec: 27.45 - lr: 0.025000 2021-01-25 13:42:11,243 epoch 118 - iter 77/111 - loss 0.64876765 - samples/sec: 27.95 - lr: 0.025000 2021-01-25 13:42:19,232 epoch 118 - iter 88/111 - loss 0.65159514 - samples/sec: 22.03 - lr: 0.025000 2021-01-25 13:42:25,180 epoch 118 - iter 99/111 - loss 0.65371615 - samples/sec: 29.60 - lr: 0.025000 2021-01-25 13:42:30,047 epoch 118 - iter 110/111 - loss 0.65268896 - samples/sec: 36.18 - lr: 0.025000 2021-01-25 13:42:30,138 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:42:30,139 EPOCH 118 done: loss 0.6468 - lr 0.0250000 2021-01-25 13:42:30,140 BAD EPOCHS (no improvement): 0 2021-01-25 13:43:02,756 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:43:08,290 epoch 119 - iter 11/111 - loss 0.71942934 - samples/sec: 31.98 - lr: 0.025000 2021-01-25 13:43:15,259 epoch 119 - iter 22/111 - loss 0.67490881 - samples/sec: 25.26 - lr: 0.025000 2021-01-25 13:43:21,144 epoch 119 - iter 33/111 - loss 0.67634846 - samples/sec: 29.92 - lr: 0.025000 2021-01-25 13:43:27,385 epoch 119 - iter 44/111 - loss 0.65903231 - samples/sec: 28.21 - lr: 0.025000 2021-01-25 13:43:34,014 epoch 119 - iter 55/111 - loss 0.67427227 - samples/sec: 26.56 - lr: 0.025000 2021-01-25 13:43:41,119 epoch 119 - iter 66/111 - loss 0.63346933 - samples/sec: 24.78 - lr: 0.025000 2021-01-25 13:43:49,156 epoch 119 - iter 77/111 - loss 0.63288591 - samples/sec: 21.91 - lr: 0.025000 2021-01-25 13:43:55,571 epoch 119 - iter 88/111 - loss 0.64751344 - samples/sec: 27.44 - lr: 0.025000 2021-01-25 
13:44:01,338 epoch 119 - iter 99/111 - loss 0.64728304 - samples/sec: 30.53 - lr: 0.025000 2021-01-25 13:44:07,865 epoch 119 - iter 110/111 - loss 0.64798920 - samples/sec: 26.97 - lr: 0.025000 2021-01-25 13:44:08,003 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:44:08,005 EPOCH 119 done: loss 0.6583 - lr 0.0250000 2021-01-25 13:44:08,006 BAD EPOCHS (no improvement): 1 2021-01-25 13:44:40,563 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:44:46,268 epoch 120 - iter 11/111 - loss 0.70810749 - samples/sec: 31.02 - lr: 0.025000 2021-01-25 13:44:52,572 epoch 120 - iter 22/111 - loss 0.67887612 - samples/sec: 27.93 - lr: 0.025000 2021-01-25 13:44:59,280 epoch 120 - iter 33/111 - loss 0.61781476 - samples/sec: 26.25 - lr: 0.025000 2021-01-25 13:45:05,912 epoch 120 - iter 44/111 - loss 0.60051820 - samples/sec: 26.55 - lr: 0.025000 2021-01-25 13:45:12,975 epoch 120 - iter 55/111 - loss 0.64317197 - samples/sec: 24.93 - lr: 0.025000 2021-01-25 13:45:19,825 epoch 120 - iter 66/111 - loss 0.62872899 - samples/sec: 25.71 - lr: 0.025000 2021-01-25 13:45:27,856 epoch 120 - iter 77/111 - loss 0.65401933 - samples/sec: 21.92 - lr: 0.025000 2021-01-25 13:45:33,315 epoch 120 - iter 88/111 - loss 0.66591052 - samples/sec: 32.25 - lr: 0.025000 2021-01-25 13:45:39,091 epoch 120 - iter 99/111 - loss 0.66697941 - samples/sec: 30.49 - lr: 0.025000 2021-01-25 13:45:45,006 epoch 120 - iter 110/111 - loss 0.66261771 - samples/sec: 29.77 - lr: 0.025000 2021-01-25 13:45:45,075 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:45:45,076 EPOCH 120 done: loss 0.6758 - lr 0.0250000 2021-01-25 13:45:45,078 BAD EPOCHS (no improvement): 2 2021-01-25 13:46:19,536 
---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:46:25,596 epoch 121 - iter 11/111 - loss 0.58798959 - samples/sec: 29.17 - lr: 0.025000 2021-01-25 13:46:31,806 epoch 121 - iter 22/111 - loss 0.61343662 - samples/sec: 28.35 - lr: 0.025000 2021-01-25 13:46:38,477 epoch 121 - iter 33/111 - loss 0.60312356 - samples/sec: 26.39 - lr: 0.025000 2021-01-25 13:46:45,808 epoch 121 - iter 44/111 - loss 0.61619065 - samples/sec: 24.01 - lr: 0.025000 2021-01-25 13:46:53,028 epoch 121 - iter 55/111 - loss 0.60864824 - samples/sec: 24.38 - lr: 0.025000 2021-01-25 13:46:59,614 epoch 121 - iter 66/111 - loss 0.62845413 - samples/sec: 26.74 - lr: 0.025000 2021-01-25 13:47:05,938 epoch 121 - iter 77/111 - loss 0.66102760 - samples/sec: 27.84 - lr: 0.025000 2021-01-25 13:47:11,077 epoch 121 - iter 88/111 - loss 0.65753494 - samples/sec: 34.27 - lr: 0.025000 2021-01-25 13:47:17,313 epoch 121 - iter 99/111 - loss 0.64182588 - samples/sec: 28.24 - lr: 0.025000 2021-01-25 13:47:23,365 epoch 121 - iter 110/111 - loss 0.63073516 - samples/sec: 29.09 - lr: 0.025000 2021-01-25 13:47:23,553 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:47:23,554 EPOCH 121 done: loss 0.6288 - lr 0.0250000 2021-01-25 13:47:23,555 BAD EPOCHS (no improvement): 0 2021-01-25 13:48:04,973 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:48:12,117 epoch 122 - iter 11/111 - loss 0.68116287 - samples/sec: 24.74 - lr: 0.025000 2021-01-25 13:48:19,672 epoch 122 - iter 22/111 - loss 0.66497344 - samples/sec: 23.31 - lr: 0.025000 2021-01-25 13:48:26,240 epoch 122 - iter 33/111 - loss 0.64456542 - samples/sec: 26.81 - lr: 0.025000 2021-01-25 13:48:34,192 epoch 122 - iter 44/111 - loss 0.62877654 - 
samples/sec: 22.14 - lr: 0.025000 2021-01-25 13:48:42,013 epoch 122 - iter 55/111 - loss 0.62691633 - samples/sec: 22.51 - lr: 0.025000 2021-01-25 13:48:48,146 epoch 122 - iter 66/111 - loss 0.64545234 - samples/sec: 28.71 - lr: 0.025000 2021-01-25 13:48:54,541 epoch 122 - iter 77/111 - loss 0.65103555 - samples/sec: 27.53 - lr: 0.025000 2021-01-25 13:48:59,939 epoch 122 - iter 88/111 - loss 0.64988670 - samples/sec: 32.62 - lr: 0.025000 2021-01-25 13:49:05,337 epoch 122 - iter 99/111 - loss 0.64898939 - samples/sec: 32.62 - lr: 0.025000 2021-01-25 13:49:11,099 epoch 122 - iter 110/111 - loss 0.64195849 - samples/sec: 30.56 - lr: 0.025000 2021-01-25 13:49:11,231 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:49:11,232 EPOCH 122 done: loss 0.6368 - lr 0.0250000 2021-01-25 13:49:11,234 BAD EPOCHS (no improvement): 1 2021-01-25 13:49:52,399 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:49:58,079 epoch 123 - iter 11/111 - loss 0.79830904 - samples/sec: 31.15 - lr: 0.025000 2021-01-25 13:50:03,699 epoch 123 - iter 22/111 - loss 0.70631330 - samples/sec: 31.33 - lr: 0.025000 2021-01-25 13:50:09,623 epoch 123 - iter 33/111 - loss 0.67264597 - samples/sec: 29.72 - lr: 0.025000 2021-01-25 13:50:16,925 epoch 123 - iter 44/111 - loss 0.68767643 - samples/sec: 24.11 - lr: 0.025000 2021-01-25 13:50:23,832 epoch 123 - iter 55/111 - loss 0.66567398 - samples/sec: 25.49 - lr: 0.025000 2021-01-25 13:50:31,098 epoch 123 - iter 66/111 - loss 0.64204170 - samples/sec: 24.23 - lr: 0.025000 2021-01-25 13:50:38,815 epoch 123 - iter 77/111 - loss 0.62967690 - samples/sec: 22.82 - lr: 0.025000 2021-01-25 13:50:45,005 epoch 123 - iter 88/111 - loss 0.61772293 - samples/sec: 28.45 - lr: 0.025000 2021-01-25 13:50:51,261 epoch 123 - iter 99/111 - loss 0.61946298 - samples/sec: 28.14 - lr: 
0.025000 2021-01-25 13:50:57,816 epoch 123 - iter 110/111 - loss 0.64053811 - samples/sec: 26.86 - lr: 0.025000 2021-01-25 13:50:57,909 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:50:57,911 EPOCH 123 done: loss 0.6438 - lr 0.0250000 2021-01-25 13:50:57,912 BAD EPOCHS (no improvement): 2 2021-01-25 13:51:36,641 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:51:42,635 epoch 124 - iter 11/111 - loss 0.61700277 - samples/sec: 29.52 - lr: 0.025000 2021-01-25 13:51:48,377 epoch 124 - iter 22/111 - loss 0.58532127 - samples/sec: 30.66 - lr: 0.025000 2021-01-25 13:51:55,249 epoch 124 - iter 33/111 - loss 0.60009717 - samples/sec: 25.62 - lr: 0.025000 2021-01-25 13:52:03,238 epoch 124 - iter 44/111 - loss 0.57114009 - samples/sec: 22.04 - lr: 0.025000 2021-01-25 13:52:10,291 epoch 124 - iter 55/111 - loss 0.57621487 - samples/sec: 24.96 - lr: 0.025000 2021-01-25 13:52:17,507 epoch 124 - iter 66/111 - loss 0.56355515 - samples/sec: 24.39 - lr: 0.025000 2021-01-25 13:52:24,007 epoch 124 - iter 77/111 - loss 0.57433285 - samples/sec: 27.09 - lr: 0.025000 2021-01-25 13:52:30,051 epoch 124 - iter 88/111 - loss 0.58146794 - samples/sec: 29.13 - lr: 0.025000 2021-01-25 13:52:35,681 epoch 124 - iter 99/111 - loss 0.59278749 - samples/sec: 31.27 - lr: 0.025000 2021-01-25 13:52:41,883 epoch 124 - iter 110/111 - loss 0.59244975 - samples/sec: 28.39 - lr: 0.025000 2021-01-25 13:52:42,057 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:52:42,059 EPOCH 124 done: loss 0.5905 - lr 0.0250000 2021-01-25 13:52:42,061 BAD EPOCHS (no improvement): 0 2021-01-25 13:53:16,947 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting 
embeddings 2021-01-25 13:53:22,941 epoch 125 - iter 11/111 - loss 0.65947149 - samples/sec: 29.50 - lr: 0.025000 2021-01-25 13:53:29,163 epoch 125 - iter 22/111 - loss 0.59695565 - samples/sec: 28.30 - lr: 0.025000 2021-01-25 13:53:36,192 epoch 125 - iter 33/111 - loss 0.56540620 - samples/sec: 25.05 - lr: 0.025000 2021-01-25 13:53:42,347 epoch 125 - iter 44/111 - loss 0.57762860 - samples/sec: 28.61 - lr: 0.025000 2021-01-25 13:53:50,621 epoch 125 - iter 55/111 - loss 0.57077052 - samples/sec: 21.27 - lr: 0.025000 2021-01-25 13:53:56,935 epoch 125 - iter 66/111 - loss 0.57226110 - samples/sec: 27.89 - lr: 0.025000 2021-01-25 13:54:03,903 epoch 125 - iter 77/111 - loss 0.57960982 - samples/sec: 25.27 - lr: 0.025000 2021-01-25 13:54:10,143 epoch 125 - iter 88/111 - loss 0.57841610 - samples/sec: 28.21 - lr: 0.025000 2021-01-25 13:54:15,426 epoch 125 - iter 99/111 - loss 0.57296153 - samples/sec: 33.33 - lr: 0.025000 2021-01-25 13:54:20,822 epoch 125 - iter 110/111 - loss 0.56136767 - samples/sec: 32.63 - lr: 0.025000 2021-01-25 13:54:20,940 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:54:20,941 EPOCH 125 done: loss 0.5869 - lr 0.0250000 2021-01-25 13:54:20,943 BAD EPOCHS (no improvement): 0 2021-01-25 13:54:58,751 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:55:05,450 epoch 126 - iter 11/111 - loss 0.52473863 - samples/sec: 26.39 - lr: 0.025000 2021-01-25 13:55:12,206 epoch 126 - iter 22/111 - loss 0.48462694 - samples/sec: 26.06 - lr: 0.025000 2021-01-25 13:55:18,806 epoch 126 - iter 33/111 - loss 0.54067218 - samples/sec: 26.67 - lr: 0.025000 2021-01-25 13:55:26,222 epoch 126 - iter 44/111 - loss 0.59235639 - samples/sec: 23.74 - lr: 0.025000 2021-01-25 13:55:33,642 epoch 126 - iter 55/111 - loss 0.60935410 - samples/sec: 23.73 - lr: 0.025000 2021-01-25 
13:55:39,428 epoch 126 - iter 66/111 - loss 0.60552484 - samples/sec: 30.43 - lr: 0.025000 2021-01-25 13:55:46,221 epoch 126 - iter 77/111 - loss 0.64459800 - samples/sec: 25.92 - lr: 0.025000 2021-01-25 13:55:52,370 epoch 126 - iter 88/111 - loss 0.63451168 - samples/sec: 28.63 - lr: 0.025000 2021-01-25 13:55:58,621 epoch 126 - iter 99/111 - loss 0.64048163 - samples/sec: 28.16 - lr: 0.025000 2021-01-25 13:56:04,441 epoch 126 - iter 110/111 - loss 0.65040921 - samples/sec: 30.25 - lr: 0.025000 2021-01-25 13:56:04,580 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:56:04,581 EPOCH 126 done: loss 0.6447 - lr 0.0250000 2021-01-25 13:56:04,585 BAD EPOCHS (no improvement): 1 2021-01-25 13:56:44,261 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:56:50,195 epoch 127 - iter 11/111 - loss 0.74406024 - samples/sec: 29.81 - lr: 0.025000 2021-01-25 13:56:56,375 epoch 127 - iter 22/111 - loss 0.70273987 - samples/sec: 28.49 - lr: 0.025000 2021-01-25 13:57:03,179 epoch 127 - iter 33/111 - loss 0.64999823 - samples/sec: 25.88 - lr: 0.025000 2021-01-25 13:57:10,717 epoch 127 - iter 44/111 - loss 0.63081680 - samples/sec: 23.35 - lr: 0.025000 2021-01-25 13:57:17,354 epoch 127 - iter 55/111 - loss 0.62215556 - samples/sec: 26.53 - lr: 0.025000 2021-01-25 13:57:23,594 epoch 127 - iter 66/111 - loss 0.61392723 - samples/sec: 28.22 - lr: 0.025000 2021-01-25 13:57:32,861 epoch 127 - iter 77/111 - loss 0.62140490 - samples/sec: 19.00 - lr: 0.025000 2021-01-25 13:57:39,075 epoch 127 - iter 88/111 - loss 0.63513454 - samples/sec: 28.33 - lr: 0.025000 2021-01-25 13:57:44,465 epoch 127 - iter 99/111 - loss 0.62767774 - samples/sec: 32.67 - lr: 0.025000 2021-01-25 13:57:50,622 epoch 127 - iter 110/111 - loss 0.62037031 - samples/sec: 28.59 - lr: 0.025000 2021-01-25 13:57:50,701 
---------------------------------------------------------------------------------------------------- 2021-01-25 13:57:50,702 EPOCH 127 done: loss 0.6150 - lr 0.0250000 2021-01-25 13:57:50,704 BAD EPOCHS (no improvement): 2 2021-01-25 13:58:24,596 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 13:58:30,635 epoch 128 - iter 11/111 - loss 0.74410861 - samples/sec: 29.28 - lr: 0.025000 2021-01-25 13:58:37,607 epoch 128 - iter 22/111 - loss 0.65902320 - samples/sec: 25.25 - lr: 0.025000 2021-01-25 13:58:44,143 epoch 128 - iter 33/111 - loss 0.58010160 - samples/sec: 26.94 - lr: 0.025000 2021-01-25 13:58:51,939 epoch 128 - iter 44/111 - loss 0.61621079 - samples/sec: 22.58 - lr: 0.025000 2021-01-25 13:58:58,679 epoch 128 - iter 55/111 - loss 0.61133996 - samples/sec: 26.12 - lr: 0.025000 2021-01-25 13:59:05,882 epoch 128 - iter 66/111 - loss 0.62836178 - samples/sec: 24.44 - lr: 0.025000 2021-01-25 13:59:11,684 epoch 128 - iter 77/111 - loss 0.60741993 - samples/sec: 30.35 - lr: 0.025000 2021-01-25 13:59:19,197 epoch 128 - iter 88/111 - loss 0.61006476 - samples/sec: 23.44 - lr: 0.025000 2021-01-25 13:59:24,615 epoch 128 - iter 99/111 - loss 0.59143774 - samples/sec: 32.50 - lr: 0.025000 2021-01-25 13:59:30,420 epoch 128 - iter 110/111 - loss 0.59331026 - samples/sec: 30.33 - lr: 0.025000 2021-01-25 13:59:30,504 ---------------------------------------------------------------------------------------------------- 2021-01-25 13:59:30,505 EPOCH 128 done: loss 0.5880 - lr 0.0250000 2021-01-25 13:59:30,506 BAD EPOCHS (no improvement): 3 2021-01-25 14:00:04,279 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 14:00:10,829 epoch 129 - iter 11/111 - loss 0.68741938 - samples/sec: 26.99 - lr: 0.025000 2021-01-25 14:00:17,480 
epoch 129 - iter 22/111 - loss 0.63012544 - samples/sec: 26.47 - lr: 0.025000 2021-01-25 14:00:24,022 epoch 129 - iter 33/111 - loss 0.58932896 - samples/sec: 26.91 - lr: 0.025000 2021-01-25 14:00:30,731 epoch 129 - iter 44/111 - loss 0.56153652 - samples/sec: 26.24 - lr: 0.025000 2021-01-25 14:00:37,726 epoch 129 - iter 55/111 - loss 0.58815567 - samples/sec: 25.18 - lr: 0.025000 2021-01-25 14:00:44,698 epoch 129 - iter 66/111 - loss 0.57535938 - samples/sec: 25.26 - lr: 0.025000 2021-01-25 14:00:51,295 epoch 129 - iter 77/111 - loss 0.58570516 - samples/sec: 26.69 - lr: 0.025000 2021-01-25 14:00:57,578 epoch 129 - iter 88/111 - loss 0.56886957 - samples/sec: 28.02 - lr: 0.025000 2021-01-25 14:01:03,354 epoch 129 - iter 99/111 - loss 0.58495755 - samples/sec: 30.48 - lr: 0.025000 2021-01-25 14:01:09,593 epoch 129 - iter 110/111 - loss 0.60014742 - samples/sec: 28.22 - lr: 0.025000 2021-01-25 14:01:09,677 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:01:09,678 EPOCH 129 done: loss 0.5950 - lr 0.0250000 Epoch 129: reducing learning rate of group 0 to 1.2500e-02. 
2021-01-25 14:01:09,680 BAD EPOCHS (no improvement): 4 2021-01-25 14:01:47,313 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 14:01:54,386 epoch 130 - iter 11/111 - loss 0.61598990 - samples/sec: 25.00 - lr: 0.012500 2021-01-25 14:02:01,784 epoch 130 - iter 22/111 - loss 0.60945936 - samples/sec: 23.80 - lr: 0.012500 2021-01-25 14:02:09,131 epoch 130 - iter 33/111 - loss 0.66073855 - samples/sec: 23.96 - lr: 0.012500 2021-01-25 14:02:15,756 epoch 130 - iter 44/111 - loss 0.60138401 - samples/sec: 26.58 - lr: 0.012500 2021-01-25 14:02:21,795 epoch 130 - iter 55/111 - loss 0.58092661 - samples/sec: 29.16 - lr: 0.012500 2021-01-25 14:02:28,127 epoch 130 - iter 66/111 - loss 0.57135477 - samples/sec: 27.81 - lr: 0.012500 2021-01-25 14:02:33,990 epoch 130 - iter 77/111 - loss 0.55182214 - samples/sec: 30.03 - lr: 0.012500 2021-01-25 14:02:40,759 epoch 130 - iter 88/111 - loss 0.56060971 - samples/sec: 26.01 - lr: 0.012500 2021-01-25 14:02:46,636 epoch 130 - iter 99/111 - loss 0.55722380 - samples/sec: 29.96 - lr: 0.012500 2021-01-25 14:02:52,345 epoch 130 - iter 110/111 - loss 0.54765077 - samples/sec: 30.85 - lr: 0.012500 2021-01-25 14:02:52,438 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:02:52,439 EPOCH 130 done: loss 0.5436 - lr 0.0125000 2021-01-25 14:02:52,440 BAD EPOCHS (no improvement): 0 2021-01-25 14:03:32,025 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 14:03:38,321 epoch 131 - iter 11/111 - loss 0.58030411 - samples/sec: 28.09 - lr: 0.012500 2021-01-25 14:03:43,970 epoch 131 - iter 22/111 - loss 0.51831507 - samples/sec: 31.17 - lr: 0.012500 2021-01-25 14:03:50,744 epoch 131 - iter 33/111 - loss 0.54572397 - samples/sec: 25.99 - lr: 
0.012500 2021-01-25 14:03:57,517 epoch 131 - iter 44/111 - loss 0.53979178 - samples/sec: 26.00 - lr: 0.012500 2021-01-25 14:04:03,978 epoch 131 - iter 55/111 - loss 0.55793165 - samples/sec: 27.25 - lr: 0.012500 2021-01-25 14:04:10,621 epoch 131 - iter 66/111 - loss 0.54352260 - samples/sec: 26.51 - lr: 0.012500 2021-01-25 14:04:17,454 epoch 131 - iter 77/111 - loss 0.56269143 - samples/sec: 25.77 - lr: 0.012500 2021-01-25 14:04:24,515 epoch 131 - iter 88/111 - loss 0.57292043 - samples/sec: 24.93 - lr: 0.012500 2021-01-25 14:04:30,847 epoch 131 - iter 99/111 - loss 0.57006812 - samples/sec: 27.80 - lr: 0.012500 2021-01-25 14:04:35,734 epoch 131 - iter 110/111 - loss 0.57321482 - samples/sec: 36.03 - lr: 0.012500 2021-01-25 14:04:35,773 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:04:35,775 EPOCH 131 done: loss 0.5681 - lr 0.0125000 2021-01-25 14:04:35,776 BAD EPOCHS (no improvement): 1 2021-01-25 14:05:07,147 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 14:05:13,620 epoch 132 - iter 11/111 - loss 0.52959698 - samples/sec: 27.31 - lr: 0.012500 2021-01-25 14:05:23,530 epoch 132 - iter 22/111 - loss 0.52519627 - samples/sec: 27.80 - lr: 0.012500 2021-01-25 14:05:29,950 epoch 132 - iter 33/111 - loss 0.56608403 - samples/sec: 27.42 - lr: 0.012500 2021-01-25 14:05:37,127 epoch 132 - iter 44/111 - loss 0.55661541 - samples/sec: 24.53 - lr: 0.012500 2021-01-25 14:05:43,521 epoch 132 - iter 55/111 - loss 0.55378752 - samples/sec: 27.54 - lr: 0.012500 2021-01-25 14:05:50,414 epoch 132 - iter 66/111 - loss 0.56385819 - samples/sec: 25.54 - lr: 0.012500 2021-01-25 14:05:56,963 epoch 132 - iter 77/111 - loss 0.54243020 - samples/sec: 26.88 - lr: 0.012500 2021-01-25 14:06:03,652 epoch 132 - iter 88/111 - loss 0.54372444 - samples/sec: 26.32 - lr: 0.012500 2021-01-25 14:06:09,441 
epoch 132 - iter 99/111 - loss 0.53199278 - samples/sec: 30.42 - lr: 0.012500 2021-01-25 14:06:14,701 epoch 132 - iter 110/111 - loss 0.54690942 - samples/sec: 33.47 - lr: 0.012500 2021-01-25 14:06:14,871 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:06:14,872 EPOCH 132 done: loss 0.5420 - lr 0.0125000 2021-01-25 14:06:14,874 BAD EPOCHS (no improvement): 0 2021-01-25 14:06:52,420 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 14:06:58,524 epoch 133 - iter 11/111 - loss 0.41649119 - samples/sec: 28.98 - lr: 0.012500 2021-01-25 14:07:03,658 epoch 133 - iter 22/111 - loss 0.36768959 - samples/sec: 34.29 - lr: 0.012500 2021-01-25 14:07:10,207 epoch 133 - iter 33/111 - loss 0.41266027 - samples/sec: 26.89 - lr: 0.012500 2021-01-25 14:07:17,234 epoch 133 - iter 44/111 - loss 0.46493986 - samples/sec: 25.06 - lr: 0.012500 2021-01-25 14:07:23,290 epoch 133 - iter 55/111 - loss 0.45325970 - samples/sec: 29.07 - lr: 0.012500 2021-01-25 14:07:30,116 epoch 133 - iter 66/111 - loss 0.44811419 - samples/sec: 25.80 - lr: 0.012500 2021-01-25 14:07:37,658 epoch 133 - iter 77/111 - loss 0.46157333 - samples/sec: 23.34 - lr: 0.012500 2021-01-25 14:07:44,375 epoch 133 - iter 88/111 - loss 0.48096643 - samples/sec: 26.21 - lr: 0.012500 2021-01-25 14:07:50,176 epoch 133 - iter 99/111 - loss 0.50702132 - samples/sec: 30.36 - lr: 0.012500 2021-01-25 14:07:56,225 epoch 133 - iter 110/111 - loss 0.51824152 - samples/sec: 29.11 - lr: 0.012500 2021-01-25 14:07:56,593 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:07:56,594 EPOCH 133 done: loss 0.5196 - lr 0.0125000 2021-01-25 14:07:56,596 BAD EPOCHS (no improvement): 0 2021-01-25 14:08:29,342 
---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 14:08:43,869 epoch 134 - iter 11/111 - loss 0.65717183 - samples/sec: 25.65 - lr: 0.012500 2021-01-25 14:08:51,248 epoch 134 - iter 22/111 - loss 0.60834041 - samples/sec: 23.86 - lr: 0.012500 2021-01-25 14:08:58,468 epoch 134 - iter 33/111 - loss 0.58977554 - samples/sec: 24.38 - lr: 0.012500 2021-01-25 14:09:05,137 epoch 134 - iter 44/111 - loss 0.57483234 - samples/sec: 26.40 - lr: 0.012500 2021-01-25 14:09:11,076 epoch 134 - iter 55/111 - loss 0.56873697 - samples/sec: 29.65 - lr: 0.012500 2021-01-25 14:09:17,105 epoch 134 - iter 66/111 - loss 0.56888056 - samples/sec: 29.20 - lr: 0.012500 2021-01-25 14:09:24,095 epoch 134 - iter 77/111 - loss 0.56223808 - samples/sec: 25.19 - lr: 0.012500 2021-01-25 14:09:29,513 epoch 134 - iter 88/111 - loss 0.56037086 - samples/sec: 32.50 - lr: 0.012500 2021-01-25 14:09:35,191 epoch 134 - iter 99/111 - loss 0.54957375 - samples/sec: 31.00 - lr: 0.012500 2021-01-25 14:09:40,748 epoch 134 - iter 110/111 - loss 0.53262838 - samples/sec: 31.69 - lr: 0.012500 2021-01-25 14:09:40,818 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:09:40,819 EPOCH 134 done: loss 0.5278 - lr 0.0125000 2021-01-25 14:09:40,821 BAD EPOCHS (no improvement): 1 2021-01-25 14:10:13,027 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 14:10:18,960 epoch 135 - iter 11/111 - loss 0.52025008 - samples/sec: 29.80 - lr: 0.012500 2021-01-25 14:10:25,200 epoch 135 - iter 22/111 - loss 0.51201549 - samples/sec: 28.22 - lr: 0.012500 2021-01-25 14:10:33,758 epoch 135 - iter 33/111 - loss 0.59095611 - samples/sec: 20.57 - lr: 0.012500 2021-01-25 14:10:39,941 epoch 135 - iter 44/111 - loss 0.58347231 - 
samples/sec: 28.47 - lr: 0.012500 2021-01-25 14:10:46,349 epoch 135 - iter 55/111 - loss 0.56487238 - samples/sec: 27.48 - lr: 0.012500 2021-01-25 14:10:53,963 epoch 135 - iter 66/111 - loss 0.56565157 - samples/sec: 23.12 - lr: 0.012500 2021-01-25 14:10:59,436 epoch 135 - iter 77/111 - loss 0.53918576 - samples/sec: 32.19 - lr: 0.012500 2021-01-25 14:11:05,605 epoch 135 - iter 88/111 - loss 0.52049351 - samples/sec: 28.54 - lr: 0.012500 2021-01-25 14:11:11,015 epoch 135 - iter 99/111 - loss 0.52037110 - samples/sec: 32.55 - lr: 0.012500 2021-01-25 14:11:17,537 epoch 135 - iter 110/111 - loss 0.52271956 - samples/sec: 26.99 - lr: 0.012500 2021-01-25 14:11:17,598 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:11:17,599 EPOCH 135 done: loss 0.5181 - lr 0.0125000 2021-01-25 14:11:17,600 BAD EPOCHS (no improvement): 0 2021-01-25 14:11:50,172 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 14:12:04,076 epoch 136 - iter 11/111 - loss 0.66750966 - samples/sec: 32.77 - lr: 0.012500 2021-01-25 14:12:10,989 epoch 136 - iter 22/111 - loss 0.54117237 - samples/sec: 25.47 - lr: 0.012500 2021-01-25 14:12:18,161 epoch 136 - iter 33/111 - loss 0.54503648 - samples/sec: 24.55 - lr: 0.012500 2021-01-25 14:12:25,343 epoch 136 - iter 44/111 - loss 0.57029280 - samples/sec: 24.51 - lr: 0.012500 2021-01-25 14:12:32,349 epoch 136 - iter 55/111 - loss 0.56302477 - samples/sec: 25.13 - lr: 0.012500 2021-01-25 14:12:38,208 epoch 136 - iter 66/111 - loss 0.54437011 - samples/sec: 30.05 - lr: 0.012500 2021-01-25 14:12:43,622 epoch 136 - iter 77/111 - loss 0.54132690 - samples/sec: 32.52 - lr: 0.012500 2021-01-25 14:12:49,534 epoch 136 - iter 88/111 - loss 0.52009410 - samples/sec: 29.78 - lr: 0.012500 2021-01-25 14:12:55,743 epoch 136 - iter 99/111 - loss 0.53997797 - samples/sec: 28.35 - lr: 
0.012500 2021-01-25 14:13:02,156 epoch 136 - iter 110/111 - loss 0.53740052 - samples/sec: 27.46 - lr: 0.012500 2021-01-25 14:13:02,241 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:13:02,242 EPOCH 136 done: loss 0.5460 - lr 0.0125000 2021-01-25 14:13:02,245 BAD EPOCHS (no improvement): 1 2021-01-25 14:13:34,693 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 14:13:40,827 epoch 137 - iter 11/111 - loss 0.37447661 - samples/sec: 28.85 - lr: 0.012500 2021-01-25 14:13:46,540 epoch 137 - iter 22/111 - loss 0.42255757 - samples/sec: 30.82 - lr: 0.012500 2021-01-25 14:13:54,109 epoch 137 - iter 33/111 - loss 0.45560228 - samples/sec: 23.26 - lr: 0.012500 2021-01-25 14:14:00,586 epoch 137 - iter 44/111 - loss 0.48572128 - samples/sec: 27.19 - lr: 0.012500 2021-01-25 14:14:07,013 epoch 137 - iter 55/111 - loss 0.47533147 - samples/sec: 27.39 - lr: 0.012500 2021-01-25 14:14:14,350 epoch 137 - iter 66/111 - loss 0.47087201 - samples/sec: 24.00 - lr: 0.012500 2021-01-25 14:14:20,235 epoch 137 - iter 77/111 - loss 0.45870045 - samples/sec: 29.92 - lr: 0.012500 2021-01-25 14:14:25,672 epoch 137 - iter 88/111 - loss 0.46313509 - samples/sec: 32.38 - lr: 0.012500 2021-01-25 14:14:30,981 epoch 137 - iter 99/111 - loss 0.46418444 - samples/sec: 33.17 - lr: 0.012500 2021-01-25 14:14:37,905 epoch 137 - iter 110/111 - loss 0.47312191 - samples/sec: 25.43 - lr: 0.012500 2021-01-25 14:14:37,978 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:14:37,979 EPOCH 137 done: loss 0.5212 - lr 0.0125000 2021-01-25 14:14:37,983 BAD EPOCHS (no improvement): 2 2021-01-25 14:15:13,919 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting 
embeddings 2021-01-25 14:15:20,401 epoch 138 - iter 11/111 - loss 0.43607438 - samples/sec: 27.27 - lr: 0.012500 2021-01-25 14:15:27,386 epoch 138 - iter 22/111 - loss 0.46002454 - samples/sec: 25.20 - lr: 0.012500 2021-01-25 14:15:33,899 epoch 138 - iter 33/111 - loss 0.47450392 - samples/sec: 27.04 - lr: 0.012500 2021-01-25 14:15:39,476 epoch 138 - iter 44/111 - loss 0.47389216 - samples/sec: 31.57 - lr: 0.012500 2021-01-25 14:15:46,603 epoch 138 - iter 55/111 - loss 0.48025244 - samples/sec: 24.70 - lr: 0.012500 2021-01-25 14:15:53,126 epoch 138 - iter 66/111 - loss 0.49115474 - samples/sec: 26.99 - lr: 0.012500 2021-01-25 14:15:59,341 epoch 138 - iter 77/111 - loss 0.48860549 - samples/sec: 28.33 - lr: 0.012500 2021-01-25 14:16:06,048 epoch 138 - iter 88/111 - loss 0.47971093 - samples/sec: 26.25 - lr: 0.012500 2021-01-25 14:16:12,109 epoch 138 - iter 99/111 - loss 0.48643696 - samples/sec: 29.05 - lr: 0.012500 2021-01-25 14:16:17,743 epoch 138 - iter 110/111 - loss 0.47220482 - samples/sec: 31.25 - lr: 0.012500 2021-01-25 14:16:17,863 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:16:17,864 EPOCH 138 done: loss 0.4876 - lr 0.0125000 2021-01-25 14:16:17,866 BAD EPOCHS (no improvement): 0 2021-01-25 14:16:52,414 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 14:16:59,025 epoch 139 - iter 11/111 - loss 0.61815261 - samples/sec: 26.74 - lr: 0.012500 2021-01-25 14:17:05,476 epoch 139 - iter 22/111 - loss 0.52157193 - samples/sec: 27.29 - lr: 0.012500 2021-01-25 14:17:13,097 epoch 139 - iter 33/111 - loss 0.49127991 - samples/sec: 23.10 - lr: 0.012500 2021-01-25 14:17:19,534 epoch 139 - iter 44/111 - loss 0.46624925 - samples/sec: 27.35 - lr: 0.012500 2021-01-25 14:17:26,998 epoch 139 - iter 55/111 - loss 0.49126850 - samples/sec: 23.59 - lr: 0.012500 2021-01-25 
14:17:33,481 epoch 139 - iter 66/111 - loss 0.48221333 - samples/sec: 27.16 - lr: 0.012500 2021-01-25 14:17:39,027 epoch 139 - iter 77/111 - loss 0.47143079 - samples/sec: 31.75 - lr: 0.012500 2021-01-25 14:17:44,787 epoch 139 - iter 88/111 - loss 0.47305081 - samples/sec: 30.57 - lr: 0.012500 2021-01-25 14:17:50,873 epoch 139 - iter 99/111 - loss 0.47492835 - samples/sec: 28.93 - lr: 0.012500 2021-01-25 14:17:57,450 epoch 139 - iter 110/111 - loss 0.47038335 - samples/sec: 26.77 - lr: 0.012500 2021-01-25 14:17:57,494 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:17:57,495 EPOCH 139 done: loss 0.4661 - lr 0.0125000 2021-01-25 14:17:57,496 BAD EPOCHS (no improvement): 0 2021-01-25 14:18:36,547 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 14:18:42,183 epoch 140 - iter 11/111 - loss 0.49920697 - samples/sec: 31.39 - lr: 0.012500 2021-01-25 14:18:48,213 epoch 140 - iter 22/111 - loss 0.48862989 - samples/sec: 29.21 - lr: 0.012500 2021-01-25 14:18:54,178 epoch 140 - iter 33/111 - loss 0.47816874 - samples/sec: 29.52 - lr: 0.012500 2021-01-25 14:19:00,450 epoch 140 - iter 44/111 - loss 0.48933345 - samples/sec: 28.07 - lr: 0.012500 2021-01-25 14:19:07,439 epoch 140 - iter 55/111 - loss 0.46928300 - samples/sec: 25.19 - lr: 0.012500 2021-01-25 14:19:14,968 epoch 140 - iter 66/111 - loss 0.46743580 - samples/sec: 23.38 - lr: 0.012500 2021-01-25 14:19:21,703 epoch 140 - iter 77/111 - loss 0.45977263 - samples/sec: 26.14 - lr: 0.012500 2021-01-25 14:19:28,542 epoch 140 - iter 88/111 - loss 0.48140774 - samples/sec: 25.74 - lr: 0.012500 2021-01-25 14:19:34,363 epoch 140 - iter 99/111 - loss 0.48329782 - samples/sec: 30.25 - lr: 0.012500 2021-01-25 14:19:40,471 epoch 140 - iter 110/111 - loss 0.48854832 - samples/sec: 28.82 - lr: 0.012500 2021-01-25 14:19:40,782 
---------------------------------------------------------------------------------------------------- 2021-01-25 14:19:40,785 EPOCH 140 done: loss 0.5155 - lr 0.0125000 2021-01-25 14:19:40,787 BAD EPOCHS (no improvement): 1 2021-01-25 14:20:21,281 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 14:20:27,334 epoch 141 - iter 11/111 - loss 0.32942802 - samples/sec: 29.22 - lr: 0.012500 2021-01-25 14:20:33,841 epoch 141 - iter 22/111 - loss 0.47913609 - samples/sec: 27.06 - lr: 0.012500 2021-01-25 14:20:39,626 epoch 141 - iter 33/111 - loss 0.45331875 - samples/sec: 30.44 - lr: 0.012500 2021-01-25 14:20:47,392 epoch 141 - iter 44/111 - loss 0.45383054 - samples/sec: 22.67 - lr: 0.012500 2021-01-25 14:20:55,220 epoch 141 - iter 55/111 - loss 0.45721795 - samples/sec: 22.49 - lr: 0.012500 2021-01-25 14:21:01,595 epoch 141 - iter 66/111 - loss 0.45384956 - samples/sec: 27.62 - lr: 0.012500 2021-01-25 14:21:07,235 epoch 141 - iter 77/111 - loss 0.45619697 - samples/sec: 31.23 - lr: 0.012500 2021-01-25 14:21:13,916 epoch 141 - iter 88/111 - loss 0.44857190 - samples/sec: 26.35 - lr: 0.012500 2021-01-25 14:21:19,498 epoch 141 - iter 99/111 - loss 0.44631384 - samples/sec: 31.55 - lr: 0.012500 2021-01-25 14:21:25,719 epoch 141 - iter 110/111 - loss 0.45270197 - samples/sec: 28.30 - lr: 0.012500 2021-01-25 14:21:25,793 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:21:25,793 EPOCH 141 done: loss 0.4489 - lr 0.0125000 2021-01-25 14:21:25,797 BAD EPOCHS (no improvement): 0 2021-01-25 14:22:06,509 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 14:22:13,231 epoch 142 - iter 11/111 - loss 0.48726898 - samples/sec: 26.29 - lr: 0.012500 2021-01-25 14:22:20,153 
epoch 142 - iter 22/111 - loss 0.45362488 - samples/sec: 25.44 - lr: 0.012500 2021-01-25 14:22:26,263 epoch 142 - iter 33/111 - loss 0.43223831 - samples/sec: 28.82 - lr: 0.012500 2021-01-25 14:22:32,394 epoch 142 - iter 44/111 - loss 0.43348582 - samples/sec: 28.72 - lr: 0.012500 2021-01-25 14:22:39,635 epoch 142 - iter 55/111 - loss 0.44133732 - samples/sec: 24.32 - lr: 0.012500 2021-01-25 14:22:45,553 epoch 142 - iter 66/111 - loss 0.43420036 - samples/sec: 29.75 - lr: 0.012500 2021-01-25 14:22:51,459 epoch 142 - iter 77/111 - loss 0.44172321 - samples/sec: 29.82 - lr: 0.012500 2021-01-25 14:22:57,839 epoch 142 - iter 88/111 - loss 0.45495511 - samples/sec: 27.59 - lr: 0.012500 2021-01-25 14:23:03,819 epoch 142 - iter 99/111 - loss 0.45950819 - samples/sec: 29.45 - lr: 0.012500 2021-01-25 14:23:09,719 epoch 142 - iter 110/111 - loss 0.46650118 - samples/sec: 29.84 - lr: 0.012500 2021-01-25 14:23:09,772 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:23:09,773 EPOCH 142 done: loss 0.4623 - lr 0.0125000 2021-01-25 14:23:09,775 BAD EPOCHS (no improvement): 1 2021-01-25 14:23:42,605 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 14:23:49,031 epoch 143 - iter 11/111 - loss 0.67751005 - samples/sec: 27.51 - lr: 0.012500 2021-01-25 14:23:56,376 epoch 143 - iter 22/111 - loss 0.57821348 - samples/sec: 23.97 - lr: 0.012500 2021-01-25 14:24:03,123 epoch 143 - iter 33/111 - loss 0.54739928 - samples/sec: 26.10 - lr: 0.012500 2021-01-25 14:24:11,522 epoch 143 - iter 44/111 - loss 0.54872937 - samples/sec: 20.96 - lr: 0.012500 2021-01-25 14:24:18,275 epoch 143 - iter 55/111 - loss 0.53891355 - samples/sec: 26.07 - lr: 0.012500 2021-01-25 14:24:24,157 epoch 143 - iter 66/111 - loss 0.53514081 - samples/sec: 29.93 - lr: 0.012500 2021-01-25 14:24:30,660 epoch 143 - iter 77/111 - loss 
0.52793349 - samples/sec: 27.07 - lr: 0.012500 2021-01-25 14:24:36,312 epoch 143 - iter 88/111 - loss 0.52557640 - samples/sec: 31.15 - lr: 0.012500 2021-01-25 14:24:41,967 epoch 143 - iter 99/111 - loss 0.51712182 - samples/sec: 31.14 - lr: 0.012500 2021-01-25 14:24:47,226 epoch 143 - iter 110/111 - loss 0.51848814 - samples/sec: 33.48 - lr: 0.012500 2021-01-25 14:24:47,367 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:24:47,369 EPOCH 143 done: loss 0.5345 - lr 0.0125000 2021-01-25 14:24:47,371 BAD EPOCHS (no improvement): 2 2021-01-25 14:25:22,340 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 14:25:29,490 epoch 144 - iter 11/111 - loss 0.47319051 - samples/sec: 24.71 - lr: 0.012500 2021-01-25 14:25:36,148 epoch 144 - iter 22/111 - loss 0.45326246 - samples/sec: 26.44 - lr: 0.012500 2021-01-25 14:25:42,148 epoch 144 - iter 33/111 - loss 0.43935597 - samples/sec: 29.35 - lr: 0.012500 2021-01-25 14:25:48,678 epoch 144 - iter 44/111 - loss 0.43911252 - samples/sec: 26.96 - lr: 0.012500 2021-01-25 14:25:55,785 epoch 144 - iter 55/111 - loss 0.44837216 - samples/sec: 24.77 - lr: 0.012500 2021-01-25 14:26:01,543 epoch 144 - iter 66/111 - loss 0.43956897 - samples/sec: 30.58 - lr: 0.012500 2021-01-25 14:26:09,242 epoch 144 - iter 77/111 - loss 0.43752335 - samples/sec: 22.87 - lr: 0.012500 2021-01-25 14:26:15,273 epoch 144 - iter 88/111 - loss 0.43776414 - samples/sec: 29.19 - lr: 0.012500 2021-01-25 14:26:20,794 epoch 144 - iter 99/111 - loss 0.43604807 - samples/sec: 31.89 - lr: 0.012500 2021-01-25 14:26:26,684 epoch 144 - iter 110/111 - loss 0.43161893 - samples/sec: 29.90 - lr: 0.012500 2021-01-25 14:26:26,740 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:26:26,741 EPOCH 144 done: loss 0.4277 - lr 
0.0125000 2021-01-25 14:26:26,742 BAD EPOCHS (no improvement): 0 2021-01-25 14:27:01,129 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 14:27:06,899 epoch 145 - iter 11/111 - loss 0.56361101 - samples/sec: 30.66 - lr: 0.012500 2021-01-25 14:27:13,056 epoch 145 - iter 22/111 - loss 0.44458997 - samples/sec: 28.60 - lr: 0.012500 2021-01-25 14:27:20,346 epoch 145 - iter 33/111 - loss 0.43164483 - samples/sec: 24.15 - lr: 0.012500 2021-01-25 14:27:25,946 epoch 145 - iter 44/111 - loss 0.43714195 - samples/sec: 31.44 - lr: 0.012500 2021-01-25 14:27:34,254 epoch 145 - iter 55/111 - loss 0.47496543 - samples/sec: 21.20 - lr: 0.012500 2021-01-25 14:27:40,520 epoch 145 - iter 66/111 - loss 0.47431087 - samples/sec: 28.10 - lr: 0.012500 2021-01-25 14:27:46,534 epoch 145 - iter 77/111 - loss 0.48796104 - samples/sec: 29.28 - lr: 0.012500 2021-01-25 14:27:52,801 epoch 145 - iter 88/111 - loss 0.49742483 - samples/sec: 28.10 - lr: 0.012500 2021-01-25 14:27:59,150 epoch 145 - iter 99/111 - loss 0.50019007 - samples/sec: 27.73 - lr: 0.012500 2021-01-25 14:28:04,370 epoch 145 - iter 110/111 - loss 0.49637410 - samples/sec: 33.73 - lr: 0.012500 2021-01-25 14:28:04,460 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:28:04,461 EPOCH 145 done: loss 0.4920 - lr 0.0125000 2021-01-25 14:28:04,463 BAD EPOCHS (no improvement): 1 2021-01-25 14:28:37,119 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 14:28:51,531 epoch 146 - iter 11/111 - loss 0.42418791 - samples/sec: 30.98 - lr: 0.012500 2021-01-25 14:29:00,308 epoch 146 - iter 22/111 - loss 0.44287015 - samples/sec: 20.06 - lr: 0.012500 2021-01-25 14:29:07,156 epoch 146 - iter 33/111 - loss 0.50278946 - samples/sec: 
25.71 - lr: 0.012500 2021-01-25 14:29:13,724 epoch 146 - iter 44/111 - loss 0.49937403 - samples/sec: 26.81 - lr: 0.012500 2021-01-25 14:29:20,000 epoch 146 - iter 55/111 - loss 0.48756165 - samples/sec: 28.06 - lr: 0.012500 2021-01-25 14:29:26,176 epoch 146 - iter 66/111 - loss 0.47407933 - samples/sec: 28.51 - lr: 0.012500 2021-01-25 14:29:33,043 epoch 146 - iter 77/111 - loss 0.46273601 - samples/sec: 25.64 - lr: 0.012500 2021-01-25 14:29:38,855 epoch 146 - iter 88/111 - loss 0.47012461 - samples/sec: 30.30 - lr: 0.012500 2021-01-25 14:29:44,808 epoch 146 - iter 99/111 - loss 0.48228383 - samples/sec: 29.58 - lr: 0.012500 2021-01-25 14:29:50,187 epoch 146 - iter 110/111 - loss 0.47029969 - samples/sec: 32.73 - lr: 0.012500 2021-01-25 14:29:50,225 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:29:50,226 EPOCH 146 done: loss 0.4669 - lr 0.0125000 2021-01-25 14:29:50,227 BAD EPOCHS (no improvement): 2 2021-01-25 14:30:31,156 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 14:30:36,366 epoch 147 - iter 11/111 - loss 0.29897853 - samples/sec: 33.97 - lr: 0.012500 2021-01-25 14:30:41,882 epoch 147 - iter 22/111 - loss 0.37291605 - samples/sec: 31.92 - lr: 0.012500 2021-01-25 14:30:47,949 epoch 147 - iter 33/111 - loss 0.35905749 - samples/sec: 29.02 - lr: 0.012500 2021-01-25 14:30:54,526 epoch 147 - iter 44/111 - loss 0.37039853 - samples/sec: 26.77 - lr: 0.012500 2021-01-25 14:31:02,453 epoch 147 - iter 55/111 - loss 0.37418085 - samples/sec: 22.21 - lr: 0.012500 2021-01-25 14:31:09,132 epoch 147 - iter 66/111 - loss 0.39236568 - samples/sec: 26.36 - lr: 0.012500 2021-01-25 14:31:15,738 epoch 147 - iter 77/111 - loss 0.40967605 - samples/sec: 26.65 - lr: 0.012500 2021-01-25 14:31:21,562 epoch 147 - iter 88/111 - loss 0.41555977 - samples/sec: 30.24 - lr: 0.012500 2021-01-25 
14:31:28,806 epoch 147 - iter 99/111 - loss 0.42892227 - samples/sec: 24.30 - lr: 0.012500 2021-01-25 14:31:34,250 epoch 147 - iter 110/111 - loss 0.42682377 - samples/sec: 32.34 - lr: 0.012500 2021-01-25 14:31:34,333 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:31:34,334 EPOCH 147 done: loss 0.4269 - lr 0.0125000 2021-01-25 14:31:34,335 BAD EPOCHS (no improvement): 0 2021-01-25 14:32:07,364 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 14:32:13,605 epoch 148 - iter 11/111 - loss 0.42012477 - samples/sec: 28.33 - lr: 0.012500 2021-01-25 14:32:19,685 epoch 148 - iter 22/111 - loss 0.44187371 - samples/sec: 28.96 - lr: 0.012500 2021-01-25 14:32:26,969 epoch 148 - iter 33/111 - loss 0.48473765 - samples/sec: 24.17 - lr: 0.012500 2021-01-25 14:32:33,967 epoch 148 - iter 44/111 - loss 0.47316840 - samples/sec: 25.16 - lr: 0.012500 2021-01-25 14:32:40,122 epoch 148 - iter 55/111 - loss 0.46456896 - samples/sec: 28.61 - lr: 0.012500 2021-01-25 14:32:46,083 epoch 148 - iter 66/111 - loss 0.47234904 - samples/sec: 29.54 - lr: 0.012500 2021-01-25 14:32:52,715 epoch 148 - iter 77/111 - loss 0.45225264 - samples/sec: 26.55 - lr: 0.012500 2021-01-25 14:32:58,221 epoch 148 - iter 88/111 - loss 0.48295151 - samples/sec: 31.98 - lr: 0.012500 2021-01-25 14:33:04,084 epoch 148 - iter 99/111 - loss 0.47952042 - samples/sec: 30.03 - lr: 0.012500 2021-01-25 14:33:10,748 epoch 148 - iter 110/111 - loss 0.47041383 - samples/sec: 26.42 - lr: 0.012500 2021-01-25 14:33:10,901 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:33:10,903 EPOCH 148 done: loss 0.4663 - lr 0.0125000 2021-01-25 14:33:10,904 BAD EPOCHS (no improvement): 1 2021-01-25 14:33:44,573 
---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 14:33:50,574 epoch 149 - iter 11/111 - loss 0.51968422 - samples/sec: 29.46 - lr: 0.012500 2021-01-25 14:33:56,376 epoch 149 - iter 22/111 - loss 0.43742885 - samples/sec: 30.34 - lr: 0.012500 2021-01-25 14:34:03,080 epoch 149 - iter 33/111 - loss 0.43057959 - samples/sec: 26.26 - lr: 0.012500 2021-01-25 14:34:10,752 epoch 149 - iter 44/111 - loss 0.45822329 - samples/sec: 22.95 - lr: 0.012500 2021-01-25 14:34:17,588 epoch 149 - iter 55/111 - loss 0.48752644 - samples/sec: 25.76 - lr: 0.012500 2021-01-25 14:34:24,161 epoch 149 - iter 66/111 - loss 0.47126709 - samples/sec: 26.78 - lr: 0.012500 2021-01-25 14:34:30,687 epoch 149 - iter 77/111 - loss 0.46427167 - samples/sec: 26.98 - lr: 0.012500 2021-01-25 14:34:36,284 epoch 149 - iter 88/111 - loss 0.45396782 - samples/sec: 31.46 - lr: 0.012500 2021-01-25 14:34:41,459 epoch 149 - iter 99/111 - loss 0.47459639 - samples/sec: 34.03 - lr: 0.012500 2021-01-25 14:34:47,399 epoch 149 - iter 110/111 - loss 0.47931223 - samples/sec: 29.64 - lr: 0.012500 2021-01-25 14:34:47,512 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:34:47,513 EPOCH 149 done: loss 0.5217 - lr 0.0125000 2021-01-25 14:34:47,514 BAD EPOCHS (no improvement): 2 2021-01-25 14:35:18,217 ---------------------------------------------------------------------------------------------------- train mode resetting embeddings train mode resetting embeddings 2021-01-25 14:35:33,221 epoch 150 - iter 11/111 - loss 0.39747888 - samples/sec: 27.58 - lr: 0.012500 2021-01-25 14:35:40,131 epoch 150 - iter 22/111 - loss 0.45994203 - samples/sec: 25.48 - lr: 0.012500 2021-01-25 14:35:47,471 epoch 150 - iter 33/111 - loss 0.46511313 - samples/sec: 23.99 - lr: 0.012500 2021-01-25 14:35:54,065 epoch 150 - iter 44/111 - loss 0.45986518 - 
samples/sec: 26.70 - lr: 0.012500 2021-01-25 14:36:00,407 epoch 150 - iter 55/111 - loss 0.46384560 - samples/sec: 27.76 - lr: 0.012500 2021-01-25 14:36:07,167 epoch 150 - iter 66/111 - loss 0.45846419 - samples/sec: 26.05 - lr: 0.012500 2021-01-25 14:36:12,460 epoch 150 - iter 77/111 - loss 0.45345524 - samples/sec: 33.27 - lr: 0.012500 2021-01-25 14:36:19,618 epoch 150 - iter 88/111 - loss 0.45261149 - samples/sec: 24.59 - lr: 0.012500 2021-01-25 14:36:25,185 epoch 150 - iter 99/111 - loss 0.43778608 - samples/sec: 31.63 - lr: 0.012500 2021-01-25 14:36:30,687 epoch 150 - iter 110/111 - loss 0.43721506 - samples/sec: 32.00 - lr: 0.012500 2021-01-25 14:36:30,870 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:36:30,871 EPOCH 150 done: loss 0.4436 - lr 0.0125000 2021-01-25 14:36:30,873 BAD EPOCHS (no improvement): 3 2021-01-25 14:39:04,799 ---------------------------------------------------------------------------------------------------- 2021-01-25 14:39:04,802 Testing using best model ...

RuntimeError Traceback (most recent call last)

in ()
95 mini_batch_size=16,
96 max_epochs=150,
---> 97 checkpoint=True)
98

12 frames

/usr/local/lib/python3.6/dist-packages/transformers/modeling_bert.py in forward(self, input_ids, token_type_ids, position_ids, inputs_embeds)
199 token_type_embeddings = self.token_type_embeddings(token_type_ids)
200
--> 201 embeddings = inputs_embeds + position_embeddings + token_type_embeddings
202 embeddings = self.LayerNorm(embeddings)
203 embeddings = self.dropout(embeddings)

RuntimeError: The size of tensor a (759) must match the size of tensor b (512) at non-singleton dimension 1
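A possible reading of this traceback, which the thread does not confirm: the addition fails because `inputs_embeds` covers 759 subtoken positions while `position_embeddings` only has 512, the usual maximum for BERT checkpoints. Since the error appears at "Testing using best model", one sentence in the test file may simply be too long once it is split into subtokens. The sketch below is plain Python and does not use flair or transformers. The names `find_overlong`, `split_sentence` and `count_subtokens` are hypothetical helpers made up for illustration, and the allowance of two special tokens is an assumption.

```python
from typing import Callable, List

MAX_POSITIONS = 512  # size of "tensor b" in the error message


def find_overlong(
    sentences: List[List[str]],
    count_subtokens: Callable[[str], int],
    max_positions: int = MAX_POSITIONS,
    special_tokens: int = 2,  # assumed room for [CLS] and [SEP]
) -> List[int]:
    """Return the indices of sentences whose subtoken count exceeds the limit."""
    overlong = []
    for index, tokens in enumerate(sentences):
        total = sum(count_subtokens(token) for token in tokens) + special_tokens
        if total > max_positions:
            overlong.append(index)
    return overlong


def split_sentence(
    tokens: List[str],
    count_subtokens: Callable[[str], int],
    max_positions: int = MAX_POSITIONS,
    special_tokens: int = 2,
) -> List[List[str]]:
    """Greedily split a token list into chunks that fit within the limit.

    A single token that is larger than the budget still gets its own chunk.
    """
    budget = max_positions - special_tokens
    chunks: List[List[str]] = []
    current: List[str] = []
    used = 0
    for token in tokens:
        size = count_subtokens(token)
        # Start a new chunk when adding this token would overflow the budget.
        if current and used + size > budget:
            chunks.append(current)
            current, used = [], 0
        current.append(token)
        used += size
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    # Toy data: every token counts as one subtoken (a stand-in for a real tokenizer).
    one_each = lambda token: 1
    corpus = [["a"] * 10, ["b"] * 757]
    print(find_overlong(corpus, one_each))
    print([len(chunk) for chunk in split_sentence(corpus[1], one_each)])
```

With a real tokenizer, `count_subtokens` would return the number of word pieces per token. Running a check like this over the train, dev and test files before training would show whether any sentence exceeds the limit, and the long ones could then be split or the sentence boundaries in the file fixed.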
stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.