naver / sqlova

pickle problem in train #2

Closed farsmile closed 5 years ago

farsmile commented 5 years ago

Hi guys: Everything before training goes well. However, as soon as training enters the epoch loop, I get the error below. Have I done something wrong?

Microsoft Windows [Version 10.0.17134.556]
(c) 2018 Microsoft Corporation. All rights reserved.

(venv) C:\PycharmProjects\sqlova>python train.py --seed 1 --bS 16 --accumulate_gradients 2 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_leng 222
BERT-type: uncased_L-12_H-768_A-12
Batch_size = 32
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: True
vocab size: 30522
hidden_size: 768
num_hidden_layer: 12
num_attention_heads: 12
hidden_act: gelu
intermediate_size: 3072
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Load pre-trained parameters.
Seq-to-SQL: the number of final BERT layers to be used: 2
Seq-to-SQL: the size of hidden dimension = 100
Seq-to-SQL: LSTM encoding layer size = 2
Seq-to-SQL: dropout rate = 0.3
Seq-to-SQL: learning rate = 0.001
Traceback (most recent call last):
  File "train.py", line 591, in <module>
    dset_name='train')
  File "train.py", line 211, in train
    for iB, t in enumerate(train_loader):
  File "C:\PycharmProjects\sqlova\venv\lib\site-packages\torch\utils\data\dataloader.py", line 822, in __iter__
    return _DataLoaderIter(self)
  File "C:\PycharmProjects\sqlova\venv\lib\site-packages\torch\utils\data\dataloader.py", line 563, in __init__
    w.start()
  File "C:\Python36\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\Python36\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Python36\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Python36\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Python36\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'get_loader_wikisql.<locals>.<lambda>'

(venv) C:\PycharmProjects\sqlova>Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Python36\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Python36\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

(venv) C:\PycharmProjects\sqlova>

whwang299 commented 5 years ago

Hi @farsmile

The problem seems to be caused by the lambda function passed to torch.utils.data.DataLoader

https://github.com/naver/sqlova/blob/b7ce9ad421fd4688ef8592f93b248df85e9995ad/sqlova/utils/utils_wikisql.py#L91-L98

which, according to this link, is a known problem between PyTorch and Windows. (Sorry, I couldn't test this myself, as I don't have a Windows machine with a GPU.)

Using a custom data_loader function may solve this problem.
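For anyone hitting the same traceback, here is a minimal sketch of why the lambda fails and what a replacement could look like. On Windows, multiprocessing starts DataLoader workers with spawn, which pickles everything handed to the worker process, and a lambda defined inside a function cannot be pickled. The sketch assumes the lambda in question is an identity function used as collate_fn; the names identity_collate, make_local_collate and can_pickle are hypothetical and not part of sqlova or PyTorch. It uses only the standard library, so it runs without a GPU.

```python
import pickle


def identity_collate(batch):
    """Module-level stand-in for `lambda x: x`; returns the batch unchanged."""
    return batch


def make_local_collate():
    """Mimics the failing pattern: a lambda created inside another function."""
    return lambda batch: batch


def can_pickle(obj):
    """Return True if pickle.dumps accepts `obj`, as spawn-based workers require."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    # The local lambda is the case reported in the traceback above.
    print("local lambda picklable:", can_pickle(make_local_collate()))
    # A function defined at module top level is pickled by reference instead.
    print("module-level function picklable:", can_pickle(identity_collate))
```

If that assumption holds, passing collate_fn=identity_collate in place of the lambda should let the workers start. Another option that avoids pickling altogether is num_workers=0 in the DataLoader call, at the cost of loading data in the main process.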

Thanks.

Wonseok

farsmile commented 5 years ago

Hi Wonseok: it works now and your link is helpful!

Thanks.