Closed SeekPoint closed 3 years ago
@loveJasmine how did you fix it? Or does anyone have a suggestion?
Usually this error means the dataset file contains invalid JSON lines. Check that every line is valid JSON on its own.
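A minimal sketch of such a check, assuming the dataset is a text file with one JSON object per line. The function name `find_invalid_json_lines` is made up for illustration and is not part of NeuralClassifier:

```python
import json


def find_invalid_json_lines(path, encoding="utf-8"):
    """Return (line_number, error_message) pairs for lines json.loads rejects.

    Hypothetical helper, not part of NeuralClassifier. Blank lines are
    reported too, because json.loads does not accept an empty document.
    """
    bad_lines = []
    with open(path, encoding=encoding) as f:
        for line_number, line in enumerate(f, start=1):
            try:
                json.loads(line)
            except json.JSONDecodeError as err:
                # Keep the parser's message so the broken spot is easy to find.
                bad_lines.append((line_number, str(err)))
    return bad_lines
```

Run it on each dataset file referenced by your config and fix or drop every reported line before training again.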
Thanks a lot
Pengfei Pei R&D -- Best Wishes --
------------------ Original message ------------------ From: "Kuo @.>; Sent: Sunday, June 27, 2021, 12:37 PM To: @.>; Cc: "Bevis @.>; @.>; Subject: Re: [Tencent/NeuralNLP-NeuralClassifier] TypeError: __init__() missing 2 required positional arguments: 'doc' and 'pos' (#88)
Usually this error means the dataset file contains some invalid JSON lines; you should check whether each line is valid JSON.
What does an invalid JSON line look like? Could you give an example?
It means your file contains lines that are not valid JSON, as in the following example:
>>> import json
>>> invalid_json_str = '{"name": "jack"'
>>> json.loads(invalid_json_str)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/json/__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 16 (char 15)
In this case, a json.decoder.JSONDecodeError exception is raised in the DataLoader worker process, and the worker passes the exception type and message to the main process. The main process tries to reconstruct the exception object from the type and message alone. However, json.decoder.JSONDecodeError requires two additional arguments, doc and pos (see the source code below), which are not supplied, so the reconstruction itself fails and you see the TypeError reported in this issue instead of the original JSON error.
class JSONDecodeError(ValueError):
    """Subclass of ValueError with the following additional properties:

    msg: The unformatted error message
    doc: The JSON document being parsed
    pos: The start index of doc where parsing failed
    lineno: The line corresponding to pos
    colno: The column corresponding to pos

    """
    # Note that this exception is used from _json
    def __init__(self, msg, doc, pos):
        ...
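The failure can be reproduced without PyTorch. The sketch below only imitates the `raise self.exc_type(msg)` step visible in the reported traceback: it rebuilds the exception from its type and message alone. The variable names are illustrative, and this is not PyTorch's actual implementation:

```python
import json

# Step 1: trigger the real parsing error, as a DataLoader worker would.
try:
    json.loads('{"name": "jack"')
except json.JSONDecodeError as err:
    original = err

# Step 2: keep only what gets passed along: the type and the message.
exc_type = type(original)
msg = str(original)

# Step 3: try to rebuild the exception from the type and message only.
# JSONDecodeError.__init__ also needs doc and pos, so this raises TypeError.
reconstruction_error = None
try:
    exc_type(msg)
except TypeError as type_err:
    reconstruction_error = type_err

print(type(reconstruction_error).__name__, reconstruction_error)
```

The printed TypeError complains about the missing 'doc' and 'pos' arguments, which is why the real JSON error never reaches the console during training.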
(.venv) D:\ghprj\NeuralNLP-NeuralClassifier>python train.py conf/mul_RNN_train.json
Use dataset to generate dict.
Size of doc_label dict is 4
Size of doc_token dict is 94363
Size of doc_char dict is 77
Size of doc_token_ngram dict is 0
Size of doc_keyword dict is 0
Size of doc_topic dict is 0
Shrink dict over.
Size of doc_label dict is 4
Size of doc_token dict is 82235
Size of doc_char dict is 77
Size of doc_token_ngram dict is 0
Size of doc_keyword dict is 0
Size of doc_topic dict is 0
Train performance at epoch 1 is precision: 0.915018, recall: 0.902926, fscore: 0.908932, macro-fscore: 0.743833, right: 72700, predict: 79452, standard: 80516.
Loss is: 0.162562.
.....
test performance at epoch 5 is precision: 0.933566, recall: 0.915440, fscore: 0.924414, macro-fscore: 0.801585, right: 15795, predict: 16919, standard: 17254.
Loss is: 0.110784.
Epoch 5 cost time: 3227 second
Traceback (most recent call last):
  File "train.py", line 261, in <module>
    train(config)
  File "train.py", line 228, in train
    trainer.train(train_data_loader, model, optimizer, "Train", epoch)
  File "train.py", line 102, in train
    ModeType.TRAIN)
  File "train.py", line 118, in run
    for batch in data_loader:
  File "C:\Users\AppData\Roaming\Python\Python37\site-packages\torch\utils\data\dataloader.py", line 517, in __next__
    data = self._next_data()
  File "C:\Users\AppData\Roaming\Python\Python37\site-packages\torch\utils\data\dataloader.py", line 1179, in _next_data
    return self._process_data(data)
  File "C:\Users\AppData\Roaming\Python\Python37\site-packages\torch\utils\data\dataloader.py", line 1225, in _process_data
    data.reraise()
  File "C:\Users\AppData\Roaming\Python\Python37\site-packages\torch\_utils.py", line 429, in reraise
    raise self.exc_type(msg)
TypeError: __init__() missing 2 required positional arguments: 'doc' and 'pos'
(.venv) D:\ghprj\NeuralNLP-NeuralClassifier>