Tencent / NeuralNLP-NeuralClassifier

An Open-source Neural Hierarchical Multi-label Text Classification Toolkit

TypeError: __init__() missing 2 required positional arguments: 'doc' and 'pos' #88

Closed SeekPoint closed 3 years ago

SeekPoint commented 3 years ago

(.venv) D:\ghprj\NeuralNLP-NeuralClassifier>python train.py conf/mul_RNN_train.json
Use dataset to generate dict.
Size of doc_label dict is 4
Size of doc_token dict is 94363
Size of doc_char dict is 77
Size of doc_token_ngram dict is 0
Size of doc_keyword dict is 0
Size of doc_topic dict is 0
Shrink dict over.
Size of doc_label dict is 4
Size of doc_token dict is 82235
Size of doc_char dict is 77
Size of doc_token_ngram dict is 0
Size of doc_keyword dict is 0
Size of doc_topic dict is 0
Train performance at epoch 1 is precision: 0.915018, recall: 0.902926, fscore: 0.908932, macro-fscore: 0.743833, right: 72700, predict: 79452, standard: 80516. Loss is: 0.162562.
.....
test performance at epoch 5 is precision: 0.933566, recall: 0.915440, fscore: 0.924414, macro-fscore: 0.801585, right: 15795, predict: 16919, standard: 17254. Loss is: 0.110784.
Epoch 5 cost time: 3227 second
Traceback (most recent call last):
  File "train.py", line 261, in <module>
    train(config)
  File "train.py", line 228, in train
    trainer.train(train_data_loader, model, optimizer, "Train", epoch)
  File "train.py", line 102, in train
    ModeType.TRAIN)
  File "train.py", line 118, in run
    for batch in data_loader:
  File "C:\Users\AppData\Roaming\Python\Python37\site-packages\torch\utils\data\dataloader.py", line 517, in __next__
    data = self._next_data()
  File "C:\Users\AppData\Roaming\Python\Python37\site-packages\torch\utils\data\dataloader.py", line 1179, in _next_data
    return self._process_data(data)
  File "C:\Users\AppData\Roaming\Python\Python37\site-packages\torch\utils\data\dataloader.py", line 1225, in _process_data
    data.reraise()
  File "C:\Users\AppData\Roaming\Python\Python37\site-packages\torch\_utils.py", line 429, in reraise
    raise self.exc_type(msg)
TypeError: __init__() missing 2 required positional arguments: 'doc' and 'pos'

(.venv) D:\ghprj\NeuralNLP-NeuralClassifier>

q294881866 commented 3 years ago

@loveJasmine how did you fix it? Or do you have any suggestions?

MagiaSN commented 3 years ago

Usually this error means the dataset file contains some invalid JSON lines. You should check whether each line is valid JSON.
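One way to run that check is sketched below. It assumes the dataset is a text file with one JSON object per line, encoded as UTF-8; the function name `find_invalid_json_lines` is made up for illustration and is not part of the toolkit.

```python
import json
import sys


def find_invalid_json_lines(path, encoding="utf-8"):
    """Return (line_number, error_message) pairs for lines that fail to parse.

    Hypothetical helper, not part of NeuralNLP-NeuralClassifier.
    Empty lines are reported too, because json.loads rejects them.
    """
    bad_lines = []
    with open(path, encoding=encoding) as handle:
        for line_number, line in enumerate(handle, start=1):
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                bad_lines.append((line_number, str(exc)))
    return bad_lines


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_json_lines.py path/to/dataset.json
    for line_number, message in find_invalid_json_lines(sys.argv[1]):
        print(f"line {line_number}: {message}")
```

Running it over the train, validation, and test files should print the line numbers that need to be fixed or removed; no output means every line parsed.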

q294881866 commented 3 years ago

Thanks a lot  


tongcu commented 2 years ago

Usually this error means the dataset file contains some invalid JSON lines. You should check whether each line is valid JSON.

What does an invalid JSON line mean? Could you give an example?

MagiaSN commented 2 years ago

It means your file contains lines that are not valid JSON, as in the following example:

>>> import json
>>> invalid_json_str = '{"name": "jack"'
>>> json.loads(invalid_json_str)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/json/__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 16 (char 15)

In this case, a json.decoder.JSONDecodeError is raised in the DataLoader worker process, which then passes the exception type and message to the main process. The main process tries to reconstruct the exception object from the type and message alone. However, json.decoder.JSONDecodeError requires two additional arguments, doc and pos (see the source code below), which are missing, so what you see in the stack trace is the TypeError above instead of the original JSON error.

class JSONDecodeError(ValueError):
    """Subclass of ValueError with the following additional properties:

    msg: The unformatted error message
    doc: The JSON document being parsed
    pos: The start index of doc where parsing failed
    lineno: The line corresponding to pos
    colno: The column corresponding to pos

    """
    # Note that this exception is used from _json
    def __init__(self, msg, doc, pos):
        ...
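
The failure can be reproduced without PyTorch. The sketch below only imitates the `raise self.exc_type(msg)` step shown in the traceback; the variable names are illustrative, and the exact wording of the TypeError varies slightly between Python versions.

```python
import json

# Step 1: what happens in the worker. Parsing a broken line raises
# JSONDecodeError, and only its type and message text are kept.
try:
    json.loads('{"name": "jack"')
except json.JSONDecodeError as exc:
    exc_type, msg = type(exc), str(exc)

# Step 2: what happens in the main process. Rebuilding the exception from
# the message alone fails, because JSONDecodeError.__init__ also needs
# doc and pos.
try:
    raise exc_type(msg)
except TypeError as err:
    rebuild_error = str(err)

print(msg)            # the original JSON parsing error
print(rebuild_error)  # the misleading error about 'doc' and 'pos'
```

The first printed message is the one that actually describes the problem in the data; the second is the one that ends up in the training log.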