ujjawalmadan closed this issue 3 years ago
Hi. The log shows that no instances were processed, which suggests the data format is incorrect.
See src/prepro/data_builder.py, line 310: when len(dialogue_b_data) == 0, the instance is not processed. That happens when every utterance in the dialogue ends up empty, i.e. b_data (line 265) is None.
There are two factors that lead to a None b_data when processing an utterance:
The original data we used is in Chinese, and the role info is written in Chinese characters ("客服" means agent and "客户" means customer). If your custom data is in English and uses its own role labels, the role info condition in line 71 must be modified to match.
To avoid potential bugs, replace every occurrence of the role info in data_builder.py with your custom role labels, for example "客服" with "agent" and "客户" with "customer".
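If you would rather not edit data_builder.py, a possible alternative is to convert the role labels in your custom data to the ones the script already checks for. The snippet below is only a rough sketch of that idea: ROLE_MAP and normalize_role are hypothetical names that are not part of the repository, and the English labels "agent" and "customer" are assumed from the example above.

```python
# Hypothetical helper, not part of the repository.
# Maps custom English role labels to the Chinese labels mentioned above.
ROLE_MAP = {
    "agent": "客服",     # agent
    "customer": "客户",  # customer
}


def normalize_role(role: str) -> str:
    """Return the Chinese role label for a custom role string.

    Labels that are already one of the Chinese labels pass through unchanged.
    Unknown labels raise ValueError, so unexpected role strings are reported
    explicitly instead of going unnoticed.
    """
    key = role.strip().lower()
    if key in ROLE_MAP:
        return ROLE_MAP[key]
    if role in ROLE_MAP.values():
        return role
    raise ValueError(f"Unrecognized role label: {role!r}")
```

You would call normalize_role on the role field of each utterance before running the preprocessing step. Where that field lives depends on your own data format.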
For the FileNotFoundError, just rename the training file to "aws8000_alibaba.train.pt".
Hope this helps.
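The rename can also be done from Python. This is a minimal sketch: rename_training_file is a hypothetical helper, and the path you pass in is whatever name your preprocessing run actually produced, which is not stated in this thread.

```python
from pathlib import Path

# File name the training step looks for, as given in the reply above.
EXPECTED_NAME = "aws8000_alibaba.train.pt"


def rename_training_file(src: str) -> Path:
    """Rename the file at src to EXPECTED_NAME within the same directory."""
    src_path = Path(src)
    if not src_path.is_file():
        raise FileNotFoundError(f"No such file: {src_path}")
    dst = src_path.with_name(EXPECTED_NAME)
    if src_path.name == EXPECTED_NAME:
        return dst  # already has the expected name, nothing to do
    if dst.exists():
        # Refuse to overwrite an existing training file.
        raise FileExistsError(f"Target already exists: {dst}")
    src_path.rename(dst)
    return dst
```

The function refuses to overwrite an existing target, so an earlier preprocessed file is not lost by accident.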
This helped! Thank you so much!
Hi! Just hoping to get some help on this issue. I ran the code as instructed but was met with this error:
My custom data seems to be recognized, and I believe I put it in the right format. However, it was not processed correctly in step 3 of your instructions: the output shows that no instances were processed.
In step 4, I get this error.
This is all I have in the file directory.
Can you help? Thanks.