yunhaoli1995 closed this issue 3 years ago
Hi,
Thanks for reporting this issue! Let me know if I'm mistaken, but it seems that the actual error is at line 157, due to the `assert False` statement?
In that case, I will push a fix that simply removes the `assert` statement causing the trouble. This buggy line comes directly from the original repo, so I can't be 100% sure why it's included. I can only assume that it was either debugging code or that it was never triggered (which is actually possible, since none of the generated texts I've evaluated myself using this repo has raised the issue).
Note that this fix is also used in Ratish Puduppully's fork, so I'm confident in applying the change here.
Can you pull the updated code and let me know if this issue is resolved?
Thanks, Clément
Thanks for your reply; the issue no longer occurs after removing the assert statement. But another issue occurs when I compute RG scores and generate the list of extracted records using the command:

```shell
python run.py \
    --just-eval \
    --datafile $ROTOWIRE/output/training-data.h5 \
    --preddata $ROTOWIRE/output/prep_predictions.h5 \
    --eval-models $ROTOWIRE/models \
    --gpu 0 \
    --test \
    --ignore-idx 15 \
    --vocab-prefix $ROTOWIRE/output/training-data
```
The error is as follows:
Exception has occurred: IndexError (note: full exception trace is shown but execution is paused at: _run_module_as_main) index out of range in self
I went into debug mode and found that the embedding size of the model is smaller than some of the token ids: the model's embedding size is 4934, while the vocabulary size is 5395. I think something is wrong with the downloaded dataset. Can you tell me how you made the json directory?
I think I know what's happening and it is not due to a mistake on your end.
The models I shared were trained using an earlier version of the code, which has slightly changed and now the extracted vocabulary is not the exact same. When I run the step to build the training-data.h5 file, I get the same vocabulary size as you (5395), which differs from the older files that I have locally.
I don't have time right now to re-train models with the updated version. I will train them as soon as possible and will let you know once they are available. In the meantime, you can follow the instructions to train your own models.
If you can wait, you can expect models available for download in the following days.
Best, Clément
Thanks for your reply! Now I'm trying to train my own models.
When I evaluate with my own model, the same error occurs again:
Exception has occurred: IndexError (note: full exception trace is shown but execution is paused at: _run_module_as_main) index out of range in self
After debugging, I found that this time the error is related to the embedding size for entdist: the max entdist in the training set is 191, while the max entdist in the test set is 195. Besides, the max numdist in the test set is larger than in the training set too. For now, my workaround is to manually add 10 to the embedding sizes for entdist and numdist before training the model:
nlabels = train['labels'].max().item() + 1
ent_dist_pad = train['entdists'].max() + 10
num_dist_pad = train['numdists'].max() + 10
word_pad = train['sents'].max() + 1
This code is from data.py.
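A minimal pure-Python sketch (hypothetical 4-column rows, not the repo's actual code) of why sizing the distance embedding from the training maximum alone crashes on the test split, and how the extra padding avoids it:

```python
# Hypothetical entdist values taken from the report: the training set's
# largest entdist is 191, but the test set contains 195.
train_entdists = [0, 57, 191]
test_entdists = [3, 195]

def make_table(rows):
    # Stand-in for an embedding matrix with `rows` rows of 4 weights each.
    return [[0.0] * 4 for _ in range(rows)]

def lookup(table, idx):
    # Mirrors nn.Embedding: ids must be strictly below the table size.
    if idx >= len(table):
        raise IndexError("index out of range in self")
    return table[idx]

tight = make_table(max(train_entdists) + 1)    # 192 rows: ids 0..191 only
padded = make_table(max(train_entdists) + 10)  # 201 rows: slack for unseen ids

try:
    lookup(tight, max(test_entdists))          # id 195 >= 192 -> crash
    crashed = False
except IndexError:
    crashed = True                             # the reported error

row = lookup(padded, max(test_entdists))       # id 195 < 201 -> fine
```

The `+10` is a workaround, not a fix: it leaves slack for ids the training split never produced, which is exactly what the mismatch above needs.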
I am having the same error. I need to find out which commit introduced the bug and fix it. It will take some time; I will let you know once everything is back to normal.
Thanks for your help in this!
I have found the origin of the bug; it's entirely my fault and was not introduced by a commit! It's fixed now, and you should be able to train and use your trained models without changing anything in the code.
If you are curious, see line 81 of the original code, where the order of operations is: clamp test, then shift train/val/test. I previously shifted train before clamping test, which resulted in a slight mismatch that was not always problematic, hence why I hadn't noticed it before now.
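A toy illustration of the two orders of operations (hypothetical numbers, and one plausible way the mismatch shows up, not the repo's exact code). Clamping test values into the training range first and then shifting gives different ids than shifting first and clamping against the pre-shift bounds:

```python
# Hypothetical distance values: train spans [-3, 5]; one test value (7)
# falls outside the training range and must be clamped into it.
train = [-3, 0, 5]
test = [-1, 7]

lo, hi = min(train), max(train)
shift = -lo  # shift so the smallest training distance maps to id 0

def clamp(x):
    # Restrict a distance to the training range [lo, hi].
    return min(max(x, lo), hi)

# Correct order: clamp test into [lo, hi], then shift train/val/test.
clamp_then_shift = [clamp(x) + shift for x in test]   # [2, 8]

# Buggy order: shift first, then clamp against the pre-shift bounds.
shift_then_clamp = [clamp(x + shift) for x in test]   # [2, 5]

mismatch = clamp_then_shift != shift_then_clamp       # True: ids disagree
```

Both orders stay inside the embedding table here, which is why the bug was "not always problematic": it only shows when a value near the boundary maps to a different row than the one seen in training.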
Note that I have also added something for more consistency across runs: previously, running `data_utils.py` to create the training data was not deterministic, and `train-data.labels` were in random order. Now they should always be in the same order!
As a sanity check, verify that the first line of `training-data.labels` is `None 1`, and that `training-data.dict` has 5395 lines, with line 5394 being `Celtics 5394` (the last line should be `UNK`).
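The sanity check above can be scripted; here is a sketch (file names from the thread, helper name hypothetical):

```python
def check_vocab(label_lines, dict_lines):
    """Check the regenerated vocabulary files against the values quoted above."""
    problems = []
    if label_lines[0].strip() != "None 1":
        problems.append("first line of training-data.labels is not 'None 1'")
    if len(dict_lines) != 5395:
        problems.append(f"training-data.dict has {len(dict_lines)} lines, expected 5395")
    else:
        if dict_lines[5393].strip() != "Celtics 5394":  # line 5394, 1-indexed
            problems.append("line 5394 is not 'Celtics 5394'")
        if not dict_lines[-1].startswith("UNK"):
            problems.append("last line is not UNK")
    return problems
```

Pass it the file contents, e.g. `check_vocab(open('training-data.labels').readlines(), open('training-data.dict').readlines())`; an empty list means the files match the expected values.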
The issue should be resolved, but I'm leaving this open until I find time to train models and upload them. Let me know if there are other issues (if there is an issue unrelated to this one, please open a new issue).
Thanks again, Clément
Hi,
I have trained 6 new models, and everything seems to be working fine on my end.
I am closing this issue, feel free to reopen if needed.
Have a nice day, Clément
Thank you so much!
Hi, in order to evaluate the generated text with run.py, I had to create the sub-directory output, but I ran into a problem when running the following command:
An error occurs as follows:
I downloaded the json files into the json sub-directory from the original GitHub repo of Challenges in Data-to-Document Generation. Do you have any idea?
Thanks