Closed. LMY-nlp0701 closed this issue 5 years ago.
This likely means that the file opened at line 182 does not exist. Can you verify that the file at the path mentioned on line 182 really exists?
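A minimal way to check this from the Lua side is to open the path explicitly and report the result (a sketch only; the path below is a placeholder for whatever path line 182 actually constructs):

```lua
-- Sketch: replace the placeholder with the exact path built on line 182.
local path = '/placeholder/path/from/line/182'
local f = io.open(path, 'r')
if f == nil then
  print('Missing or unreadable file: ' .. path)
else
  print('File exists: ' .. path)
  f:close()
end
```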
Hi, I downloaded, unpacked, and uploaded basic_data again. After investigating step by step, I found that the problem is the call gen_test_ace('wikipedia') (data_gen/gen_test_train_data/gen_ace_msnbc_aquaint_csv.lua, line 202).
When I comment out this line (line 202), the program appears to run successfully: the generated files and their sizes are consistent with those described later.
So the question is: I haven't modified any code or files, so why do the other four corpora run successfully while Wikipedia does not?
Thank you for your answer.
Hi! I tried to investigate why the Wikipedia dataset fails to run.
I made the following modifications to the code (data_gen/gen_test_train_data/gen_ace_msnbc_aquaint_csv.lua, beginning at line 182).
Accordingly, I commented out the other four corpora.
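Roughly, the modification replaces the hard failure with an existence check that reports the offending document name and moves on, along these lines (a sketch only; the variable names here are illustrative, not the exact code from my change):

```lua
-- Sketch of the debug change around line 182: instead of letting io.open
-- crash, report which RawText document is missing and skip it.
local doc_path = rawtext_dir .. doc_name   -- illustrative names
local f = io.open(doc_path, 'r')
if f == nil then
  print(doc_name .. ' does not exist')
else
  local text = f:read('*a')
  f:close()
  -- ... continue processing the document text as before ...
end
```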
Command: th data_gen/gen_test_train_data/gen_ace_msnbc_aquaint_csv.lua -root_data_dir data_path/
The experimental results are shown below.
The results show that ZielonaGóra(parliamentary_constituency) does not exist, but I am sure my corpus contains this document.
Thank you for your answer!
Hi! I think I have solved this problem. After step-by-step debugging, I found that there is no mistake in the program itself.
My previous procedure was to download basic_data, unpack it locally, and then upload the extracted files to the server.
The problem this causes: files whose names contain special characters (such as the ó in ZielonaGóra) do not survive that transfer, so the program reports them as missing. The affected folder is basic_data/test_datasets/wned-datasets/wikipedia/RawText.
If you instead decompress the archive directly on the server (Linux environment), you will not run into this bug. I hope this serves as a warning to others.
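As a quick sanity check after re-extracting on the server, one can verify that every RawText filename containing non-ASCII bytes is actually openable from Lua (a sketch, assuming a standard Lua/Torch environment where io.popen is available; the directory is the one mentioned above, resolved against your -root_data_dir if needed):

```lua
-- List RawText entries with non-ASCII bytes in their names and make sure
-- each one can be opened; names mangled by a bad extraction show up here.
local dir = 'basic_data/test_datasets/wned-datasets/wikipedia/RawText/'
local p = io.popen('ls -1 "' .. dir .. '"')
for name in p:lines() do
  if name:find('[\128-\255]') then       -- name contains non-ASCII bytes
    local f = io.open(dir .. name, 'r')
    if f then f:close() else print('cannot open: ' .. dir .. name) end
  end
end
p:close()
```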
Finally, I would like to thank the author again for sharing this code and for taking the time to answer. Thank you so much!
Ah, nice catch! Thanks!
Hello! Sorry to disturb you. I ran into a bug and I hope you can help me solve it.
Program bug
I think I have already completed the preliminary work and am now at step 10, "Generate all entity disambiguation datasets in a CSV format needed in our training stage", as shown in the picture:
I ran the code, but I hit a bug; the results are shown in the picture:
Thank you!