Hi,
You are correct, the example_submission.zip still had several sentence ids which had been removed from the official data. I've corrected this now and checked that it is updated. I'll close the issue, but feel free to open again if there is any problem on your end.
Ummm, maybe my expression was not clear enough. Actually, the redundant ids in example_submission.zip are not a very important bug. The key point is that the online scoring script (or its data) probably needs to be updated, because it forces us to submit sentences which don't exist in the current splits of MPQA, and this leads to the failure of our submissions.
Sorry about that. The online data was updated before, but I had forgotten to update the example_submission.zip. That's why I assumed it was just a problem there. I just downloaded your submission to check, but it seems like you might be working with outdated data. Please pull and rerun process_mpqa.sh to make sure. You should have 2063 sentences in the dev partition (your submission currently has only 2008).
Let me know if that doesn't work.
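For reference, a quick check like this should report 2063 after pulling and rerunning the preprocessing. This is only a sketch; it assumes the processed dev split is the JSON list at data/mpqa/dev.json mentioned in the original report:

```python
import json

# Count sentences in the processed MPQA dev split
# (path taken from the original report; adjust if yours differs)
with open("data/mpqa/dev.json") as f:
    dev = json.load(f)

print(len(dev))  # expected: 2063 with up-to-date data and preprocessing
```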
Sadly, we have checked that we are at the newest commit 968b077e241fbe85e4d39f0c9aeef83bcafe72b6; we even cloned the whole repo and processed the MPQA dataset again, and the dev set still only has 2008 sentences. Have you re-executed the preprocessing script? Could this be a problem caused by our runtime environment?
I just tried recloning the repo and rerunning the preprocessing script and I get 2063, so it does seem like it could be caused by the runtime env. The first thing that comes to mind is whether your version of stanza is the same as the one in requirements.txt.
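In case it helps, you can confirm which stanza the preprocessing environment actually sees with a quick check like this (just a sketch; compare the output against the version pinned in requirements.txt):

```python
import stanza

# Print the stanza version visible to the environment that runs the preprocessing
print(stanza.__version__)
```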
The stanza version in our env is 1.3.0. The README.md in data/ tells us to install stanza >=1.2.3, while the requirements.txt in baselines/graph_parser pins stanza 1.1.0. Which stanza version do you use in your preprocessing env?
Maybe you could export a requirements.txt for the preprocessing scripts if you are using different environments for preprocessing and training; otherwise, merging these environments would also be a good choice.
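For example, something along these lines could capture the exact versions of the preprocessing environment (only a sketch; the output filename requirements_preprocessing.txt is made up for illustration):

```python
from importlib.metadata import distributions

# Dump every installed package and its exact version in requirements.txt format
with open("requirements_preprocessing.txt", "w") as f:  # hypothetical output name
    for dist in sorted(distributions(), key=lambda d: d.metadata["Name"].lower()):
        f.write(f"{dist.metadata['Name']}=={dist.version}\n")
```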
Sorry, it should have all been stanza 1.1.1 throughout :/ I will update this now.
Let me know if this solves the problem.
It seems that this problem has been solved by changing the version of stanza. Thanks for your continued attention.
I'm glad to hear that it has been solved and sorry for the misunderstandings and delays. Hope the rest of the shared task goes more smoothly :)
If that's solved the problem, I'll close the issue for now.
When I am trying to submit submission.zip on CodaLab, an error is raised by the online scoring script. After searching data/mpqa/dev.json for the sent_ids mentioned in the error, I found that these samples are missing, but the ids can still be found in example_submission.zip. I am wondering whether this is a bug caused by commit e63e80140d8673def09f3471c95790c988d8acd5, which modified the MPQA preprocessing script, or whether I did something wrong while preprocessing the dataset?
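A check along these lines should show the mismatch between a submission and the current dev split. This is only a sketch; it assumes dev.json is a JSON list whose entries carry a sent_id field, and that the submission archive contains a predictions.json of the same shape (the internal layout of submission.zip is an assumption here):

```python
import json
import zipfile

# Ids present in the current processed dev split
with open("data/mpqa/dev.json") as f:
    dev_ids = {entry["sent_id"] for entry in json.load(f)}

# Ids present in the submission archive
# (predictions.json and its location inside the zip are assumptions for illustration)
with zipfile.ZipFile("submission.zip") as zf:
    with zf.open("predictions.json") as f:
        sub_ids = {entry["sent_id"] for entry in json.load(f)}

print("In submission but not in dev.json:", sorted(sub_ids - dev_ids))
print("In dev.json but missing from submission:", sorted(dev_ids - sub_ids))
```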