Closed YianZhang closed 4 years ago
Hi Ian, You need to copy the wsj folder to ~/nltk_data/corpora/ptb/WSJ.
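For anyone following along, the copy step can be sketched in Python as below. This is only an illustration: the helper name `copy_wsj` is hypothetical, the source path is assumed from the treebank_3/parsed/mrg/wsj listing in the original question, and `dirs_exist_ok` requires Python 3.8 or later.

```python
import shutil
from pathlib import Path


def copy_wsj(src: Path, dest: Path) -> Path:
    """Copy the WSJ tree at `src` to `dest`, creating parent directories.

    `copy_wsj` is a hypothetical helper, not part of the repository.
    """
    if not src.is_dir():
        raise FileNotFoundError(f"WSJ source folder not found: {src}")
    dest.parent.mkdir(parents=True, exist_ok=True)
    # dirs_exist_ok lets the copy be re-run over an existing target (Python 3.8+).
    shutil.copytree(src, dest, dirs_exist_ok=True)
    return dest


# Example call; the source path is an assumption and may differ on your machine:
# copy_wsj(Path("treebank_3/parsed/mrg/wsj"),
#          Path.home() / "nltk_data" / "corpora" / "ptb" / "WSJ")
```

The target in the example call is the ~/nltk_data/corpora/ptb/WSJ location named in the reply above.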
Hi Yikang,
Thanks for the response! I figured that out. However, what is args.data in test_phrase_grammar used for?
Thanks, Ian
It points to the dictionary that the model actually uses.
Thanks for the response! Do you mean "directory" or "dictionary"?
Best, Ian
Dictionary
When testing parsing F1, the model still needs to load the dictionary from the training corpus.
Thanks for your prompt response!
After carefully checking your code, I believe the dictionary is loaded from a fixed path: https://github.com/yikangshen/Ordered-Neurons/blob/46d63cde024802eaf1eb7cc896431329014dd869/test_phrase_grammar.py#L279-L282
And args.data is used as the directory of the test data: https://github.com/yikangshen/Ordered-Neurons/blob/46d63cde024802eaf1eb7cc896431329014dd869/test_phrase_grammar.py#L293
Am I correct?
Thanks again for your help! I would also appreciate it if you could look at my other issue, #25. As far as I know, that problem confuses other researchers as well.
Best, Ian
The code assumes you have the cached dataset in the directory, and it would be cached if the training script was run prior to test_phrase_grammar.py.
But yes, you are correct.
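The general pattern described here (one script writes a cache, a later script only reads it) can be sketched as follows. This is not the repository's code: the function names, the pickle format, and the toy word-to-index dictionary are all assumptions chosen for illustration.

```python
import pickle
from pathlib import Path


def build_and_cache(tokens, cache_file):
    """Training side (hypothetical): build a word-to-index dictionary and save it."""
    word2idx = {}
    for tok in tokens:
        # Assign the next free index the first time a token is seen.
        word2idx.setdefault(tok, len(word2idx))
    with Path(cache_file).open("wb") as f:
        pickle.dump(word2idx, f)
    return word2idx


def load_cached(cache_file):
    """Test side (hypothetical): load the dictionary, failing clearly if it is absent."""
    path = Path(cache_file)
    if not path.exists():
        raise FileNotFoundError(
            f"{path} not found; run the training step first so the cache exists"
        )
    with path.open("rb") as f:
        return pickle.load(f)
```

In a layout like this, the evaluation step never rebuilds the dictionary, so it fails until the training step has been run at least once in the same directory.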
Thanks a lot!
Hi Yikang and other Contributors,
Thank you for making the source code public! I am trying to reproduce your results, but I am not sure what path to pass as the --data command-line argument of test_phrase_grammar. I downloaded the PTB data and am currently passing treebank_3/parsed/mrg as the data argument. It does not work.
The listings under treebank_3/parsed/mrg: atis brown readme.mrg swbd wsj
The listings under treebank_3/parsed/mrg/wsj:
00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 MERGE.LOG
Thank you for your time! Ian