remove the nltk POS tagger from convert_rst_discourse_tb.py

EducationalTestingService / rstfinder

Fast Discourse Parser to find latent Rhetorical STructure (RST) in text.

MIT License

remove the nltk POS tagger from convert_rst_discourse_tb.py #23

Open mheilman opened 10 years ago

mheilman commented 10 years ago

Currently, convert_rst_discourse_tb.py uses NLTK's POS tagger to create flat trees for sentences that are in the RST treebank but not the Penn Treebank. This dependency should eventually be removed and replaced with ZPar.
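For illustration, here is a minimal sketch of what a "flat tree" built from POS tags might look like, with the tagger passed in as a plain callable so that NLTK could later be swapped for ZPar. This is not the code in convert_rst_discourse_tb.py: the names `flat_tree` and `dummy_tagger`, the `S` root label, and the bracket escaping are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

# A tagger is any callable mapping tokens to (word, tag) pairs.
# nltk.pos_tag has this shape; a ZPar wrapper could be adapted to it.
Tagger = Callable[[List[str]], List[Tuple[str, str]]]

# Penn Treebank style escapes so literal brackets do not break the
# bracketed tree notation (assumed convention for this sketch).
_ESCAPES = {"(": "-LRB-", ")": "-RRB-"}


def flat_tree(tokens: List[str], tagger: Tagger, root: str = "S") -> str:
    """Return a one-level bracketed tree: a root node whose children
    are the POS-tagged tokens, with no internal phrase structure."""
    tagged = tagger(tokens)
    leaves = " ".join(
        f"({tag} {_ESCAPES.get(word, word)})" for word, tag in tagged
    )
    return f"({root} {leaves})"


def dummy_tagger(tokens: List[str]) -> List[Tuple[str, str]]:
    """Hypothetical stand-in tagger that labels every token NN."""
    return [(tok, "NN") for tok in tokens]


print(flat_tree(["Prices", "rose", "."], dummy_tagger))
```

Because the tagger is just an argument, replacing NLTK would only mean passing a different callable; the tree-building step itself would not need to change.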

YTZ01 commented 1 year ago

Hello, I have a problem running this line: `convert_rst_discourse_tb ~/corpora/rst_discourse_treebank ~/corpora/treebank_3`. Is the PDTB dataset in your setup PDTB-v1 (2019) or PDTB-v2 (2020)? I downloaded the dataset from LDC, but it doesn't have a 'parsed' file under it, only data, docs, tools, and index.html. Have you run into this issue? @mheilman