Open shuangqinbuaa opened 6 years ago
Did you ever figure this out? It looks like they use a regular parse tree. But obviously it would be best to parse using the same process they did.
I'm asking what the expected method is for parsing the input sentences for paraphrasing. For example, how do I get from the sentence on the first line below to the parse on the second?
a person in a black jacket is doing tricks on a motorbike
(ROOT (S (NP (NP (DT A) (NN person)) (PP (IN in) (NP (DT a) (JJ black) (NN jacket)))) (VP (VBZ is) (VP (VBG doing) (NP (NNS tricks)) (PP (IN on) (NP (DT a) (NN motorbike))))) (. .)))
Also I'm curious how to create templates for the generation aspect. They have 10 default templates in the demo script but it would be useful to understand how they created these in order to create new ones.
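One guess at how such templates could be derived, in case it helps others: assuming a template is just the top few levels of a constituency parse with the words removed (that is how I read the paper; the depth of 2 and the exact spaced-bracket string format here are my assumptions, not something confirmed by the authors), a minimal sketch would be the following. The helper names `tokenize`, `read_tree`, and `template` are hypothetical and not from the repo.

```python
def tokenize(parse):
    # Split a bracketed parse string into "(", ")", labels, and words.
    return parse.replace("(", " ( ").replace(")", " ) ").split()


def read_tree(tokens, pos=0):
    # Recursively read one "( LABEL child ... )" node starting at tokens[pos].
    # Returns ((label, children), index_after_closing_paren).
    label = tokens[pos + 1]
    pos += 2
    children = []
    while tokens[pos] != ")":
        if tokens[pos] == "(":
            child, pos = read_tree(tokens, pos)
            children.append(child)
        else:
            children.append(tokens[pos])  # a word (leaf)
            pos += 1
    return (label, children), pos + 1


def template(tree, depth):
    # Keep only the labels of the top `depth` levels below this node; drop words.
    label, children = tree
    if depth == 0:
        return "( " + label + " )"
    subs = [template(c, depth - 1) for c in children if isinstance(c, tuple)]
    return "( " + " ".join([label] + subs) + " )"


parse = ("(ROOT (S (NP (NP (DT A) (NN person)) (PP (IN in) (NP (DT a) (JJ black) "
         "(NN jacket)))) (VP (VBZ is) (VP (VBG doing) (NP (NNS tricks)) (PP (IN on) "
         "(NP (DT a) (NN motorbike))))) (. .)))")
tree, _ = read_tree(tokenize(parse))
print(template(tree, 2))  # ( ROOT ( S ( NP ) ( VP ) ( . ) ) )
```

Counting the resulting strings over a parsed corpus (for example with `collections.Counter`) would then give a frequency-sorted list of candidate templates.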
The Stanford NLP constituency parser seems to work well, though I am still curious about how to use different templates.
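For anyone else trying this, here is a rough sketch of how I get bracketed parses out of a locally running CoreNLP server. It assumes the server was started separately and is listening on `http://localhost:9000` (the host and port are my assumption), and the function names are my own. Whether the flattened one-line parse is exactly the format the authors' scripts expect is also an assumption.

```python
import json
import urllib.parse
import urllib.request


def build_url(host="http://localhost:9000"):
    # CoreNLP server settings are passed as a JSON "properties" query parameter.
    props = {"annotators": "tokenize,ssplit,pos,parse", "outputFormat": "json"}
    return host + "/?properties=" + urllib.parse.quote(json.dumps(props))


def parses_from_response(response):
    # Each sentence in the JSON output carries a multi-line "parse" string;
    # collapse all whitespace so every parse sits on a single line.
    return [" ".join(s["parse"].split()) for s in response["sentences"]]


def parse_sentences(text, host="http://localhost:9000"):
    # POST the raw text to the server; requires a running CoreNLP server.
    req = urllib.request.Request(build_url(host), data=text.encode("utf-8"))
    with urllib.request.urlopen(req) as resp:
        return parses_from_response(json.loads(resp.read().decode("utf-8")))
```

With the server up, `parse_sentences("a person in a black jacket is doing tricks on a motorbike")` should return a list with one bracketed parse per sentence.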
Sorry for the enormously delayed response! We have added some functions that run on top of the CoreNLP output to make it easier to get your data into the right format (see extract_parses in read_paranmt_parses.py). @jwieting will soon add a file containing all of the templates in ParaNMT, sorted by frequency, so you can play around with more of them (in our paper, we use the 20 most frequently occurring templates).
Hi, just a friendly reminder, any update on the templates?
Hi @miyyer @jwieting, just a friendly reminder: could you kindly share how the ParaNMT dataset was preprocessed (tokenization, BPE, etc.)? Thanks.
I would also like to know about the BPE and tokenization steps.
I also want to know about the templates!
If I only want to train the SCPN model, I just need to preprocess the ParaNMT dataset. But what if I want to use SCPN to generate syntactically adversarial examples for a downstream task? Should I preprocess the ParaNMT dataset (for example, tokenization and BPE) together with the downstream task's dataset? How did you preprocess the SST and SICK data? @miyyer @jwieting Thank you very much!