nttcslab-nlp / Top-Down-RST-Parser

This repository is the implementation of "Top-down RST Parsing Utilizing Granularity Levels in Documents" published at AAAI 2020.

How to apply a trained RST parser to a raw document? #1

Open mufeili opened 3 years ago

mufeili commented 3 years ago

I guess I can get the EDUs of a raw document using a trained model from rstfinder. Are we supposed to use python src/main.py parse for parsing a new raw document? If so, can you provide an example? What kind of input data file should I prepare?

chenzhutian commented 3 years ago

+1, please help

wangwang110 commented 3 years ago

+1, please help

I found that the parser input is like the output of this project, https://github.com/jiyfeng/DPLP (after running "python segmenter ./data"); you need to add one column with the paragraph id.

KobayashiNaoki commented 3 years ago

@mufeili and @chenzhutian Thank you for your interest. I'm the first author of this paper.

@wangwang110 Thank you for answering the question. As you said, we need a paragraph index to parse a raw document.

@mufeili and @chenzhutian If you can obtain a paragraph index and attach it to the DPLP format, you can parse a raw document. This input format is the same as that of the Two-Stage Parser.
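
As a rough illustration of attaching a paragraph index, here is a minimal sketch in Python. It is not code from this repository. It assumes the segmented file is tab-separated with an integer sentence index in the first column, and that the paragraph id goes in a new last column; both points should be checked against the actual DPLP output and the parser's data loader. The helper name attach_paragraph_index and the toy rows are hypothetical.

```python
from typing import Dict, Iterable, List


def attach_paragraph_index(
    rows: Iterable[str],
    sentence_to_paragraph: Dict[int, int],
) -> List[str]:
    """Append a paragraph id as the last tab-separated column of each row.

    Assumption (not confirmed by the repository): column 0 of every
    non-empty row is an integer sentence index.
    """
    out: List[str] = []
    for row in rows:
        row = row.rstrip("\n")
        if not row.strip():
            # Keep blank separator lines unchanged.
            out.append(row)
            continue
        fields = row.split("\t")
        sent_idx = int(fields[0])
        fields.append(str(sentence_to_paragraph[sent_idx]))
        out.append("\t".join(fields))
    return out


# Toy rows (hypothetical layout): sentence idx, token idx, token, EDU idx.
rows = [
    "0\t1\tHello\t1",
    "0\t2\tworld\t1",
    "1\t1\tBye\t2",
]
# Hypothetical mapping: sentence 0 is in paragraph 0, sentence 1 in paragraph 1.
result = attach_paragraph_index(rows, {0: 0, 1: 1})
```

The sentence-to-paragraph mapping has to come from the raw document itself, for example by counting the sentences in each blank-line-separated paragraph before segmentation.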

mufeili commented 3 years ago

Thank you for your reply and the great work!

YTZ01 commented 7 months ago

Hello, I was wondering whether there is a limit on the length of the raw document.