Closed by keien 10 years ago
As you can see from the live demo (try something like `This is a sentence (this is one too).`), `-RRB-` and `-LRB-` come from the Java parser itself.
One option is a string replacement that maps `-LRB-` back to `(` and `-RRB-` back to `)`.
Inputting `()<>[]{}` shows that everything aside from the angle brackets gets remapped. Is there a way to figure out what else might get remapped?
Yes, here are the remap rules: http://www.cis.upenn.edu/~treebank/tokenization.html
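For anyone who prefers the string-replacement route, here is a minimal sketch, assuming the parser output is already available as a list of token strings. The names `PTB_BRACKETS` and `unescape_brackets` are hypothetical and not part of stanford-corenlp-python; the mapping covers only the round, square, and curly bracket escapes from the Penn Treebank conventions, so check it against the rules linked above before relying on it.

```python
# Hypothetical helper, not part of stanford-corenlp-python.
# Penn Treebank-style bracket escapes mapped back to the original characters.
PTB_BRACKETS = {
    "-LRB-": "(",
    "-RRB-": ")",
    "-LSB-": "[",
    "-RSB-": "]",
    "-LCB-": "{",
    "-RCB-": "}",
}


def unescape_brackets(tokens):
    """Return a new list with bracket escape tokens replaced by their characters."""
    # Tokens that are not bracket escapes pass through unchanged.
    return [PTB_BRACKETS.get(token, token) for token in tokens]


if __name__ == "__main__":
    parsed = ["This", "is", "a", "sentence", "-LRB-", "this", "is",
              "one", "too", "-RRB-", "."]
    print(unescape_brackets(parsed))
```

Replacing whole tokens rather than substrings avoids touching ordinary text that happens to contain one of these sequences inside a longer token.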
Looks like there's a way to disable it. I've pushed a new version of stanford-corenlp-python to the repository; go ahead and install it and try it out.
Looks good.
I'll push to pypi.
The parser turns parentheses into `-RRB-` and `-LRB-`. I assume similar things happen for other brackets, which might be a problem when it comes to reconstructing the original text.