parser output token alignment with BERT tokens

yzhangcs / parser

:rocket: State-of-the-art parsers for natural language.

https://parser.yzhang.site/

MIT License


parser output token alignment with BERT tokens #99

Closed csarron closed 2 years ago

csarron commented 2 years ago

Hi, this is a great library, thanks for the hard work!

Looking at the demo usage, I wonder: if a sentence is tokenized by a BERT tokenizer, is it possible to align those tokens with the SDP parser output?

Any pointers or suggestions are much appreciated!

yzhangcs commented 2 years ago

@csarron Hi,

> Looking at the demo usage, I wonder: if a sentence is tokenized by a BERT tokenizer, is it possible to align those tokens with the SDP parser output?

I'm sorry, but the pretrained models currently support only word-level parsing. It would also be hard to train a model from scratch that can produce parses with edges between subword tokens, because existing datasets lack annotations inside words.
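Since parsing happens at the word level, one possible workaround is to keep the parser output as is and build a word-to-subword index map on the side, then project each word-level edge onto the first subword of each word. The snippet below is only a minimal sketch of that idea, not part of this library: `toy_wordpiece`, `align_subwords`, and `project_edges` are hypothetical helpers, the edges are made-up examples rather than real parser output, and the sketch assumes the parser was given the same pre-tokenized words. With a Hugging Face fast tokenizer, `tokenizer(words, is_split_into_words=True).word_ids()` should give an equivalent mapping.

```python
from typing import Callable, List, Tuple

Edge = Tuple[int, int, str]  # (head word index, dependent word index, label)


def toy_wordpiece(word: str) -> List[str]:
    """Hypothetical stand-in for a BERT tokenizer: split after 4 characters."""
    if len(word) <= 4:
        return [word]
    return [word[:4], "##" + word[4:]]


def align_subwords(
    words: List[str], tokenize: Callable[[str], List[str]]
) -> Tuple[List[str], List[List[int]]]:
    """Tokenize word by word and record which subword indices each word owns."""
    subwords: List[str] = []
    word_to_subwords: List[List[int]] = []
    for word in words:
        start = len(subwords)
        subwords.extend(tokenize(word))
        word_to_subwords.append(list(range(start, len(subwords))))
    return subwords, word_to_subwords


def project_edges(edges: List[Edge], word_to_subwords: List[List[int]]) -> List[Edge]:
    """Re-index word-level edges so they point at each word's first subword."""
    projected = []
    for head, dep, label in edges:
        # Skip edges touching a word that produced no subwords at all.
        if word_to_subwords[head] and word_to_subwords[dep]:
            projected.append((word_to_subwords[head][0], word_to_subwords[dep][0], label))
    return projected


words = ["She", "enjoys", "parsing"]
# Made-up word-level edges (0-indexed), standing in for real parser output.
word_edges = [(1, 0, "ARG1"), (1, 2, "ARG2")]

subwords, word_to_subwords = align_subwords(words, toy_wordpiece)
projected = project_edges(word_edges, word_to_subwords)

print(subwords)          # ['She', 'enjo', '##ys', 'pars', '##ing']
print(word_to_subwords)  # [[0], [1, 2], [3, 4]]
print(projected)         # [(1, 0, 'ARG1'), (1, 3, 'ARG2')]
```

Using the first subword as the anchor is just one convention; pooling over all subwords of a word, or anchoring on the last one, would work with the same index map.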

github-actions[bot] commented 2 years ago

This issue is stale because it has been open for 30 days with no activity.