emorynlp / nlp4j-tokenization

Tokenize raw texts into tokens and sentences.

tokenization splitting terms with & in them #12

Closed ggiavelli closed 5 years ago

ggiavelli commented 5 years ago

Is there a way to tell the tokenizer to keep a term together when it contains an &, instead of splitting it there?

e.g. How do I confirm my attendance at an L&D event?

I am getting:

token:confirm dep:how deptype:advmod
token:confirm dep:do deptype:aux
token:confirm dep:i deptype:nsubj
token:confirm dep:confirm deptype:root
token:attendance dep:my deptype:poss
token:confirm dep:attendance deptype:dobj
token:confirm dep:at deptype:prep
token:l dep:a deptype:det
token:at dep:l deptype:pobj

ggiavelli commented 5 years ago

It was an HTTP error. Never mind.
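
For anyone who hits the same symptom: the thread does not say what the HTTP error was, but output that stops at "l" would be consistent with the sentence being sent as an unencoded query parameter, where the & is read as a parameter separator. Below is a minimal Python sketch of that effect and of percent-encoding as a fix. The parameter name `text` is hypothetical, and the cause itself is an assumption rather than something confirmed here.

```python
from urllib.parse import parse_qs, urlencode

text = "How do I confirm my attendance at an L&D event?"

# Unencoded: the "&" is treated as a separator between query parameters,
# so the value of the (hypothetical) "text" parameter is cut off after "L".
raw_query = "text=" + text
raw_value = parse_qs(raw_query)["text"][0]

# Encoded: urlencode percent-encodes "&" as %26, so the whole sentence
# arrives as a single parameter value.
safe_query = urlencode({"text": text})
safe_value = parse_qs(safe_query)["text"][0]

print(repr(raw_value))
print(repr(safe_value))
```

If this is the cause, the tokenizer never sees the characters after the &, so no tokenizer setting would help. The fix belongs in the client that builds the request.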