Closed pmarcis closed 1 year ago
Hi, you can use lambeq's SpacyTokeniser class to tokenise your sentences before feeding them to the parser. From the command-line interface, you can just use the -t
option. If you want to provide the sentence already tokenised, be sure to separate the words correctly, e.g. "did" and "n't" as below; otherwise the model will not recognise "didn" as a proper word.
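To make the expected split concrete, here is a minimal, self-contained sketch of the tokenisation the parser wants for contractions and punctuation. Note this is only an illustration, not lambeq's actual SpacyTokeniser implementation, which should be used in practice:

```python
import re

def split_contractions(sentence):
    """Naive illustration: split "n't" contractions off the verb
    and detach trailing punctuation, as the parser expects.
    Not lambeq's tokeniser -- use SpacyTokeniser for real input."""
    tokens = []
    for word in sentence.split():
        # Peel trailing punctuation off the word.
        punct = []
        while word and word[-1] in ".,!?;:":
            punct.append(word[-1])
            word = word[:-1]
        # "didn't" -> "did" + "n't"
        m = re.match(r"^(\w+)n't$", word)
        if m:
            tokens.extend([m.group(1), "n't"])
        elif word:
            tokens.append(word)
        tokens.extend(reversed(punct))
    return tokens

print(split_contractions("I didn't see it."))
# -> ['I', 'did', "n't", 'see', 'it', '.']
```

With tokens in this form, "did" and "n't" are passed to the parser as separate words rather than the unknown token "didn".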
Hope that helps.
Thanks! That solves this problem!
Hi!
When passing tokenised data containing English contractions, the parser crashes. Passing non-tokenised data also seems wrong, as the parser does not perform tokenisation internally (all punctuation stays attached to the words, and contractions remain attached to the verb).
E.g.:
results in: