louismullie / treat

Natural language processing framework for Ruby.

String 'r2' being identified as a Symbol thus causing an error #111

Closed agarie closed 7 years ago

agarie commented 9 years ago

I tried to process some text that contained the string r2 and noticed it is causing the following error:

λ pry -rtreat
>> 'regarding'.stem
=> "regard"
>> 'r2'.stem
Treat::Exception: Method stem can't be called on a symbol.
from .../gems/treat-2.1.0/lib/treat/entities/entity.rb:134:in `invalid_call'

I'm not sure where this error is coming from. Can you shed some light on it? I can provide a patch if this isn't expected behavior (and if it is, I'd like to understand the rationale behind it).

louismullie commented 7 years ago

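A symbol represents "a character that is neither a word (/^[[:alpha:]-']+$/), an enclitic ('ll, 'm, 're, 's, 't or 've), a number (/^#?([0-9]+)(.[0-9]+)?$/) or a punctuation character (e.g. @#$%&*)". The string 'r2' contains a digit, so it fails the word pattern, and it starts with a letter, so it fails the number pattern; it therefore falls through to symbol.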

This is therefore expected behaviour. You can force it to be a word, if you don't want to use the auto-detection:

require 'treat'
include Treat::Core::DSL

# Build the entity explicitly as a word, bypassing auto-detection
w = word('r2')
w.stem
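
To see why 'r2' falls through to symbol, the definition quoted above can be approximated in plain Ruby. This is only a sketch and not Treat's actual implementation: the `classify` method, the constant names, and the punctuation set are hypothetical, and the hyphen in the word pattern is escaped here as an assumption about the intended character class. The number pattern is copied as quoted in the thread.

```ruby
# Hypothetical stand-in for the auto-detection rules quoted in the thread.
# None of these names come from Treat itself.
WORD_RE     = /^[[:alpha:]\-']+$/          # hyphen escaped (assumption)
NUMBER_RE   = /^#?([0-9]+)(.[0-9]+)?$/     # copied as quoted
ENCLITICS   = %w['ll 'm 're 's 't 've].freeze
PUNCT_CHARS = '@#$%&*'                     # only the examples given in the thread

def classify(token)
  # Enclitics are checked first because they would also match WORD_RE.
  return :enclitic    if ENCLITICS.include?(token)
  return :word        if token =~ WORD_RE
  return :number      if token =~ NUMBER_RE
  return :punctuation if token.each_char.all? { |c| PUNCT_CHARS.include?(c) }

  # Anything that matches none of the above is treated as a symbol.
  :symbol
end

%w[regarding r2 42].each do |token|
  puts "#{token}: #{classify(token)}"
end
```

Under these assumptions, 'regarding' is classified as a word and '42' as a number, while 'r2' matches neither pattern and ends up as a symbol, which is consistent with the error reported above.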