spencermountain / compromise

modest natural-language processing
http://compromise.cool
MIT License

Number detection can fail when the number starts the text #1124

Closed nikitar closed 3 months ago

nikitar commented 4 months ago

I'm running Compromise 14.13.0 and calling nlp(text).numbers(). On "£151" it detects the number fine, but on "£151 some other text" it returns an empty array. This seems to happen when the number is at the beginning of the sentence. Not all follow-up text triggers it, though: it succeeds for "£151 t" and fails for "£151 a".

Is that expected? I tried both in node and in chrome.
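For anyone who wants to reproduce this, here is a minimal sketch that runs the inputs above through numbers() and counts the matches. The countNumbers helper is hypothetical (it is not part of compromise); it takes the nlp function as an argument so the same check can be run against whichever version is installed.

```javascript
// Hypothetical helper, not part of compromise: counts how many numbers
// nlp(text).numbers() finds for each input string.
function countNumbers(nlp, inputs) {
  const results = {}
  for (const text of inputs) {
    // .json() returns an array with one entry per match in the view
    results[text] = nlp(text).numbers().json().length
  }
  return results
}

// Usage, assuming compromise is installed:
//   import nlp from 'compromise'
//   const inputs = ['£151', '£151 t', '£151 a', '£151 some other text']
//   console.log(countNumbers(nlp, inputs))
```

A count of 0 for an input that clearly contains a number points to the behaviour described above.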

spencermountain commented 4 months ago

hey nikitar - thanks, good catch. It looks like we've got a rule that's tagging too aggressively.

I'll remove it in the next version. For now you can do:

import nlp from 'compromise'
let doc = nlp('£151 a')
// re-tag any term that is both #Cardinal and #Expression as a #Value
doc.match('(#Cardinal && #Expression)').tag('Value')
doc.numbers().debug()

which should fix this up. cheers!

spencermountain commented 3 months ago

this is fixed in 14.14.0, cheers!