finnfiddle / words-to-numbers

JS library to convert textual words to numbers with optional fuzzy text matching
MIT License

Parser has issues with converting spelled out phrases containing #23

Closed guomo closed 6 years ago

guomo commented 6 years ago

If the string to convert contains a spelled-out number flanked by the word 'dot', the parser treats 'dot' as a decimal point and expects a number after the last 'dot'. I'm guessing that is the reason :-)

To Reproduce:

$ node
> const words2num = require('words-to-numbers').wordsToNumbers;
> words2num('Dot two Dot')
TypeError: Cannot set property 'end' of undefined
    at matchRegions (/Users/gstone/hacking/node/testing_ground/node_modules/words-to-numbers/dist/parser.js:169:29)
    at exports.default (/Users/gstone/hacking/node/testing_ground/node_modules/words-to-numbers/dist/parser.js:229:17)
    at wordsToNumbers (/Users/gstone/hacking/node/testing_ground/node_modules/words-to-numbers/dist/index.js:21:38)
    at repl:1:1
    at ContextifyScript.Script.runInThisContext (vm.js:44:33)
    at REPLServer.defaultEval (repl.js:239:29)
    at bound (domain.js:301:14)
    at REPLServer.runBound [as eval] (domain.js:314:12)
    at REPLServer.onLine (repl.js:433:10)
    at emitOne (events.js:120:20)

Expected Result: Dot 2 Dot

Note, replacing the last 'dot' with another word works fine.

> words2num('Dot two hello')
'Dot two hello'
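
For anyone stuck on a version that still throws on this input, one possible stopgap is to catch the exception and fall back to the original string. This is only a sketch: `safeConvert` is a hypothetical helper, not part of words-to-numbers, and it assumes that returning the untouched input is an acceptable fallback.

```javascript
// Hypothetical wrapper, not part of words-to-numbers: runs any converter
// function and returns the untouched input if the converter throws.
function safeConvert(convert, text) {
  try {
    return convert(text);
  } catch (err) {
    // Assumption: falling back to the original string is acceptable here.
    return text;
  }
}

// Usage with the library from this report (needs words-to-numbers installed):
// const { wordsToNumbers } = require('words-to-numbers');
// safeConvert(wordsToNumbers, 'Dot two Dot');
```

The converter is passed in as an argument so the wrapper does not depend on any particular version of the library.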

IP addresses: there seem to be other problem cases too. For example, consider IP addresses that were previously converted to words by a library that turns numbers into words, so 17.24.12.5 becomes 'seventeen dot twenty-four dot twelve dot five'. words-to-numbers then returns:

$ node
> const words2num = require('words-to-numbers').wordsToNumbers;
> words2num('seventeen dot twenty-four dot twelve dot five')
53.5

It looks like it ignored the earlier 'dots', added the numbers together (17 + 24 + 12 = 53), and treated only the last 'dot' as a decimal point. I get that in most cases this makes sense, but IP addresses should be special-cased. Perhaps an options flag, to avoid the performance hit in the 99% of cases where IP addresses won't be in the data set?
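
To illustrate the kind of special-casing meant here, below is a rough sketch that converts each 'dot'-separated segment on its own instead of parsing the whole string as one number. Everything in it is hypothetical: `smallWordsToNumber` and `dottedWordsToAddress` are made-up names, nothing here is the library's API, and the word parser is deliberately lenient and only handles 0 to 999.

```javascript
// Hypothetical lookup tables for a tiny 0-999 word parser.
const UNITS = new Map([
  ['zero', 0], ['one', 1], ['two', 2], ['three', 3], ['four', 4],
  ['five', 5], ['six', 6], ['seven', 7], ['eight', 8], ['nine', 9],
  ['ten', 10], ['eleven', 11], ['twelve', 12], ['thirteen', 13],
  ['fourteen', 14], ['fifteen', 15], ['sixteen', 16], ['seventeen', 17],
  ['eighteen', 18], ['nineteen', 19],
]);
const TENS = new Map([
  ['twenty', 20], ['thirty', 30], ['forty', 40], ['fifty', 50],
  ['sixty', 60], ['seventy', 70], ['eighty', 80], ['ninety', 90],
]);

// Converts a phrase such as "twenty-four" to 24, or returns null if any
// token is not a recognised number word. Lenient: it sums tokens without
// checking that their order is grammatical.
function smallWordsToNumber(phrase) {
  const tokens = phrase.toLowerCase().split(/[\s-]+/).filter(Boolean);
  let total = 0;
  let matched = 0;
  for (const token of tokens) {
    if (token === 'and') continue;
    if (token === 'hundred') total = (total || 1) * 100;
    else if (UNITS.has(token)) total += UNITS.get(token);
    else if (TENS.has(token)) total += TENS.get(token);
    else return null;
    matched += 1;
  }
  return matched > 0 ? total : null;
}

// Splits on the standalone word "dot" and converts each segment separately.
// Returns a dotted string, or null if the input is not a dotted sequence.
function dottedWordsToAddress(text) {
  const segments = text.split(/\s+dot\s+/i);
  if (segments.length < 2) return null;
  const numbers = segments.map(smallWordsToNumber);
  return numbers.includes(null) ? null : numbers.join('.');
}

console.log(dottedWordsToAddress('seventeen dot twenty-four dot twelve dot five'));
// prints "17.24.12.5"
```

A caller could try this first and only hand the string to the normal parser when it returns null, so the extra work would sit behind an opt-in flag.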

finnfiddle commented 6 years ago

Fixed in v1.5.0

However, I have chosen not to support tens in decimal places yet. E.g. seventeen dot twenty-four becomes seventeen dot two four. If you would like this feature, please open another ticket.