spencermountain / compromise

modest natural-language processing
http://compromise.cool
MIT License
11.49k stars 655 forks source link

`replaceWith('')` call with a capture group at beginning of match throws error #1150

Closed Fdawgs closed 1 month ago

Fdawgs commented 1 month ago

Node version: 20.17.0 Compromise version: 14.14.1

From looking through the commits to 14.14.1 this may have been introduced with #1146?

When calling match().replaceWith('') after calling it once already with a capture group at the beginning causes it to throw:

TypeError: Cannot read properties of undefined (reading 'length')
    at T.replaceWith (/home/fsmith/application/node_modules/compromise/builds/three/compromise-three.cjs:1:11511)
    at Object.<anonymous> (/home/fsmith/application/tests/scratch/feat-18244.js:10:23)
    at Module._compile (node:internal/modules/cjs/loader:1469:14)
    at Module._extensions..js (node:internal/modules/cjs/loader:1548:10)
    at Module.load (node:internal/modules/cjs/loader:1288:32)
    at Module._load (node:internal/modules/cjs/loader:1104:12)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:174:12)
    at node:internal/main/run_main_module:28:49

Example:

'use strict'

/** @type {import('compromise').default} */
const nlp = require('compromise')

const text = 'to the window'

// Throws an error, capturing group at beginning, replacing with an empty string
try {
  const doc1 = nlp(text)
  doc1.match('[to] the window', 0).replaceWith('by')
  doc1.match('[by]', 0).replaceWith('')
} catch (e) {
  console.error('doc1', e)
}

// Works, capturing group at beginning, replacing with a non-empty string
const doc2 = nlp(text)
doc2.match('[to] the window', 0).replaceWith('near')
doc2.match('[near]', 0).replaceWith('by')
console.log('doc2', doc2.text())

// Works, capture group at end, replacing with an empty string
const doc3 = nlp(text)
doc3.match('to the [window]', 0).replaceWith('wall')
doc3.match('[wall]', 0).replaceWith('')
console.log('doc3', doc3.text())
spencermountain commented 1 month ago

Agh, rats. Will take a look, thanks.

On Wed, Oct 9, 2024 at 10:38 AM Frazer Smith @.***> wrote:

Node version: 20.17.0 Compromise version: 14.14.1

From looking through the commits to 14.14.1 this may have been introduced with #1146 https://github.com/spencermountain/compromise/pull/1146?

When calling match().replaceWith('') after calling it once already causes it to throw:

TypeError: Cannot read properties of undefined (reading 'length') at T.replaceWith (/home/fsmith/mbi.luna.rova/node_modules/compromise/builds/three/compromise-three.cjs:1:11511) at Object. (/home/fsmith/mbi.luna.rova/tests/scratch/feat-18244.js:10:23) at Module._compile (node:internal/modules/cjs/loader:1469:14) at Module._extensions..js (node:internal/modules/cjs/loader:1548:10) at Module.load (node:internal/modules/cjs/loader:1288:32) at Module._load (node:internal/modules/cjs/loader:1104:12) at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:174:12) at node:internal/main/run_main_module:28:49

Example:

'use strict' /* @type {import('compromise').default} /const nlp = require('compromise') const text = 'to the window' // Throws an error, capturing group at beginning, replacing with an empty stringconst doc1 = nlp(text)doc1.match('[to] the window', 0).replaceWith('by')doc1.match('[by]', 0).replaceWith('') // Works, capturing group at beginning, replacing with a non-empty stringconst doc2 = nlp(text)doc2.match('to the [window]', 0).replaceWith('wall')doc2.match('[wall]', 0).replaceWith('door') // Works, capture group at end, replacing with an empty stringconst doc3 = nlp(text)doc3.match('to the [window]', 0).replaceWith('wall')doc3.match('[wall]', 0).replaceWith('')

— Reply to this email directly, view it on GitHub https://github.com/spencermountain/compromise/issues/1150, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADBSKNWBHTP3XESFYHMM4DZ2U5WTAVCNFSM6AAAAABPUWNX5WVHI2DSMVQWIX3LMV43ASLTON2WKOZSGU3TMMJTGQ4DGMI . You are receiving this because you are subscribed to this thread.Message ID: @.***>

Fdawgs commented 1 month ago

Thanks @spencermountain! Have updated the example to make it easier to see.

spencermountain commented 1 month ago

got a fix for the runtime error released as 14.14.2 thanks for the heads-up, let me know if you see anything else cheers

Fdawgs commented 1 month ago

Thanks Spencer, that resolved it!

Fdawgs commented 2 weeks ago

got a fix for the runtime error released as 14.14.2 thanks for the heads-up, let me know if you see anything else cheers

Spotted another one as part of this, in the example above change const text = 'to the window' to const text = '- to the window' and it will throw a runtime error as well. Any sort of punctuation mark at the beginning will cause the error to be thrown.

This is occuring in 14.14.2.

spencermountain commented 1 hour ago

thanks - will take a look at this on the weekend cheers