spencermountain / compromise

modest natural-language processing
http://compromise.cool
MIT License
11.49k stars 655 forks

"Here's", "There's", and "Where's" not being expanded #1127

Closed (roschler closed 4 months ago)

roschler commented 4 months ago

Usually the compromise library is quite good at detecting contractions and creating an extra term for each one, which I call a "pure explicit". For example, the term for "he's" is followed in the terms array by a term for "is". But for some reason this does not happen with "here's", "there's", and "where's". I expected the library to automatically add a following term with the word "is" as its implicit or machine text. Should I expect this?

Note, if you do add support for these words, the library will need to check the subsequent words to see if "is" should be generated, or if "has" should be generated.
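To make the "is" versus "has" point concrete, here is a rough sketch of the kind of look-ahead check I mean. It is a hypothetical helper, not compromise's actual logic: the name `expandApostropheS`, the two word lists, and the three-word look-ahead window are all assumptions chosen for illustration.

```javascript
// Hypothetical helper, NOT part of compromise: guess whether a trailing "'s"
// in "here's" / "there's" / "where's" stands for "is" or "has".
// Assumption: a small hand-picked participle list is enough for a sketch.
const PARTICIPLES = new Set(['been', 'gone', 'got', 'gotten', 'done', 'had', 'seen', 'taken'])
// Assumption: words that may sit between the contraction and the participle.
const SKIPPABLE = new Set(['he', 'she', 'it', 'everyone', 'everybody', 'never', 'always', 'already', 'just'])

function expandApostropheS(nextWords) {
  // Look at most three words ahead, e.g. "where's he been" -> "has"
  for (const raw of nextWords.slice(0, 3)) {
    const word = raw.toLowerCase().replace(/[^a-z]/g, '')
    if (PARTICIPLES.has(word)) return 'has'
    // Stop at the first word that is neither a participle nor skippable
    if (!SKIPPABLE.has(word)) break
  }
  return 'is'
}

console.log(expandApostropheS(['johnny.']))                 // "here's johnny."
console.log(expandApostropheS(['been', 'a', 'mistake']))    // "there's been a mistake"
console.log(expandApostropheS(['he', 'been']))              // "where's he been"
```

A real implementation would presumably use the tagger's verb tags rather than a fixed word list, but the look-ahead idea is the same.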

spencermountain commented 4 months ago

hey Robert, thanks for the heads-up. this seems to work for me:

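import nlp from 'compromise'

// three contractions, one per sentence
const doc = nlp(`here's johnny. there's a catch. Where's my hat?`)
doc.debug() // prints each term with its tags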

  ┌─────────
  │ '[here]'   - Noun, Uncountable
  │ '[is]'     - Verb, Copula, PresentTense
  │ 'johnny'   - Noun, Singular, Person, FirstName, ProperNoun, MaleName

  ┌─────────
  │ '[there]'  - There
  │ '[is]'     - Verb, Copula, PresentTense
  │ 'a'        - Determiner
  │ 'catch'    - Noun, Singular

  ┌─────────
  │ '[where]'  - QuestionWord
  │ '[is]'     - Verb, Copula, PresentTense
  │ 'my'       - Noun, Possessive
  │ 'hat'      - Noun, Singular

can you help me reproduce the cases where the contraction is missing? thanks, cheers

roschler commented 4 months ago

Hi Spencer,

I am afraid we have an "I just saw a ghost" phenomenon here. Upon retesting, I don't see it anymore. Perhaps some mistake I made earlier caused it or, worse, it was plain "user error". My apologies.