adhearsion / ruby_speech

A ruby library for TTS & ASR document preparation
MIT License
Feature/more performant number grammar #39

Closed sfgeorge closed 7 years ago

sfgeorge commented 7 years ago

Make the number DTMF grammar more performant for LumenVox. Note that the more we re-use rules via rulerefs, the fewer cycles LumenVox has to spend building its FST.

Opening this for peer review along with the open source community.

sfgeorge commented 7 years ago

@gfaza Can you review this for me?

coveralls commented 7 years ago

Coverage Status

Coverage decreased (-0.4%) to 98.339% when pulling 1cbe3f7534366f27247cea542d227b9f2e94982f on sfgeorge:feature/more-performant-number-grammar into 0c4f434e3ff99a764987869630624872cac39822 on benlangfeld:develop.

benlangfeld commented 7 years ago

Does LumenVox not have their own pre-compiled built-in grammar for this? These grammars were intended for use with Punchblock's recogniser.

sfgeorge commented 7 years ago

tl;dr Here's an attempt at a rationale for this PR. Let me know whether or not this sounds reasonable, and if so I'm happy to make the little tweaks to make hound-ci happy. Thanks man!

Does LumenVox not have their own pre-compiled built-in grammar for this? These grammars were intended for use with Punchblock's recogniser.

Good question there. LumenVox does have a builtin for this, but their builtins are compiled and cached on demand, and are optimized no more than any custom grammar that you pass in (this is good). So one's custom grammars are optimized and cached with the same priority as their own builtins, which is nice.

One benefit of using a ruby_speech number DTMF grammar is that customers who sometimes use ASR and sometimes use DTMF-only (such as myself) get a consistent number DTMF grammar in both cases.

Additionally, having more control over your grammars, such as when you use ruby_speech, is especially helpful when there are issues with the builtins provided by vendors. Due to performance problems I encountered, I can share that LumenVox support has recommended that I stop using their builtin number grammar. They encourage customers to use finite sequences rather than infinite ones whenever possible. Their builtin number grammar itself is currently infinite and can cause trouble. Now, this PR continues to support an infinite series for a number question, but there are 2 reasons why this is still helpful:

  1. The key component of this PR is that the "digit_series" ruleref is re-used. One might think that the level of abstraction added by rulerefs would slow down a parser. In the case of LumenVox, it is quite the opposite: the parser that generates an FST (Finite State Transducer / Finite State Automaton) only has to analyze a given ruleref once, and subsequent references to that same ruleref are recognized to be the same, preventing the state tree from growing even further. This change alone makes the number grammar perform much better under LumenVox, all without changing
  2. The API for RubySpeech::GRXML::Builtins.number and its use in Punchblock have superb support for passing in future options such as maxlength, much like how RubySpeech::GRXML::Builtins.decimal already supports maxlength/minlength. It would be straightforward in the future to mitigate parser issues by adding options to specify a finite limit on the integer length and decimal length of a number grammar. The ideal length limit will likely vary widely per customer or application, which is all the more reason why an API-powered grammar builder (ahem, ruby_speech) is better suited to solving this problem than a global and untouchable builtin.
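To make point 1 concrete, here is a toy model in plain Ruby. It is only a sketch of the general idea and says nothing about how LumenVox is actually implemented: the `GRAMMAR` hash, the `analyze` method, and the visit counters are all invented for illustration. Rules are lists of parts, a Symbol part stands in for a ruleref, and an optional cache plays the role of "recognize that this ruleref was already analyzed".

```ruby
# Toy model only (hypothetical names, not LumenVox or ruby_speech code).
# Each rule is a list of parts; a Symbol part stands in for a ruleref.
GRAMMAR = {
  number:       [:digit_series, '*', :digit_series], # '*' as a decimal point is an assumption
  digit_series: [:digit],
  digit:        %w[0 1 2 3 4 5 6 7 8 9]
}.freeze

# Walks a rule and returns a rough "size" for it, recording in `visits`
# how many times each rule body had to be analyzed. When a cache hash is
# given, a rule that was already analyzed is reused instead of re-walked.
def analyze(grammar, rule, visits, cache = nil)
  return cache[rule] if cache&.key?(rule)

  visits[rule] += 1
  size = grammar.fetch(rule).sum do |part|
    part.is_a?(Symbol) ? analyze(grammar, part, visits, cache) : 1
  end
  cache[rule] = size if cache
  size
end

# Without sharing: digit_series and digit are each analyzed twice.
naive_visits = Hash.new(0)
analyze(GRAMMAR, :number, naive_visits)

# With sharing: every rule is analyzed exactly once.
shared_visits = Hash.new(0)
analyze(GRAMMAR, :number, shared_visits, {})
```

In this model the second reference to `digit_series` costs a single cache lookup, which is the effect described above: the amount of analysis grows with the number of distinct rules rather than with the number of references to them.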

To be fair, LumenVox is not the only vendor who recommends/steers toward a finite limit on a number sequence. As seen in the Voxeo docs:

number - Specifies a voice or DTMF input grammar that recognizes numeric input. For Prophecy ASR, the maximum length of detectable digits is 16 when using this built-in grammar.
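For a sense of what a finite limit could look like, here is a minimal sketch that emits GRXML text with bounded `repeat` ranges. It uses only plain Ruby string building. `finite_number_grxml` and its two keyword arguments are hypothetical names, not part of ruby_speech or of this PR, and using `*` as the decimal point is an assumption.

```ruby
# Hypothetical helper (not part of ruby_speech): returns a DTMF number
# grammar as GRXML text, with finite bounds on both runs of digits.
def finite_number_grxml(max_integer_digits: 16, max_fraction_digits: 4)
  [max_integer_digits, max_fraction_digits].each do |limit|
    unless limit.is_a?(Integer) && limit.positive?
      raise ArgumentError, 'limits must be positive integers'
    end
  end

  <<~GRXML
    <grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0" mode="dtmf" root="number">
      <rule id="number" scope="public">
        <item repeat="1-#{max_integer_digits}"><ruleref uri="#digit"/></item>
        <item repeat="0-1">
          *
          <item repeat="1-#{max_fraction_digits}"><ruleref uri="#digit"/></item>
        </item>
      </rule>
      <rule id="digit">
        <one-of>
          #{(0..9).map { |d| "<item>#{d}</item>" }.join}
        </one-of>
      </rule>
    </grammar>
  GRXML
end

# Example: at most 16 integer digits and 2 fractional digits.
grxml = finite_number_grxml(max_integer_digits: 16, max_fraction_digits: 2)
```

The point of the sketch is only that the bounds become per-call parameters, so each application can pick the limit that suits it instead of inheriting one fixed by a vendor builtin.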

Side note: Rebasing just to make sure that tests still pass following the nokogiri 1.7.0 release. No recent changes to this PR.

coveralls commented 7 years ago

Coverage Status

Coverage decreased (-0.4%) to 98.339% when pulling 47d2ee741c2973bfec17447cbaf0b19d242ecf32 on sfgeorge:feature/more-performant-number-grammar into 4586e30531a4d1bb01e4f4eaa21c0e8c945c942f on benlangfeld:develop.

coveralls commented 7 years ago

Coverage Status

Coverage decreased (-0.4%) to 98.339% when pulling 6dddddb65a740bda7020336eb9954d0ea069429f on sfgeorge:feature/more-performant-number-grammar into 4586e30531a4d1bb01e4f4eaa21c0e8c945c942f on benlangfeld:develop.

coveralls commented 7 years ago

Coverage Status

Coverage decreased (-0.4%) to 98.339% when pulling a6e5e181a012e863c2ff862a6c28f478fb3e6a35 on sfgeorge:feature/more-performant-number-grammar into 4586e30531a4d1bb01e4f4eaa21c0e8c945c942f on benlangfeld:develop.

coveralls commented 7 years ago

Coverage Status

Coverage decreased (-0.4%) to 98.339% when pulling a8c9ca1b728e5156e0567c2bb8539f6a0e389c75 on sfgeorge:feature/more-performant-number-grammar into 4586e30531a4d1bb01e4f4eaa21c0e8c945c942f on benlangfeld:develop.

coveralls commented 7 years ago

Coverage Status

Coverage decreased (-0.4%) to 98.341% when pulling 6cfe028f0f5642a19ab64da03d71278268885b89 on sfgeorge:feature/more-performant-number-grammar into 4586e30531a4d1bb01e4f4eaa21c0e8c945c942f on benlangfeld:develop.

sfgeorge commented 7 years ago

Rebasing in order to get an accurate code coverage delta.

coveralls commented 7 years ago

Coverage Status

Coverage decreased (-0.8%) to 98.099% when pulling 1d2ceb353541d3dd7c98f8c0820bc20c11384218 on sfgeorge:feature/more-performant-number-grammar into 64a843109202249cf77bdba24e531067f691de9e on adhearsion:develop.

coveralls commented 7 years ago

Coverage Status

Coverage decreased (-0.2%) to 98.229% when pulling 5e0a208fa28403891eeaf1903d8d348154465ace on sfgeorge:feature/more-performant-number-grammar into faa625e6d45227e2c97ac08216e812975693412e on adhearsion:develop.

coveralls commented 7 years ago

Coverage Status

Coverage decreased (-0.3%) to 98.099% when pulling 5e0a208fa28403891eeaf1903d8d348154465ace on sfgeorge:feature/more-performant-number-grammar into faa625e6d45227e2c97ac08216e812975693412e on adhearsion:develop.

coveralls commented 7 years ago

Coverage Status

Coverage decreased (-0.2%) to 98.229% when pulling fafa9835ce856376cca4e9b931719aa392048c4b on sfgeorge:feature/more-performant-number-grammar into faa625e6d45227e2c97ac08216e812975693412e on adhearsion:develop.

coveralls commented 7 years ago

Coverage Status

Coverage increased (+0.8%) to 99.143% when pulling 0830674474f29a45cd9fe8332fcf6750f2e398e0 on sfgeorge:feature/more-performant-number-grammar into faa625e6d45227e2c97ac08216e812975693412e on adhearsion:develop.

sfgeorge commented 7 years ago

@benlangfeld What are your thoughts on this PR? Does my reasoning sound legit? 😁

sfgeorge commented 7 years ago

@benlangfeld Is it okay to merge this?

coveralls commented 7 years ago

Coverage Status

Coverage increased (+0.6%) to 99.032% when pulling 18d054e13b4398932448e483768a4565e4bf0a00 on sfgeorge:feature/more-performant-number-grammar into e2fc590d6bfd8dbdf96a8833b077f88700aba926 on adhearsion:develop.

coveralls commented 7 years ago

Coverage Status

Coverage increased (+0.6%) to 99.088% when pulling ddeafffb74d022d4bfb97b634b0b124c65fa5bac on sfgeorge:feature/more-performant-number-grammar into 0a85675bec3f7c0ab7752d1cc8b758e06f29a193 on adhearsion:develop.

sfgeorge commented 7 years ago

Thank you!