t4ngo / dragonfly

ARCHIVED! - Speech recognition framework allowing powerful Python-based scripting and extension of Dragon NaturallySpeaking (DNS) and Windows Speech Recognition (WSR)
GNU Lesser General Public License v3.0

Feature suggestion: allow for different phrasing of numbers #58

Closed kendonB closed 7 years ago

kendonB commented 7 years ago

For example, when using Caster I am unable to say "numb one thirty five" and have to say "numb one hundred and thirty five" instead. For six-digit numbers, it is much simpler to dictate the individual digits.

ref https://github.com/synkarius/caster/issues/174

nihlaeth commented 7 years ago

Isn't it trivial to create a grammar rule that allows this?

chilimangoes commented 7 years ago

If there is a trivial way to write a grammar for this, I haven't been able to find it. I think a big part of the problem is the ambiguity in how (at least for English speakers) we say long numbers.

For example, say I have the number 304027. The "correct" way of saying this would be "three hundred and four thousand and twenty seven". It also seems to be the only way to say 304027 and have it be recognized as a single number in a dragonfly grammar. But, in English, we might say something like "three oh four oh two seven" or "thirty forty twenty seven". In dictation mode, Dragon NaturallySpeaking does its best to try to figure out what you meant, using some kind of heuristics. But those same heuristics don't seem to be applied to numbers used in dragonfly grammars.

I spent a fair amount of time at one point last year trying to modify my copy of the numbers.py grammar in Caster to support this. I created a rule definition that was something like "numb [] [] [] []" which worked most of the time but also produced the wrong results fairly often. A couple of the failure modes that I remember were, for example, if you said "thirty forty twenty seven", it might be interpreted as either "30 40 27" or "30 40 20 7" and if you said "two hundred fifteen", it might be interpreted as "215", "200 15", "2 115", or "2 100 15". I remember I tried a variety of things to get around these ambiguity issues, and I was able to resolve some of them, but in the end it was more trouble than it was worth.
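
The ambiguity described above can be reproduced outside of any speech engine. The following is a minimal pure-Python sketch, not the Caster or dragonfly code: it assumes a deliberately loose toy grammar (numbers below 1000, with "hundred" allowed to stand alone for 100), and the names `below_hundred`, `single_number` and `readings` are hypothetical. It lists every way a phrase can be split into separate numbers.

```python
UNITS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
         "six": 6, "seven": 7, "eight": 8, "nine": 9}
TEENS = {"ten": 10, "eleven": 11, "twelve": 12, "thirteen": 13,
         "fourteen": 14, "fifteen": 15, "sixteen": 16, "seventeen": 17,
         "eighteen": 18, "nineteen": 19}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}


def below_hundred(words):
    """Return the value if `words` is one number below 100, else None."""
    if len(words) == 1:
        for table in (UNITS, TEENS, TENS):
            if words[0] in table:
                return table[words[0]]
        return None
    if len(words) == 2 and words[0] in TENS and words[1] in UNITS:
        return TENS[words[0]] + UNITS[words[1]]
    return None


def single_number(words):
    """Return the value if `words` forms exactly one number, else None."""
    if "hundred" in words:
        i = words.index("hundred")
        head, tail = words[:i], words[i + 1:]
        if not head:
            hundreds = 1  # toy assumption: bare "hundred" means 100
        elif len(head) == 1 and head[0] in UNITS:
            hundreds = UNITS[head[0]]
        else:
            return None
        rest = 0
        if tail:
            rest = below_hundred(tail)
            if rest is None:
                return None
        return hundreds * 100 + rest
    return below_hundred(words)


def readings(phrase):
    """List every split of the phrase into consecutive valid numbers."""
    words = phrase.split()

    def split(start):
        if start == len(words):
            return [[]]
        found = []
        for end in range(start + 1, len(words) + 1):
            value = single_number(words[start:end])
            if value is not None:
                for rest in split(end):
                    found.append([value] + rest)
        return found

    return split(0)


for reading in readings("two hundred fifteen"):
    print(reading)
```

Under these assumptions, "two hundred fifteen" yields four readings and "thirty forty twenty seven" yields two, which is why a rule built from several optional number slots cannot pick one reliably without extra heuristics.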

@kendonB I think what I might do is just create a rule that allows you to quickly string together a list of single digits 0-9, like the "three oh four oh two seven" example. It's not as flexible, and won't allow you to say things like "thirty forty twenty seven", but it should at least resolve the problem of ambiguity.
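
As a rough illustration of why the digits-only approach avoids ambiguity, here is a small pure-Python sketch. It is not a dragonfly grammar; the `DIGIT_WORDS` table and the `digits_to_text` helper are hypothetical names, and treating "oh" as zero is an assumption taken from the example above. Each spoken word maps to exactly one digit, so there is only ever one reading.

```python
# Each spoken word corresponds to exactly one digit character.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}


def digits_to_text(phrase):
    """Turn a phrase of spoken digits into a digit string.

    Raises KeyError if a word is not a known digit word.
    """
    return "".join(DIGIT_WORDS[word] for word in phrase.lower().split())


print(digits_to_text("three oh four oh two seven"))
```

The result is kept as a string rather than an int so that leading zeros survive.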

nihlaeth commented 7 years ago

Let DNS handle number complexity internally; it does that quite well. You can just use an IntegerRef with an alternative digit sequence: <number> | (<digit>)+

Edit: just realised I only covered the last case mentioned by OP.

synkarius commented 7 years ago

Where did that + syntax come from? Does that actually work? I've never seen it before.

Edit: I have seen it before. It was discussed here: https://github.com/t4ngo/dragonfly/issues/15 @t4ngo said it couldn't be done due to limitations in DNS.

nihlaeth commented 7 years ago

It came from the natlink source code. It's the same as using a Repetition element, and I used it as a lazy way to indicate that you should use one. I don't know if dragonfly can handle that syntax, but natlink can in any case.

synkarius commented 7 years ago

@nihlaeth I'm looking at the NatLinkTalk PowerPoint (which, along with the VoiceCoders PowerPoint and natlink.txt, are the best NatLink docs that I am aware of). On page 29 of the NatLinkTalk PowerPoint, I do see that + syntax you're referring to, but as I'm reading the docs, it can only be used to design a spec, not to influence the execution of an action.

IOW, I can create a spec that accepts either "one two three" or "one two two two two two three", but both utterances will cause the same action to execute. Am I reading this correctly? If not, would you happen to have an example in which the execution of the action is influenced by the repetition?

nihlaeth commented 7 years ago

You could have the rule introspect the raw dictation data and act differently according to the number of twos in there. But you'd have to do the parsing yourself; it's a lot easier to use Repetition for that.
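
A minimal sketch of that "parse it yourself" idea, assuming the recognized utterance is available to the rule as a plain list of word strings (how you obtain that list from NatLink or dragonfly is not shown here). The helpers `count_word` and `describe_action` are hypothetical names used only for illustration.

```python
def count_word(words, target):
    """Count how many times `target` occurs in the recognized words."""
    return sum(1 for word in words if word == target)


def describe_action(words):
    """Pick a different behaviour depending on how often "two" was said."""
    twos = count_word(words, "two")
    if twos <= 1:
        return "run once"
    return "run %d times" % twos


heard = "one two two two two two three".split()
print(describe_action(heard))
```

This works for a single repeated word, but it quickly becomes tedious for anything richer, which is the argument for letting a Repetition element do the counting.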

nihlaeth commented 7 years ago

I have a fork of natlink online (I plan to rewrite parts of it), you might try reading the Python code, instead of relying on incomplete PowerPoints. There's lots of interesting stuff in there: https://github.com/nihlaeth/dragon_whisperer/blob/master/MacroSystem/core/gramparser.py

synkarius commented 7 years ago

Ah! Yes, introspecting the raw dictation data would work. I hadn't thought of that. So really, all you need are two commands, one that lets you repeat numbers however many times, and one that inspects the dictation data. You make them chainable (via either CCR in Dragonfly or dropping into raw NatLink for access to that + operator), and then parse it yourself. The parsing would be tedious for more complicated use cases, but for numbers it wouldn't be terrible. That's a good idea.

I have periodically skimmed the NatLink source, but never examined it in detail. I'm very glad to see you're looking to update it to 3.6!

synkarius commented 7 years ago

@kendonB my discussion with @nihlaeth is more academic than practical. I suggest you take @chilimangoes's suggestion and just make the 0-9 command. You'd get a lot of mileage out of that for the effort comparable to parsing the raw dictation data even if the latter would cover most if not all of @chilimangoes's aforementioned examples.

nihlaeth commented 7 years ago

I whipped up a quick example. I didn't test it, but it should get the point across.

from dragonfly import CompoundRule, Repetition, IntegerRef, Text, Choice

class NumberRule(CompoundRule):

    """Numbers in all possible forms."""

    def __init__(self, *args, **kwargs):
        self.spec = "number (<number>|<digits>)"
        self.extras = [
            IntegerRef(name="number", min=0, max=100000),
            Repetition(
                name='digits',
                child=Choice(name='digit', choices={
                    'zero': 0,
                    'one': 1,
                    'two': 2,
                    'three': 3,
                    'four': 4,
                    'five': 5,
                    'six': 6,
                    'seven': 7,
                    'eight': 8,
                    'nine': 9}),
                min=1,
                max=12)
            ]
        CompoundRule.__init__(self, *args, **kwargs)

    def _process_recognition(self, node, extras):
        # `extras` maps the name of each recognized extra to its value:
        # an int for "number", or a list of ints for the "digits" repetition.
        if 'number' in extras:
            text = str(extras['number'])
        else:
            text = ''.join(str(digit) for digit in extras['digits'])
        # Text expects a string, so the values are converted before typing.
        Text(text).execute()

t4ngo commented 7 years ago

Take a look at the dragonfly.language.Number and NumberRef classes: https://github.com/t4ngo/dragonfly/blob/master/dragonfly/language/base/number.py

That Number class wraps some integer building blocks, allowing a long number to be spoken as "twelve thirty-four five sixty-seven" giving 1234567. The underlying parsing tries to contract the numbers as far as possible, e.g. "thirty four" will always result in 34 instead of 304.
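
To make the "contract as far as possible" behaviour concrete, here is a pure-Python sketch of the idea. It is not dragonfly's implementation and does not use the Number class; the greedy strategy shown and the name `contract` are assumptions made for illustration. Adjacent words are merged into the largest two-digit group they can form, and the groups are then concatenated.

```python
UNITS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
         "six": 6, "seven": 7, "eight": 8, "nine": 9}
TEENS = {"ten": 10, "eleven": 11, "twelve": 12, "thirteen": 13,
         "fourteen": 14, "fifteen": 15, "sixteen": 16, "seventeen": 17,
         "eighteen": 18, "nineteen": 19}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}
SMALL = dict(UNITS, zero=0, **TEENS)  # every single-word value below 20


def contract(phrase):
    """Greedily group spoken number words, then join the groups' digits."""
    words = phrase.lower().replace("-", " ").split()
    parts = []
    i = 0
    while i < len(words):
        word = words[i]
        following = words[i + 1] if i + 1 < len(words) else None
        if word in TENS and following in UNITS:
            # "thirty four" is always contracted to 34, never 30 then 4.
            parts.append(TENS[word] + UNITS[following])
            i += 2
        elif word in TENS:
            parts.append(TENS[word])
            i += 1
        elif word in SMALL:
            parts.append(SMALL[word])
            i += 1
        else:
            raise ValueError("not a number word: %r" % word)
    return int("".join(str(part) for part in parts))


print(contract("twelve thirty-four five sixty-seven"))
```

The trade-off is the one described above: because contraction always wins, there is no way to say "thirty four" and get 304 with this scheme.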

Dragonfly's built-in Number is similar in concept to @nihlaeth's NumberRule above, but it's an element instead of a rule (so easier to reuse) and it is language agnostic (i.e. automatically works for English, German, Dutch, etc. because it uses underlying Integer elements without hard coding words).

In this vein, you might also be interested in the dragonfly.language.en.calendar.Date class which lets you say things like "October 12, 2016", "21 February 2017", "14 days ago", "tomorrow", and "next week Friday". It's the same approach as the Number class, wrapping various human language constructs into an easy-to-use element. Useful for manipulating calendars and such.

kendonB commented 7 years ago

Is someone able to provide a noob-friendly implementation of this? Specifically, what should I change in numbers.py in Caster? An example of either or both solutions (@nihlaeth's rule or dragonfly.language.Number) would be helpful.

chilimangoes commented 7 years ago

I won't be in front of a computer for a while to be able to test this, but I think it might be as simple as adding from dragonfly.language import Number, NumberRef in numbers.py and then changing the line IntegerRefST("wnKK", 0, 1000000) to use either Number or NumberRef instead of IntegerRefST. If that doesn't work, post your results and any errors you get back on the original issue you opened in the Caster repo and I'll try to help you there.