dictation-toolbox / Caster

Dragonfly-Based Voice Programming and Accessibility Toolkit
https://dictation-toolbox.github.io/Caster/

Text manipulation commands: bug reports, features suggestions, and other discussion #579

Open alexboche opened 5 years ago

alexboche commented 5 years ago

Please report bugs, suggest changes, and discuss other issues relating to the new text manipulation commands that were just merged in PR #485.

Check the known issues/bugs here first before reporting a new bug.

Contributions would be welcome in a number of areas; the link above also lists desired features. Somebody who understands regular expressions well could solve a few of the bugs easily (see the link above). Please consider discussing here before making a pull request.

kendonB commented 5 years ago

In Word, starting here: [image], and saying "go sauce before equals" ends up here: [image].

kendonB commented 5 years ago

Could we use Ctrl-Right/Ctrl-Left for cursor movements to speed things up? The rules for which types of characters get skipped or stopped at seem the same in most editors. We could also (in principle) use the up/down arrows in fixed-width editors without line wrapping by detecting end-of-line characters, I think.

alexboche commented 5 years ago

Sometimes people will want to search for more than just one letter because there are a lot of occurrences of, say, "a", so one might want to say something like "go lease sierra mike". I tried implementing that here by making the choice object the set of sequences up to length 2 of the characters (alphabet and punctuation). There are two problems with this, both of which would be solved if these commands were not CCR (though I think that making these CCR has a lot of advantages). First, I encountered the grammar complexity error even with sequences of length 2. If we made the commands non-CCR, I'm guessing we could get up to sequences of length 4 without hitting the grammar complexity error. Second, there is ambiguity about whether "go lease sierra mike" means go to the nearest occurrence of "sm", or go to the nearest occurrence of "m" and then press s. Both interpretations have situations in which they would be desirable.
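To give a feel for why even short sequences can strain the grammar, here is a minimal sketch of how such a choice mapping grows. The table `SPOKEN_CHARACTERS` and both function names are hypothetical stand-ins and are not Caster's actual code; the real table covers the whole alphabet plus punctuation.

```python
from itertools import product

# Hypothetical, heavily shortened spoken-form table (assumption for illustration).
SPOKEN_CHARACTERS = {
    "arch": "a",
    "sierra": "s",
    "mike": "m",
    "comma": ",",
}


def sequence_choices(spoken, max_length):
    """Map every spoken sequence of 1..max_length characters to its text."""
    choices = {}
    for length in range(1, max_length + 1):
        for words in product(spoken, repeat=length):
            choices[" ".join(words)] = "".join(spoken[w] for w in words)
    return choices


def choice_count(alphabet_size, max_length):
    """Number of entries such a mapping has for a given alphabet size."""
    return sum(alphabet_size ** n for n in range(1, max_length + 1))
```

With an assumed table of 50 characters, `choice_count(50, 2)` gives 2,550 entries and `choice_count(50, 4)` gives over six million, so the mapping grows geometrically with the maximum sequence length.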

kendonB commented 5 years ago

What if we had another word before these so it knows to look for length 2 characters? Like "go sauce 2 double mike sierra" would search for "ms" and "go sauce 2 mike sierra" would search for "m" then type "s"?

alexboche commented 5 years ago

> To improve the speed, could we use Ctrl-Right/Ctrl-Left for cursor movements to speed things up? The rules for what types of characters get skipped or stopped at seem the same in most editors. We could also (in principle) use the up/down arrows for fixed width editors without line wrapping by detecting end line characters, I think.

Good idea. I don't know when I'll have time to implement that. Would anybody else have time soon?

kendonB commented 5 years ago

I found differences in Ctrl-Right/Left behavior across Notepad++, GitHub on Chrome, and TexStudio (all 3 different). It's a small sample, but it seems like there will be a lot of edge cases to deal with.

alexboche commented 5 years ago

In the meantime, people could try turning down the keypress wait time in settings.toml, though that doesn't seem to be the bottleneck in some places (e.g. in Gitter chat).

kendonB commented 5 years ago

I did some quick profiling and found that whenever there's a decent lag it's because of the navigation using the arrow keys.

LexiconCode commented 5 years ago

@alexboche mentioned he was going to be busy for the next few weeks. Therefore I might unpin this issue if no one has a preference.

kendonB commented 5 years ago

It would be neat to be able to integrate these with aliases. Imagine you've got a piece of code with a particular phrase/variable that you have an alias for. For example in the thing I'm currently writing, I have aliased g_{C} with "GC". It would be nice to be able to say "go lease before GC".

alexboche commented 5 years ago

> I found differences in Ctrl-Right/Left behavior across Notepad++, GitHub on Chrome, and TexStudio (all 3 different). Small sample but it seems like there will be a lot of edge cases to deal with

Perhaps this could be implemented just for when the target object is dictation rather than characters. If everything is just words, then it should be pretty simple: basically one just counts the number of spaces (with possibly a few exceptions). Characters (e.g. in a programming context) would be much more complicated. I'm not sure how to let the function know that the target object is dictation rather than characters, but there must be a way. In fact, that is something we should be doing anyway because it influences which regular expression is used. Right now, there is a very crude method for determining which regular expression is used: if the target object is shorter than a certain number of characters, it is guessed to be characters rather than dictation, and the regular expression is chosen accordingly.
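The length-based guess described above could look roughly like the sketch below. The threshold value, the function name, and the two patterns are assumptions chosen for illustration; the thread does not state the actual cutoff or the regular expressions Caster uses.

```python
import re

# Hypothetical cutoff: targets shorter than this are guessed to be characters.
CHARACTER_LENGTH_THRESHOLD = 3


def build_pattern(target, threshold=CHARACTER_LENGTH_THRESHOLD):
    """Pick a regular expression based on a crude guess about the target."""
    escaped = re.escape(target)
    if len(target) < threshold:
        # Guessed to be characters: match anywhere, including inside words.
        return re.compile(escaped)
    # Guessed to be dictation: match whole words only, ignoring case.
    return re.compile(r"\b" + escaped + r"\b", re.IGNORECASE)
```

Under these assumptions, `build_pattern("s")` matches the "s" inside "class", while `build_pattern("the")` matches "The" as a whole word but not the "the" inside "other". A short dictated word such as "a" would be misclassified, which is the weakness of guessing from length alone.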

The functions can already behave differently for each application, so one could start by implementing Ctrl-Right/Left for a particular application and have the functions fall back to the regular arrow key method for all other applications until those are implemented.
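A minimal sketch of that staged rollout, assuming a simplified model in which Ctrl-Left/Right stop only at the start of whitespace-delimited words (the thread notes that real editors differ on this). The registry `PLANNERS`, the application name `example_editor`, and the plain-string key names are all hypothetical; none of this is Caster's or Dragonfly's API.

```python
def word_starts(text):
    """Indices where a whitespace-delimited word begins."""
    return [i for i, ch in enumerate(text)
            if not ch.isspace() and (i == 0 or text[i - 1].isspace())]


def arrow_plan(text, cursor, target):
    """Fallback: one arrow keypress per character."""
    key = "left" if target < cursor else "right"
    return (key, abs(target - cursor))


def word_jump_plan(text, cursor, target):
    """Use word jumps when the target sits on a word start."""
    starts = word_starts(text)
    if target not in starts:
        return arrow_plan(text, cursor, target)
    if target < cursor:
        presses = sum(1 for s in starts if target <= s < cursor)
        return ("ctrl-left", presses)
    presses = sum(1 for s in starts if cursor < s <= target)
    return ("ctrl-right", presses)


# Hypothetical per-application registry; unlisted applications use arrows.
PLANNERS = {"example_editor": word_jump_plan}


def plan_move(app_name, text, cursor, target):
    """Return (key, presses) for moving the cursor to the target index."""
    planner = PLANNERS.get(app_name, arrow_plan)
    return planner(text, cursor, target)
```

For the sentence "the fox went to the farm" with the cursor at the end, reaching "went" takes 16 arrow presses in an unlisted application but 4 word jumps in the listed one, which is where the speedup would come from.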

alexboche commented 5 years ago

I was just using these a bit, and the move/go commands were mistakenly interpreting some of the variable words as dictation, such as "before"/"after" and "2nd"/"3rd" (as in occurrence_number). When this happens, and it only happens intermittently, it causes the commands to not work. I don't understand what causes the problem to occur versus not. I can think of three things we could try to solve this.

1) Introduce new non-English words to play the role of "before" and "after", such as "bef" and "aft". This might make the engine more likely to go down the character_sequence path rather than the dictation path. (I don't know if this will work. As I try to test it now, the problem has ceased to occur in the first place even without this.)

2) Put an optional or required word such as "bow" in the middle of the spec for the character sequence version of the command and not in the dictation version.

3) Give different names (i.e. the first word of the spec) to the dictation and character sequence versions of the commands. For example, with go/move, we could let the character sequence command be pronounced either "go" or "move" but require the dictation version to be pronounced only "move". That way users could use the same word for both if they are not running into problems, but use different words if they are. If the problem proves to be consistent, we can make different words required.

I think the problem is mostly with the go/move commands since those tend to use the most variables, but this problem could arise for all of the commands. I've hardly gotten around to using these commands much since I just got around to installing them on my own personal branch today. Is anyone else using these commands? Have people experienced this problem? @kendonB

kendonB commented 5 years ago

Can you spell out the exact utterances both that you're attempting and what is getting interpreted? I've had no trouble with misinterpretations like what you describe. I use these commands all the time.

alexboche commented 4 years ago

I don't remember what was causing the problem for me previously, but another user, @Parashoot, was reporting similar problems where character names were being interpreted as dictation (e.g. "arch" was being interpreted as the word "arch" instead of the letter a). Here are the suggestions that I gave them. (Update: it sounds like they did not find the suggestions helpful, but the problem resolved on its own. That said, I think the heavy-handed approaches should work, so I'm not sure what went wrong for Parashoot; maybe they did not try all of them. Note also that I renumbered the approaches, so what was 1 is now 3.) I hope to look into this more fully at some point, but I don't have time now and I just want to put the linked suggestions here in a permanent location. I am also linking to the original conversation in the Caster chat where Parashoot reported this problem and I discussed it with him as well as comodoro; that said, I don't think looking through that conversation will be very helpful.

Essentially, the issue is that the grammar is sometimes ambiguous about how an utterance should be interpreted, and something in the stack (I think it's Dragon itself) is not doing a very good job of resolving the ambiguity based on the audio input.

Parashoot commented 4 years ago

Hey, that's me. I've gotten most of those issues resolved, through no doing of my own, so I'm not sure how it got fixed. The suggestions you gave did not do much to help me, but some of the issues corrected themselves. @alexboche

alexboche commented 4 years ago

@Parashoot Thanks for reporting back. Sorry my suggestions didn't help, but I'm glad it's working better now. Just to keep a record of things, it would be useful if you could confirm whether you tried the approaches where you actually change the underlying code instead of just changing the names of characters or training command names. I previously called the approaches where you change the code 3 and 4, but I am now calling them 1 and 2 and have updated the linked document accordingly. If you didn't try these, that's no problem. I'm pretty sure these should work, so if they didn't, there must have been a typo in my code.

Parashoot commented 4 years ago

@alexboche I did try changing to the escape method, but I found that it didn't improve the performance all that much since the program wasn't recognizing the characters properly. I have since had to rebuild my whole solution due to company firewall issues and clashing updates, so it has reverted to where it was before. Like I mentioned, it is able to pick up most things now except for comma. It almost seems as if the ',' character it is looking for is different from the one inserted by typing a comma, and it isn't matching because of a different UTF character or whatever. I'm not too sure; it could be thinking it's actually something like ascii1284 when it should have been ascii1285, for example (not actual values). Regarding using the escape method and it matching words within words, this is actually helpful for me when it does it by accident, since I have to navigate a lot to camelCase variable names in code and don't otherwise use the go to "word" commands.
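One way to check the look-alike-character hypothesis above is to print the code point and Unicode name of each suspicious character in the copied text. This is a small standalone sketch using only Python's standard library; the function names are illustrative and it is not part of Caster.

```python
import unicodedata


def describe(ch):
    """Return the code point and Unicode name of a single character."""
    return "U+{:04X} {}".format(ord(ch), unicodedata.name(ch, "UNKNOWN"))


def comma_like(text):
    """List (index, description) for characters whose name mentions COMMA."""
    return [(i, describe(ch)) for i, ch in enumerate(text)
            if "COMMA" in unicodedata.name(ch, "")]
```

If every hit in the captured text is `U+002C COMMA`, a character mismatch is unlikely to be the cause and the problem lies elsewhere, for example in how the command's spoken form maps to the character.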

Parashoot commented 4 years ago

Also, after taking the suggestion of another user, I changed the 'comma' command to ',' in the punctuation and token lists. This appears to have fixed all my problems and sped up my program quite a bit, especially for things like jump in, out, and back. I think it wasn't recognizing the commas and thus was taking longer than it should have to process the characters and execute commands, which seems weird to me.

LexiconCode commented 4 years ago

An overhaul of many of the text manipulation commands is in the pipeline, mostly in the backend. The clipboard method will be improved upon, and it will also work with Dragonfly's accessibility API transparently when available. Many bugs should be squashed. Some other really neat features will be included as well. More on that closer to release.

DanKaplanSES commented 4 years ago

I don't know if this is something that Caster can fix, but after you use the capital command, the next thing you say will not have a space before it. If I hadn't used the capital command, the next thing I said would have a space before it.

In other words, if I say "a b c capital up a d", it prints "A b cd". I think it should print "A b c d".

DanKaplanSES commented 4 years ago

Can I request that the capital command toggle case by default? In other words, if I tell it to search for a capitalized word, it will lowercase that first letter.

If I say "She said hi capital up she" I want the output to be "she said hi"
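The requested toggle could be sketched as follows, assuming the cursor is at the end of the text so that "up" means the last occurrence. The function name is hypothetical and this operates on a plain string, not on Caster's clipboard-based machinery.

```python
import re


def toggle_capital_up(text, word):
    """Toggle the case of the first letter of the last occurrence of word."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    matches = list(pattern.finditer(text))
    if not matches:
        return text  # nothing to change
    start = matches[-1].start()
    return text[:start] + text[start].swapcase() + text[start + 1:]
```

Applied to "She said hi" with the word "she", this yields "she said hi"; applied again, it restores the capital.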

DanKaplanSES commented 4 years ago

Imagine you have a sentence like this, "the fox went to the farm."

What I like about grab up is that it looks for the previous word each time you say it. If your cursor is at the end of the sentence and you say "grab up the", the first time you say it, it will select the second "the", and the second time you say it, it will grab the first "the". I like this behavior.

But if your cursor is at the beginning of the sentence and you say "grab down the", it will find the same word over and over again. I wish it found the next one each time like grab up does. I would like to request that as a feature.
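One plausible way to get the requested behavior is to start each forward search at the end of the current selection rather than at its start. The sketch below assumes the selection is tracked as a `(start, end)` pair; the function name and that representation are hypothetical, and the actual cause of the repeated match in Caster is not established in this thread.

```python
import re


def grab_down(text, word, selection):
    """Return (start, end) of the next whole-word match after the selection."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    # Searching from the selection's end skips the word already selected.
    match = pattern.search(text, selection[1])
    return (match.start(), match.end()) if match else None
```

With the sentence "the fox went to the farm." and an empty selection at the start, the first call selects the first "the", and a second call with that selection moves on to the second "the".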

alexboche commented 4 years ago

Unrelated to previous comments:

If anyone is interested in working on this more, I would suggest directing efforts toward the accessibility API commands rather than the clipboard method, either by adding new commands to it (e.g. for navigating to individual characters or character sequences) or by extending the range of apps to which it applies. I think the clipboard method is probably close to the limit of what can be achieved with it, though making the cursor move faster in the clipboard method would be a big improvement; slow cursor movement is one of the reasons accessibility is so much better. In the long run, a deep integration method like the accessibility API is definitely better. Eye tracking should also be part of the navigation toolkit.