rntz closed this issue 3 years ago
Re: duck duck goose.
I think the optimizer is rejecting the duck duck goose case on purpose, and for good reason: to prevent exponential parsing time.
<duck>: duck+
compiles to a loop around the word duck:
user.ducks.0:
0 WORD 'duck'
1 FORK (0, -2)
2 RETURN
user.duckgoose.0:
0 CALL <user.ducks>
1 WORD 'goose'
2 RETURN
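To make the listing above easier to follow, here is a minimal toy interpreter for it. This is only a sketch: the instruction semantics are assumptions for illustration (in particular, that FORK offsets are relative to the instruction after the FORK), the program names are simplified from the listing, and it explores every path without applying the optimizer's guard. It is not Talon's actual VM.

```python
# Toy interpreter for the bytecode listed above.
# ASSUMPTION: FORK offsets are relative to the instruction following the FORK.
# ASSUMPTION: CALL runs another program and resumes wherever it can return.
PROGRAMS = {
    "user.ducks": [("WORD", "duck"), ("FORK", (0, -2)), ("RETURN",)],
    "user.duckgoose": [("CALL", "user.ducks"), ("WORD", "goose"), ("RETURN",)],
}


def run(name, words, pos=0):
    """Return the set of word positions at which program `name` can return."""
    code = PROGRAMS[name]
    results = set()
    seen = set()
    stack = [(0, pos)]  # (program counter, word position)
    while stack:
        pc, i = stack.pop()
        if (pc, i) in seen:
            continue  # already explored this state
        seen.add((pc, i))
        op = code[pc]
        if op[0] == "WORD":
            # Consume one word if it matches, otherwise this path dies.
            if i < len(words) and words[i] == op[1]:
                stack.append((pc + 1, i + 1))
        elif op[0] == "FORK":
            # Explore both branches without consuming a word.
            for offset in op[1]:
                stack.append((pc + 1 + offset, i))
        elif op[0] == "CALL":
            for end in run(op[1], words, i):
                stack.append((pc + 1, end))
        elif op[0] == "RETURN":
            results.add(i)
    return results


words = ["duck", "duck", "goose"]
print(run("user.ducks", words))      # positions where duck+ can stop
print(run("user.duckgoose", words))  # positions where duck+ goose can stop
```

Under these assumptions, FORK (0, -2) at index 1 either falls through to RETURN or jumps back to WORD 'duck', which is the loop described above.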
To prevent exponential parsing cases, when a loop jumps backwards and forwards at the same time, the forward path is not allowed to visit the backwards jump target without advancing a word.
(<duck> <duck>)
is two basic loops around the word duck, which means the second duck will never contain any words: it is prevented from jumping to the word duck, because it is a descendant of the first duck's dual forward/backward jump. I believe this is correct. The only solution I can imagine is to allow backtracking one word in the second loop once the first loop terminates unsuccessfully, but that's kind of complicated.
There's no "correct" distribution of words between the two ducks anyway. I think the easy answer is you need to design your rules to not put two of the same basic repetition captures in a row without any bridge words.
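As a rough illustration of why no single distribution is "correct", the sketch below enumerates every way a run of identical words could be divided between adjacent one-or-more repetitions. The `splits` helper is hypothetical and is not part of Talon; it only counts the candidate parses a matcher would have to choose between.

```python
def splits(n_words, n_captures=2):
    """Yield every way to give n_words to n_captures adjacent `word+` repetitions.

    Each repetition must consume at least one word.
    """
    if n_captures == 1:
        if n_words >= 1:
            yield (n_words,)
        return
    # Leave at least one word for each of the remaining repetitions.
    for first in range(1, n_words - n_captures + 2):
        for rest in splits(n_words - first, n_captures - 1):
            yield (first,) + rest


# Four ducks shared between two adjacent repetitions.
for option in splits(4):
    print(option)
```

Every tuple printed is an equally valid parse, and the number of options grows with both the number of words and the number of adjacent repetitions.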
Fixed the merry christmas bug in v0.1.2. While recently optimizing list parsing for wav2letter, I introduced a regression where, in some cases, a list could consume 0 words without failing that parse path. That's fixed now.
Thanks! I can confirm this fixes the merry christmas bug for me. I am less concerned with the duck duck goose case, since I didn't run into it while writing real code, and as you point out it involves putting two repetition captures in a row, which is not a very sensible thing to do.
My only (mild) concern is that if one did accidentally write some code that looked like duck duck goose without realizing it, it might be hard to debug. (This is what happened with merry christmas; there was some indirection through captures that made it harder to notice.) If the error message said something about adjacent repetitions of the same capture/list, that would make it much easier to figure out the problem with my code. Is it easy to tell if this case is being triggered and change the error message?
No worries if not, and thanks for fixing this so quickly!
No, it's not trivial to detect this case.
Two (possibly related) bugs. Let's call them the "duck duck goose" and "merry christmas" bugs. First one, "duck duck goose", involves some interacting captures:
duckduckgoose.py
duckduckgoose.talon
Now, try saying "test duck duck goose".
Expected result: "ducks duckgoose"
Actual result:
Ok, now for the merry christmas bug.
merry.py
merry.talon
Now try saying "merry christmas".
Expected result: "MERRY CHRISTMAS"
Actual result:
While these bugs are a bit arcane, they are not contrived. I ran into the merry christmas bug while writing actual talon code to do with modifier keys. The code I was writing was incorrect, but I didn't realize this because it triggered the merry christmas bug. I discovered the duck duck goose bug while trying to minimize the merry christmas bug.