smilejemm / foma

Automatically exported from code.google.com/p/foma
0 stars 1 forks source link

flookup does not backtrack, fails to recognize valid input #33

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Create a binary for regex

     regex [?* a] @-> d;

2. Load it into foma and apply down the following words:
     bf
     aaaaaaaf
     faaaaf
     dddaaaf

3. Send the words through flookup -ix.

What is the expected output? What do you see instead?
  The expected output is `bf` for the first word and `df` for all the others
  The foma interpreter returns the correct results but flookup does not:
    bf -> bf
    aaaaaaaf -> df
    faaaaf -> +?
    dddaaaf -> df (!)

What version of the product are you using? On what operating system?
  foma 0.9.16
  Ubuntu 11.10, 12.04

Please provide any additional information below.

  Looking at the transducer (see attachment), we can see that the paths the above computations should take are

(0) -> (3)                       // bf
(0) -> (1) -> (3)                // aaaaaaaf
(0) ->  2  -> (1) -> (3)         // faaaaf
(0) -> (1) ->  2  -> (1) -> (3)  // dddaaaf

The last sequence is successful again, I assumed it was because of the arc (0) 
-d-> (1), but I was not sure how. So I inserted a line that prints the current 
state at apply.c:910. The results are:

flookup:
-------

$ echo "aaa" | ./flookup -ix ../foo.bin
State no 0
State no 1
State no 1
State no 1
d

$ echo "ad" | ./flookup -ix ../foo.bin
State no 0
State no 1
State no 3
dd
State no 2

$ echo "caa" | ./flookup -ix ../foo.bin
State no 0
State no 3
+?

foma:
----

foma[1]: down caa
State no 0
State no 3
State no 2
State no 1
State no 1
d

foma[1]: down aadd
State no 0
State no 1
State no 1
State no 3
State no 3
ddd
State no 2
State no 2

Apparently, in case of `caa`, flookup does not backtrack from state 3, while 
the interpreter does.

Another interesting bit is the analysis of `ad` and `aadd`, where the 
interpreter moves from state 3 to state 2 after the results have been printed. 
This seems completely superfluous.

Original issue reported on code.google.com by nemesk...@gmail.com on 26 Jul 2012 at 3:55

Attachments:

GoogleCodeExporter commented 8 years ago
I've just realized that state 2 is visited at the end exactly because of 
backtracking. Please disregard the last paragraph in the bug report.

Original comment by nemesk...@gmail.com on 27 Jul 2012 at 9:05

GoogleCodeExporter commented 8 years ago
Nice detailed report, thanks. This looks like it has to do with the binary 
search in apply_down() which was introduced in 0.9.16 as default. If flookup is 
run with the -q or the -I flags, it runs fine. Fix still pending.

Original comment by mans.hul...@gmail.com on 2 Aug 2012 at 9:59

GoogleCodeExporter commented 8 years ago
Fixed in svn.

Original comment by mans.hul...@gmail.com on 2 Aug 2012 at 10:17