LearnLib / learnlib

A free, open-source Java library for automata learning algorithms
https://learnlib.de
Apache License 2.0
207 stars 54 forks source link

Bug in NL*: sink states are included #70

Closed Jaxan closed 4 years ago

Jaxan commented 4 years ago

The construction in LearnLib for the canonical RFSA sometimes includes a sink state. This is wrong. A row which does not accept anything, should not be consider to be prime.

I have to agree that the definitions in the Bollig et al paper "Angluin-Style Learning of NFA" (2009) are not very precise on this matter. But in the original work on RFSA by Denis and Eposito ("Residual Finite State Automata", 2002) it is clear that an empty language is not prime:

Note that a prime residual language is not empty and that the set of distinct prime residual languages of a regular language is finite. (page 6)

(This also agrees with the standard notion of join-irreducible from lattice theory: the bottom element is not join-irreducible.) Consequently, the canonical RFSA will never have sink states.

Here is the code to reproduce the bug:

    // language is L = {abbb...} + {baaa...}
    private static CompactNFA<Character> constructSUL() {
        Alphabet<Character> alphabet = Alphabets.characters('a', 'b');

        return AutomatonBuilders.newNFA(alphabet)
                .withInitial("q0")
                .from("q0")
                .on('a').to("q1")
                .on('b').to("q2")
                .from("q1")
                .on('b').to("q1")
                .from("q2")
                .on('a').to("q2")
                .withAccepting("q1")
                .withAccepting("q2")
                .create();
    }

    public static void main(String[] args) throws IOException {
        CompactNFA<Character> target = constructSUL();
        Alphabet<Character> alphabet = target.getInputAlphabet();

        // MQs
        SimulatorOracle<Character, Boolean> mqOracle = new SimulatorOracle<Character, Boolean>(target);

        // EQs
        SampleSetEQOracle<Character, Boolean> eqOracle = new SampleSetEQOracle<>(false);
        eqOracle.addAll(mqOracle, Word.fromCharSequence("a"), Word.fromCharSequence("ab"), Word.fromCharSequence("aa"), Word.fromCharSequence("bab"));

        // construct Learner instance
        NLStarLearner<Character> learner = new NLStarLearnerBuilder<Character>().withAlphabet(alphabet).withOracle(mqOracle).create();

        learner.startLearning();
        NFA<?, Character> hyp;
        while (true) {
            hyp = learner.getHypothesisModel();
            System.out.println("States " + ": " + hyp.size());

            DefaultQuery<Character, Boolean> ce = eqOracle.findCounterExample(hyp, alphabet);
            if (ce == null) break;

            System.out.println("CEX " + ": " + ce.getInput().toString());

            boolean refined = learner.refineHypothesis(ce);
            assert refined;
        }

        // report results
        System.out.println("-------------------------------------------------------");
        System.out.println("States: " + hyp.size());
        System.out.println("Model: ");
        GraphDOT.write(hyp, alphabet, System.out);
    }

Wrong output

States : 3
CEX : «'b'␣'a'␣'b'»
States : 4
-------------------------------------------------------
States: 4
Model: 
digraph g {

    s0 [shape="circle" label="0"];
    s1 [shape="doublecircle" label="1"];
    s2 [shape="doublecircle" label="2"];
    s3 [shape="circle" label="3"];
    s0 -> s1 [label="a"];
    s0 -> s2 [label="b"];
    s1 -> s3 [label="a"];
    s1 -> s1 [label="b"];
    s1 -> s3 [label="b"];
    s2 -> s2 [label="a"];
    s2 -> s3 [label="a"];
    s2 -> s3 [label="b"];
    s3 -> s3 [label="a"];
    s3 -> s3 [label="b"];

__start0 [label="" shape="none" width="0" height="0"];
__start0 -> s0;

}

The state s3 is a sink state, and should be removed altogether. Otherwise, the automaton is correct, and accepts the language L.

mtf90 commented 4 years ago

Sorry for the delayed response. You are right, even by the definitions in the Bollig et al. paper a residual language cannot be empty and hence the "sink" row is not eligible for a prime row.

I was thinking about how to best implement this but your PR already does this nicely. Thank you!