adl / hoaf

Hanoi Omega-Automata Format
14 stars 2 forks source link

Add support for alternating automata. #2

Closed adl closed 10 years ago

adl commented 10 years ago

This is a first attempt at specifying alternating automata for discussion.

The addition of disjunct destinations also enables, even on non-alternating automata, some syntactic fusion of edges that have the same source, label, and acceptance sets (but different destination). It's not clear to me if this is welcome, or if this is just some additional burden on the authors of the parsers. Should authors of tools that do not care about alternating automata really bother about supporting disjunct destinations?

One feature I have left out is the notion of "empty" destination, usually represented by "true" in the state-formula representing the destination. I can see that this helpful when handling states of alternating automata algorithmically (because "true & x" simplifies to "x"), but on the other hand:

xblahoud commented 10 years ago

Thank you Alexandre for the draft. I like the examples you prepared.

The addition of disjunct destinations also enables, even on non-alternating automata, some syntactic fusion of edges that have the same source, label, and acceptance sets (but different destination). It's not clear to me if this is welcome, or if this is just some additional burden on the authors of the parsers. Should authors of tools that do not care about alternating automata really bother about supporting disjunct destinations?

This is a very good remark. I thing that preparing two versions is not a bad idea. I can see three degrees of alternating transitions support:

  1. none
    • it supports only deterministic and nondeterministic automata
    • the easiest to parse (and we don't need to introduce "[" and "]" - but I prefer to use them anyway)
  2. conjuncts only
    • supports also universal automata and alternating automata in DNF.
    • parsing is easier - no parenthesis "(" and ")" needed in destination formula
  3. positive boolean combinations over states
    • supports all alternating automata
    • difficult to parse

My opinion is that it makes sense to have one format as simple as possible (case 1) for those who care only about nondeterministic and deterministic automata, and it's extension, that supports universal and alternating automata (for example named HOA: v1a, a as alternating ). I leave it open for the discussion whether it should be case 2 or 3. We plan to do some research on AA with positive boolean combinations and AA in DNF, so we will need the ability to store these eventually anyway. On the other hand, I see that the easier the implementation is, the more likely will others support such format.

One feature I have left out is the notion of "empty" destination, usually represented by "true" in the state-formula representing the destination.

  • one problem with implicit states is that their membership to the various acceptance sets is then unclear (this implicit state obviously has to be accepting, but imagine a complex acceptance condition with disjunctions, and hence multiple ways to be satisfied).

I see this as a good reason to do so. Even though it will cause some inconveniences (for example to LTL2,3BA, LTL3DRA) introducing possibility of unclear interpretation is even worse.

adl commented 10 years ago

Regarding the three degree of alternation support you mention, I don't really like the idea of having multiple variants of the the format. I'd rather have a unique general format, and let tools support only a subset of it. (1&2&3) is not more difficult to parse than (1 2 3), so if a tool supports only conjunctions of destination states, I don't think it deserves a variant of the format: it can work with (1&2&3) and report an error if the automaton contains a destination like (1&2&(3|(4&5))). If a tool does not support alternation, then it should still be able to read a destination such as 1 or (1), and report an error when if it reads (1&2&3).

My problem was that, just as a transition going to (1&2&(3|(4&5))) can easily be rewritten as two transitions going to (1&2&3) and (1&2&4&5) in an alternating automaton, a transition going to (1|2) could also be rewritten as two transitions going to (1) and (2) in a non-deterministic automaton. In some way, it feels like the freedom added by the support of disjunction in alternating automata is making parsing non-alternating automata more difficult (assuming the tool author would like to accept all ways to represent non-alternating automata).

Maybe we should simply add a recommendation to use disjunctions only in alternating automata?

All this discussion would be non-existent if disjunctions were disallowed. Given that a transition to (1&2&(3|(4&5))) can be represented as two transitions to (1&2&3) and (1&2&4&5), its not very clear to me why disjunction is so important. If you have a tool that supports disjunctions internally, I believe it is in your best interest to make a first pass to group transitions with the same label and acceptance sets, so reading representation with conjuncts only does not seem to be a problem.

Reading this last paragraph, it sounds like I'm against disjunctions, but that is not really the case. I'm just noticing that they give more way to express non-determinism, therefore making the format more complex, and I'm trying to figure out if this is worth it.

strejcek commented 10 years ago

I'll start with the reminder that alternation support is not just about the conjunction/disjunction but also about the "true" transitions discusses in the first Alexandre's comment on alternation support. Support of "true" state-formulae is something that cannot be reduced to conjunctions of standard states.

I tend to make two levels of the format:

  1. basic = support of nondeterministic/deterministic automata (basically what we have now)
  2. advanced = support of alternating automata

Nearly all tools would produce/consume only the basic automata and would be happy with that (keeping their parsers pretty simple). There are only few tools that need to handle alternating automata and these do not care whether the format allows both | and & or just & (the difference is tiny in the light of complexity of handling alternating automata). I would slightly prefer the more general case and support disjunction as well (recall again that our general philosophy is to make only necessary restrictions).

adl commented 10 years ago

[Disclaimer: it's 2 o'clock in the morning.]

I'll start with the reminder that alternation support is not just about the conjunction/disjunction but also about the "true" transitions discusses in the first Alexandre's comment on alternation support. Support of "true" state-formulae is something that cannot be reduced to conjunctions of standard states.

Could you elaborate on the last sentence? It's really not obvious to me, so probably I am missing something. Isn't a true state-formula equivalent to jumping to an all-accepting true state?

Most of the old definitions of alternating automata I've read seem to partition the set of states as either existential or universal (hence the name of alternating). In such a setup, I think what we call a "true" state-formula would correspond to some universal state with no successor. However when I look (for instance) at the definition of alternating automata in the Miyano-Hayashi paper, the runs over these automata are clearly presented as trees with no leaf, so having no successor is not an option in their definition.

In the more moderns definitions, using Boolean formulae for destination, my impression is that the use of a true state is just a mean to simplify computations (and mathematical exposure) by automatically discarding some infinite branches that would be accepting. But as I was saying in my very first comment: such true states can be detected easily even when they are explicit, while they cause some problems when they are implicit.

The more we discuss alternating automata, the more I think we went overboard. To support alternation, my view is that we just need to support two things: existential transitions and universal transitions. We already have existential transitions because the format is built for existential automata, so the only thing we should add are universal transitions. Any additional feature seems superfluous and will make the format more complex to specify and implement. I really dislike the fact that supporting disjunction (or even true if my understanding of it is sound) is just giving more ways to express something we can already support.

strejcek commented 10 years ago

Sorry for making you doubting about your (correct) view on true, I should be more precise when writing the previous comment. Of course, you can translate an automaton with true to an equivalent automaton with an explicit state instead. But I believe this is something we do not want to do for several reasons:

  1. Instead of writing just true or t, you have to make an explicit state, remember its number, add redirect true transitions to this number, add cyclic transitions on this state and take care about the acceptance on this state. This is quite a lot, don't you think?
  2. This modification of an automaton has an impact on several statistics like number of states, number of transitions and maybe others. And it may have some slight influence on performance of tools using alternating automata.
  3. This arrangement goes against the philosophy of the format. Of course, it is enough to support just labels on transition and/or just Muller acceptance because everything can be easily reduced to that. But we go in the opposite direction: we support more than many standard definitions. In the case of alternating automata, we should support at least as much as standard (maybe I should say modern standard) definitions.

Alexandre, can you please explain us what complications can the support of true bring? Is it just the question "what should we do if someone uses it in a nonalternating automaton?" Or is there any parsing problem? The same question applies to general boolean formulae as transition targets or initial states. I do not think they are more complicated to parse than boolean formulae describing labels which we plan to support. Can the problem be solved by a general statement that as soon as some uses a boolean combination of states (or true), then the automaton is automatically seen and handled as alternating? Of course, we can add some reduction of such automata to old-fashioned alternating automata with universal and alternating nodes. Or to something else.

adl commented 10 years ago

Of course, you can translate an automaton with true to an equivalent automaton with an explicit state instead. But I believe this is something we do not want to do for several reasons:

  • Instead of writing just true or t, you have to make an explicit state, remember its number, add redirect true transitions to this number, add cyclic transitions on this state and take care about the acceptance on this state. This is quite a lot, don't you think?

Hmm..., no, I do not think this is a lot. In fact, I'm not really concerned about this being a problem when we output automata. As you need only one self-loop, the true state only appears as one line in the file:

State: 42 {0} [t] 42

Remembering that 42 is your true state when outputting your automaton cannot be considered a problem, as you also have to remember the number of your other states as well.

As for specifying the acceptance of the true state, this was one of the first reasons (see my first message on this issue) for not trying to support t as a destination. I understand how the implicit true state works nicely with co-Büchi acceptance, but when I tried to imagine defining its semantics for any kind of acceptance condition it did not look very clear to me. It's now clearer in my mind, and I don't consider acceptance of the "vanishing true" state to be an issue either way.

  • This modification of an automaton has an impact on several statistics like number of states, number of transitions and maybe others. And it may have some slight influence on performance of tools using alternating automata.

I don't see any performance effect. Outputting an additional line is not a problem, recognizing "accept all" states on input is not a problem either. I would even point out that forcing "accept all" states to be recognized on input can be a performance improvement if you are reading an automaton that contains several such states, as those will be merged if you represent them with "true".

Statistics will always depends on the way the tool you use has decided to represent the automaton internally (consider for instance what different people count as "transitions"). If we decide to add "true" in the format, but that someone write a tool that does not support "true" and have to convert this to an actual state internally, the statistics will be different as well. I don't think we have to be concerned by this problem when defining the format: if you want consistent statistics, simply compute them with the same tool.

  • This arrangement goes against the philosophy of the format. Of course, it is enough to support just labels on transition and/or just Muller acceptance because everything can be easily reduced to that. But we go in the opposite direction: we support more than many standard definitions. In the case of alternating automata, we should support at least as much as standard (maybe I should say modern standard) definitions.

I would like to be general, while introducing the least number of concepts possible, because I'm concerned by implementation of the tools that read the format. The more concepts we introduce, the harder it is to build a tool that does not have any restriction on its input, and the harder it will be to have interoperable tools.

So this is really a trade-off between the ease of writing the format, and the ease of reading it. (By reading and parsing, I mean "by a program" — I'm not considering humans that read/write this format, but I'm concerned by humans who implement programs that read/write this format.)

My (longly ruminated) suggestion is to support only conjunctions of destination states (no true, no disjunction, no parentheses).

state-formula ::= INT | state-formula "&" state-formula

I have two main reasons for this:

  1. it is enough to support the modern definitions of alternating automata: we already have disjunction via non-determinism, true states can be represented explicitly (and are easy to recognize by tools who need them), so this require very little work from the writer
  2. if a tool produces an alternating automaton that does not use any conjunction, this automaton can be read directly as a non-deterministic automaton, and used even by tools that do not support alternating automata. So this is good for interoperability.

The last point I consider a very nice property: the introduction of alternation in the format does not require any additional work in tools that only support non-determinist automata in order to read alternating automata that do not use conjunct destination. Furthermore, the addition to the format becomes very simple to describe.

If we support disjunction and true state: this is not the case anymore. Suddenly, by the addition of | and t, we have more ways encode non-deterministic automata; so because the format support alternating automata, and even if your tool do not understand alternating automata, you have to add support for | and t if you want to maximize interoperability.

Alexandre, can you please explain us what complications can the support of true bring? Is it just the question "what should we do if someone uses it in a nonalternating automaton?"

Yes, same for |.

Or is there any parsing problem? The same question applies to general boolean formulae as transition targets or initial states. I do not think they are more complicated to parse than boolean formulae describing labels which we plan to support.

I don't have a problem with the grammar. If I expressed concerns about parsing, it's because we would have more ways to parse non-determinism.

Can the problem be solved by a general statement that as soon as some uses a boolean combination of states (or true), then the automaton is automatically seen and handled as alternating?

Yes, it could, but why? As long as no universal branching is used, an alternating automaton should be processable by any tool dealing with non-deterministic automata.

adl commented 10 years ago

The merged commit only support conjunct states. The alternating branch has support for disjunction as well if we need that in the future. But in that event we should probably have support for the "true" destination as well.