adl / hoaf

Hanoi Omega-Automata Format
14 stars 2 forks source link

dealing with automata without acceptance set #23

Open adl opened 10 years ago

adl commented 10 years ago

Sometimes (e.g., when translating any safety formula) we obtain an automaton without any acceptance set: in generalized Büchi acceptance it means that all words recognized by the automaton are accepting.

Presently the format does not allow for such empty acceptance, and I think this is a flaw. The Acceptance: line requires a non-empty acceptance-cond:

header-item ::= … | "Acceptance:" INT acceptance-cond
acceptance-cond ::= "I" "!"? INT
                 | "F" "!"? INT
                 | (acceptance-cond)
                 | acceptance-cond & acceptance-cond
                 | acceptance-cond | acceptance-cond

Should we allow Acceptance: 0 without the acceptance-cond part in this case? Or should we prefer something like Acceptance: 0 t? (with t for true, as in labels).

A related question is the definition of parametric acceptance names when the parameter is 0. Assuming Acceptance: 0 is allowed, do we agree it corresponds to the following to names?

acc-name: generalized-Buchi 0
acc-name: Streett 0

Should we introduce a dedicated name for Acceptance: 0? (e.g. acc-name: monitor.)

Also, should we allow those?

acc-name: generalized-co-Buchi 0 
acc-name: Rabin 0
acc-name: generalized-Rabin /* no digit */

Because of the disjunctive nature of the above three acceptance conditions, their empty variant should be equivalent to false, i.e., an automaton rejecting all words. Does it make any sense to support it?

JanKretinsky commented 10 years ago

I guess most of the time when the acceptance condition is trivial, the automaton is not very complex either. Therefore, I'm fine with not having any explicit way to write it down and enforcing at least one set expression in the acceptance condition - for Buchi I0 that is used everywhere, for coBuchi F0 that is never used. This would also avoid the problem 'What is the condition type of "0"?'.

kleinj commented 10 years ago

I think having 't' and 'f' as atoms in acceptance conditions would be fine. This makes it easier to do output (no need to have special cases or artificially create acceptance sets) and is an interesting information for consumers as well.

One additional idea would be to allow "I x", "F !x" but also "I t" and "F f", etc in acceptance-cond. This would allow to write down acceptance conditions where some of the sets include all of the states or none. E.g., if we are requested to output a Rabin condition but actually need only a Büchi condition. But this is just an optimization.

For the parametrized acceptance names, the big conjunction / disjunction specifies whether Acc: 0 t or Acc: 0 f should be the intended semantics, so I don't think we should disallow those. We just have to note this in the definitions so consumers are not surprised.

For Acceptance: 0 t, 'monitor' would be fine, other options would be 'safety' or 'all-runs'. We should just check that the term also makes sense in the alternating setting. For Acceptance: 0 f, 'none'?

I think it makes sense to include the 'false' variants, as e.g. a tool just doing complementation for deterministic automata could just negate the acceptance condition and the 'acc-names' it knows, without having to touch any of the states and transitions.

adl commented 10 years ago

@JanKretinsky: I can give a couple of reasons why I would like to support automata with 0 acceptance sets. One is that it occurs naturally when translating safety formula. A second is that when building a "generic" product of two automata (each with its own acceptance condition), you will have to create a disjoint union (after renaming) of the acceptance sets, and build the conjunction of the two acceptance conditions (after applying the same renaming) for the result. In this setup, it is clear that building a product with an automaton with 0 acceptance sets will not contribute any new acceptance sets to the result. If we arbitrarily force all automata to have one acceptance sets, the product becomes either less efficient (assuming we propagate the the extra acceptance set to the result) or clumsier to write (assuming we first check wether the declared acceptance set is actually useful so we do not introduce a new acceptance set in the result). Finally, a third point is that such an "accept all" acceptance condition gives some important property about the language of the automaton (it's necessarily a safety property), while forcing it to be Acceptance: 1 I1 would hide this fact.

@kleinj: I buy the use of f in addition to t to ease complementation of acceptance (you would still have to propagate the negations downward). However I'm not convinced by this paragraph:

One additional idea would be to allow "I x", "F !x" but also "I t" and "F f", etc in acceptance-cond. This would allow to write down acceptance conditions where some of the sets include all of the states or none. E.g., if we are requested to output a Rabin condition but actually need only a Büchi condition. But this is just an optimization.

The problem is that we have defined acc-name: Rabin 1 to describe Acceptance: 2 F0 & I1. So if a tool parses only acc-name: Rabin 1 it will expect the automaton to use two acceptance sets with specific meanings. Therefore if you want to output Acceptance: 1 Ft & I0, you cannot call it Rabin 1, and at that point you might as well write Acceptance: 1 I0. (I think that if you have a tool that only works with Rabin and would like it to support Büchi automata, you could simply interpret Acceptance: 1 I0 as if it was Ft&I0, and this will make the tool more interoperable than if Ft&I0 had to be explicit the format.)

My expectation is that if I ask for a Büchi automaton, the producer tool should do whatever it needs to output an automaton with acc-name: Buchi Acceptance: 1 I0, adding a superfluous acceptance condition or degeneralizing whenever needed. Similarly, if I ask for a Büchi automaton to be converted as a Rabin automaton, I would like the result to have acc-name: Rabin 1 Acceptance: 2 F0&I1.

If we want f, I don't like monitor anymore as a name for t, because I think the the names we give to t and f should be related (and co-monitor sounds weird). I like all and none, or even true and false.

JanKretinsky commented 10 years ago

With Rabin automata, it's quite often the case that one of the components is trivial. I'm not sure it's really important to actually convey the information, but supposing we want to do that, then something like

acc-name: Rabin 3 none 0 2 all 5

could declare a condition I1 | I3 | F4 while (1) keeping the Rabin shape (2) indicating trivial parts and (3) not cluttering the text with 5 everywhere

kleinj commented 10 years ago

@JanKretinsky: Mmh, if we want to go in this direction, what about the following. For an acc-name field, the condition and the first parameter(s) specify the structure of the acceptance condition, e.g., 'Rabin 3' signifies a Rabin condition with 3 pairs. There are thus 6 'slots' with a precise role that have to be filled. We could allow each of these slots to either be filled by an acceptance set that is explicitly defined or by 't' or 'f'. So, in your example above, you would specify (if I got it right)

acc-name: Rabin 3    /* 3 pairs */
f 1      /* 1st pair */
f 3      /* 2nd pair */
4 t      /* 3rd pair */

Naturally, you would then get by with just having acceptance sets 0,1,2. For parity, this would result in a list of the acceptance sets (or t/f) for the different colors.

If we go this way, it probably would make sense to allow negation for the integer-acceptance-sets in acc-name as well, as then it would be possible to easily specify several acc-names simultaneously, e.g. for a parity automaton additionally a Rabin and Street acc-name. Basically, this would allow more flexibility in specifying the acc-name fields while retaining the strict structure of the acceptance condition offered by the acc-name method. In the end, the Acceptance header is authorative and we would require that the acceptance condition specified by each acc-name is 'acceptance equivalent' with the authorative one (for this specific automaton).

I think for a consumer this doesn't add much complexity, as it involves just one additional lookup to determine the actual set of states referenced in the acc-name line.

adl commented 10 years ago

Here is a patch that addresses the original issue (automata without acceptance sets) by allowing t and f in acceptance-cond. Sounds OK?

kleinj commented 10 years ago

Yes, looks fine. I commited two tiny changes.

kleinj commented 10 years ago

I'm still interesting in the flexible acc-name definitions using slots as proposed in https://github.com/adl/hoaf/issues/23#issuecomment-45722441. If there are no strong objections to the idea, I would try and formulate a corresponding text. We can then have a look if we actually want that flexibility.

adl commented 10 years ago

I've merged the patches to master and deleted the issue/23 branch.

Regarding the Rabin specification @JanKretinsky and you have been discussed, I'm just not really convinced that the described hypothetical (?) use cases warrant the extra complexity of the specification. Since everything you are trying to put into acc-name: can already been done with Acceptance: I would keep acc-name: only for the simple cases everybody is used to, and that can be described without making any reference to specific acceptance set numbers (only acceptance set counts).

That said, I'm not involved with the production of Rabin/Streett automata enough to emit any strong objection about it. So if you think this is really needed, I won't argue.

kleinj commented 9 years ago

I still like the idea of being able to specify multiple acc-names for a given automaton, with the rationale as follows. There are several cases where the same language can be described by an automaton with the same transition structure but different acceptance conditions, i.e., parity directly induces a Rabin and Streett condition, a particular DRA could also be a DBA, etc. I think we can not expect that consumer can always deal with the full complexity of general Acceptance, but in most cases need to employ tailored algorithms for specific acceptance conditions. In general it would thus be interesting if a HOA lists as many different acc-names as possible, perhaps even by putting a tool between the producer and consumer that analyses the automaton and adds acc-names, e.g., in the case that a DRA can be represented as a DBA, etc.

If we want to have multiple acc-names, then we need more flexibility in describing the mapping of the 'slots' of the acc-name specification to the actual sets defined in the automaton. We could have the following syntax (or something like this):

The default acc-name syntax would be as before, as well as the description of the slots that arise from the parameters of the acceptance condition, e.g., Rabin 3 would induce slots 0 ... 5, which would by default correspond to the acceptance sets 0 ... 5 in the automaton:

acc-name: Rabin 3

Optionally, after specifiying the parameters, one could add - for each slot - a boolean formula over the acceptance set numbers defined by the Acceptance header:

acc-name: Rabin 2 [t] [1&2] [f] [!3]

I don't know if we should use brackets or some other syntax. An acc-name header would thus define acceptance as before, just with more freedom in mixing the sets, and a consumer can pick the most suited from multiple acc-name headers. The producer would have to ensure that the language arising from the Acceptance-header would be the same as the language when constructing an alternative Acceptance-style condition from the information provided in an acc-header.

From an implementation point of view, this should be quite easy to realize, as one has the sets of states/transitions explicitely marked in the HOA file and one can easily reconstruct the membership in the sets as defined by the acc-name.

I don't know if this is a feature for version 1 of the format. As the more complex/flexible notation would be optional in any case, we could just add it in a later stage, with the appropriate tool support and more experience whether this would be worthwile.

strejcek commented 9 years ago

I'm not sure if there is a real need for automata with acceptance condition simultaneously described in more alternative ways. Tools producing automata typically produce automata of a fixed type. Applications consuming automata typically accept automata of a fixed type. What we actually need is just a tool like autfilt that accepts arbitrary automata and transform them into a specified type (if possible).

I think that potential advantages of multiple acc-name: support are not worth the increased complexity of format specifications. Of course I may be wrong. Anyway, I prefer to keep acc-name simpIe (as it is now) for 1.0 version.

adl commented 9 years ago

I'd like to see two kinds of tools: "simple" tools that only deal with one kind of acceptance, and "advanced" tools that can deal with the general acceptance. ("simple" and "advanced" probably aren't good words: I'm only using them from the point of view of acceptance support.)

I expect simple consumer tools to usually read only the acc-name: line, and I expect simple producer tools to only output acc-name: with "default slots" (i.e. no slot specification).

For advanced tools, I expect them to work with the Acceptance: line, and ignore the acc-name:. This certainly is harder to do, especially if you need to "recognize" some usual acceptance conditions from the given general specification, so I'm glad acc-name: makes it easier for simple tools.

If some simple tool produces a Rabin automaton that is DBA realizable, and you want to feed that to some simple tool that reads Büchi automaton, then you indeed need a middle tool to change the acceptance condition, but this middle tool can do it in a way that an acc-name: with "default slots" is used. In this context, I see no reason for the simple consumer tool to implement support for the slot specifications. If the producer of the Rabin tool was able to figure out that the produced Rabin automaton is DBA realizable, then I guess it would make sense to have an option to output a Büchi acceptance whenever possible instead of always Rabin.

If I write a simple tool that supports Rabin acceptance, and then I decide to also support Büchi or co-Büchi acceptance directly (without going through some conversion tool) because these are subset of Rabin, then I feel it is easier to be to teach my tool to recognize acc-name: Buchi or acc-name: co-Buchi in addition to acc-name: Rabin n: this way I can still read automata produced by simple tools that know nothing about Rabin and those tool should not be expected to produce some acc-name: Rabin n [with slots] just in case.

So to summarize I think using slots to masquerade one acceptance as another one adds complexity for both input and output where it is not needed: either tools that produce one kind A of acceptance have to be aware of other acceptances A can masquerade into (which doesn't sound correct), or a middle conversion tool is needed (but if such a tool exist it can also rearrange acceptance sets so that the slot specifications are not used).

(TLDR: I agree with Jan.)

kleinj commented 9 years ago

Ok, I agree that this can be handled via intermediate tools. If we need to, we can easily revisit this later.

JanKretinsky commented 9 years ago

For now I'm fine with that, but in v2 I want human readers to be able to read that e.g. the second Rabin pair is just Buchi and the third one is just co-Buchi even though the automaton is just declared as Rabin. Whenever there are more Rabin pairs, it is very annoying to read through the automaton just to figure out what is going on.