adl / hoaf

Hanoi Omega-Automata Format
13 stars 2 forks source link

add "semi-deterministic" property #52

Open strejcek opened 8 years ago

strejcek commented 8 years ago

For definition see e.g. https://en.wikipedia.org/wiki/Semi-deterministic_B%C3%BCchi_automaton

kleinj commented 8 years ago

Seems useful. Do you happen to know if this is the same concept as the "deterministic-in-the-limit" automata sometimes used in the probabilistic model checking context?

strejcek commented 8 years ago

Hmm, it seems to be the same (and also known as "limit-deterministic" automata).

adl commented 8 years ago

Is there any name that is used more frequently than the other?

I find limit-deterministic much more descriptive than semi-deterministic.

strejcek commented 8 years ago

We have submitted a paper on complementation of semi-deterministic BA to TACAS. When writing it, we discussed whether to use "deterministic-in-the-limit" or "semi-deterministic" and we have decided to use the latter one. I found "deterministic-in-the-limit" as the most descriptive, but it is too long and hard to work with it in the text. "Limit-deterministic" is used much less frequently, only as a shorter version of "deterministic-in-the-limit". "Semi-deterministic" is used frequently and easy to work with.

Also note that the definition of semi-determinism on wikipedia is flawed.

kleinj commented 8 years ago

What is the flaw in the Wikipedia definition? If I unterstood correctly, for Büchi the definition amounts to "once an accepting state has been seen, the automaton has to behave deterministically".

Do we want the property to only apply to Büchi? Or in general? How would the definition look like, there? Something in the spirit of "after any transition that is a member of an acceptance set, the automaton is deterministic"? I don't know how much sense that makes in the presence of Fin in the acceptance condition.

adl commented 8 years ago

I guess the general definition should be that A[q], the automaton obtained from A by changing its initial state to q, should be deterministic whenever q belongs to some accepting SCC of A. As a consequence, non-determinism can only occur before entering an accepting SCC.

(By "accepting SCC" I mean an SCC that contains an accepting cycle.)

I do not like the "once we have seen an accepting state" used in the Büchi case: if a trivial SCC has its state marked as accepting for some reason, it should not have any influence on that property, because this cannot be part of an accepting loop. Maybe this is the flaw Jan was speaking about.

Another flaw in the wikipedia definition is that it says "Q has two disjoint partitions N and D" instead of "Q can be partitioned in two disjoint sets N and D".

kleinj commented 8 years ago

In the Büchi case, I can see the motivation for going with the "after F everything is deterministic" formulation: You get a simple syntactic criterion that can be easily verified and also know precisely the point where the algorithmic treatment might differ (switch to the deterministic part). In that sense, it's a convenient normal form. Your "semantic" definition would require some graph / SCC analysis for these parts. But I'm not familiar enough with the details of the use cases to have an opinion on whether this convenience is actually beneficial in practice.

For arbitrary acceptance, I think your definition makes sense. Do we know if deterministic-in-the-limit has been considered for non-Büchi acceptance somewhere in the literature? As DIL-Büchi can handle all omega-regular languages, I wouldn't be surprised if that hasn't come up yet.

strejcek commented 8 years ago

The flaw in wikipedia: the formal definition given there says that the automaton "(D,Σ,∆,{d},F) is deterministic". But a state in D can have a successor in N. This is clearly a bug.

I do not know about any case where semi-determinism is used with another acceptance than Buchi.

I think there is a space for generalization of semi-determinism, but I'm not a big fan of the semantic generalization suggested by Alexandre. There are two reasons for that:

  1. with general acceptance, it is quite hard to find out whether an SCC contains an accepting run or not (it is definitely not enough to check occurrence of acceptance sets in the SCC). Hence, it is difficult to say whether an automaton is semi-deterministic or not.
  2. it is not clear how to use such information about semi-determinism: in Buchi automata, I know that the behaviour is deterministic as soon as I pass some accepting state, but in the suggested generalization, how would I recognize the point from which determinism is guaranteed? The only chance is again to decompose the automaton to SCCs and determine which SCCs contain accepting cycles. This is expensive.

I suggest another natural generalization of semi-deterministic Buchi automata to general acceptance: the automaton is deterministic in all states reachable from acceptance set 0 and each accepting run passes this acceptance set. Decision whether an automaton is semi-deterministic according to this definition or not is similarly difficult as in the semantic generalization, but application of this information is much easier (one immediately knows when the behaviour has to be deterministic).

P.S.: Our paper about complementation of semi-deterministic BA has been accepted to TACAS.

adl commented 8 years ago

Congrats on your TACAS paper!

Regarding point 2, if your interest is to know what part of the automaton is deterministic, I would just ignore the semi-determinstic property. Just test what states are nondeterministic and make some transitive closure. That will give you a precise result (more precise than relying on accepting states in the Büchi case), without the hard work of looking for accepting SCCs.

My view on property: in general is that trivial properties like deterministic, complete are almost useless. Tools that needs input to be deterministic or complete can easily check for it. I can see some use-cases (e.g., it's convenient for us humans when reading the file, and it allows playing with tools like grep) but I see little added value. Properties that are harder to check, like sttutter-invariant, inherently-weak, or even weak or unambiguous have real value: it allows you to have one tool that does the hard check, and another tool that benefit for it. For instance use ltl2tgba -B -H --check=stutter ... to generate Büchi automata labeled with stutter-invariant if they are, and feed them to ltl2dstar that will turn on some optimizations when the property is given.

So that brings me to point 1. Yes, the proposed property requires deciding SCC acceptance, and that's hard, but I can't accept that as an argument against the property for the reasons pointed above. Besides, recognizing this property could be implemented quickly in Spot. AFAIC, the main question is whether the property would be useful to, e.g., some probabilistic model checker. I'm not familiar enough with probabilistic model checking to answer that.

I assume from your remarks that it is important to know where the transition from non-deterministic from deterministic-from-now-on occurs, but it seems very strange to me that one would encode that using the acceptance condition just because it can be done with Büchi (yet imprecisely). Does it need to be encoded at all?

I suggest another natural generalization of semi-deterministic Buchi automata to general acceptance: the automaton is deterministic in all states reachable from acceptance set 0 and each accepting run passes this acceptance set.

This seems broken for co-Büchi acceptance :-( And also for t acceptance.

I have two more "data points" to contribute to our search for THE definition.

I stumbled upon some other definition of semi-deterministic in Vardi's paper (e.g., LICS'86 or ARTS'99) which is even more restrictive than the one that relies on Büchi states: in these semi-deterministic automata the transition function is deterministic and the only place where non-determinism is allowed is in the choice among multiple initial states.

Searching for "deterministic in the limit" lead to more relevant papers. In particular, Safra has a very nice (to my taste, at least!) definition for these automata that in FOCS'88:

An automaton is deterministic in the limit if any accepting run makes finitely many nondeterministic choices.

I believe it is equivalent to (but way more elegant than) what I suggested. (edit: Jan's example below demonstrates that the definitions are different.)

strejcek commented 8 years ago

After I've posted my previous comment, I've been thinking about it again and I came to a decision that your original suggestion was perfectly OK. After I've read your last comment, I agree that the definition from FOCS'88 is even more elegant. The problem is that the two definitions are not fully equivalent: the one using accepting SCCs implies the one from FOCS'88, but the reverse doesn't hold:

HOA: v1.1 States: 2 Start: 0 Alphabet: 2 "a" "b" Acceptance: 1 Fin(0) acc-name: co-Buchi --BODY-- State: 0 {0} [0 | 1] 0 [0] 1 State: 1 [0] 1 [1] 0 --END--

It has one SCC containing an accepting cycle (the loop on state 1) and nondeterministic state 0, hence is not semi-deterministic according to the first suggestions. But it is perfectly semi-deterministic according to FOCS'88.

Currently I'm not able to say which definition I like more: the one from FOCS'88 is definitely more elegant, but I'm not sure how easy or hard it is to check that an automaton with generic acceptance is semi-deterministic according to FOCS'88, and how easy or hard it is to utilize this semi-determinism.

Does anyone have any preference?

adl commented 8 years ago

Nice example, thank you.

Here is another one that does not involve Fin. Consider this Büchi automaton: out

Here any accepting run makes a finite number of non-deterministic choices, and yet some nondeterminism appear after an accepting SCC.

strejcek commented 8 years ago

Another nice example, thanks.

So we know that there is a semantic difference between the two definitions of semi-determinism.

Do you have any preferences which one to adopt? Or should we support both, e.g. semi-deterministic-semantics for FOCS'88 and semi-deterministic-structure for your original suggestion?

adl commented 8 years ago

(Sorry for the delay, there are too many things going on for me these days.)

I cannot express any preference between the different definitions, because I do not understand how these definitions are used in practice. I wanted to read the paper I cited, but I haven't got around to do that yet. Can you explain to me why these definitions are useful? When are semi-deterministic automata needed?

So far the only definition for which I can see some potential use-cases is the very strong form of semi-deterministic given in Vardi's papers (e.g., LICS'86 or ARTS'99). If nondeterminism is only allowed for initial states, I can see how we could design algorithms that work specifically with this class of automata. For instance it should be easy to implement an algorithm that makes the union or intersection of two semi-determinic automata, while preserving this type of semi-determinism.

adl commented 8 years ago

By the way, Joachim told me that he planned to comment on some of those issues (and I hope this one, since he has some papers published on the subject), but he seems to be rather busy too.

kleinj commented 8 years ago

Ok, I now had time to have another look at the literature and think a bit about it. For the applications (e.g. for qualitative MDP checking), it seems that it is not absolutely crucial to know which part of the automaton if the "deterministic" part, as they generally employ some kind of cycle detection in the product, which would filter out non-accepting runs anyways. So Safra's definition

An automaton is deterministic in the limit if any accepting run makes finitely many nondeterministic choices.

should work. However, knowing the states of the part where the acceptance condition is relevant can be used for optimizations, as only those parts need to be handled for acceptance checks.

My suggestion would be the following 'tweak' of Safra's definition:

An automaton is deterministic in the limit if any accepting run eventually only visits deterministic states.

With this definition, the automaton can be naturally separated into two parts by a structural inspection, i.e., determining the set of deterministic states where all states in Post* are deterministic as well. Then, any acceptance condition is only relevant in that part and can be ignored in the other part. On the other hand, we get a quite elegant formulation. And it would certainly not be too restrictive in practical terms. If there are actual tools that exploit the flexibility as shown in https://github.com/adl/hoaf/issues/52#issuecomment-170873697, we might add another property to cover that, later on (deterministic-for-accepting?).

BTW, I have a slight preference for deterministic-in-the-limit, deterministic-in-limit or limit-deterministic over semi-deterministic. For me, at least, it more clearly invokes the crucial idea.

Another thing: I guess the definition above ("eventually only visits deterministic states") makes sense for alternating automata as well, right? This would force any path on the run tree to eventually become linear.

strejcek commented 8 years ago

After another round of discussion (this time with Joachim and Fanda in Eindhoven), we suggest to introduce two properties:

There are two possible changes in this definition:

  1. We can replace non-deterministic choice directly by non-deterministic states.
  2. We can say that even for aleternating automata, each accepting run has only finitely many non-deterministic choices/states (which is more restrictive than the definition above).
    • semi deterministic hints that all states reachable from any accepting cycle are deterministic.

I'm not really sure whether this definition makes some sense for alternating automata (but this also holds for the definition of inherently-weak for alternating automata.

Let me know if you agree or not, please.

xblahoud commented 8 years ago

I agree with the properties. I like the definition of deterministic-in-the-limit as it is. In case of alternating automata I am against suggestion number 2. For semi deterministic I am fain that it does not make too much sense for alternating automata.

adl commented 7 years ago

FWIW, Spot 2.3 has support (i.e. input, output and check) for properties: semi-deterministic with the sense given above: all states reachable from an accepting cycle are deterministic. The check is not performed on alternating automata.

sickert commented 7 years ago

I'm sorry that I'm joining the conversation a bit too late. Especially, since the terminology already converged on semi-deterministic. I just wasn't aware of the discussion and it just was pointed out the me couple of days ago.

I'll use the following abbreviations:

For the expressivity we obtain the following inclusions: DPA = NBA = LDBA > SDBA = DBA

Regarding the generalization to other acceptance conditions or transition modalities (alternating, universal) I recommend to restrict it to non-deterministic (generalized-)Büchi automata: algorithms for qualitative probabilistic model checking are only defined on LDBAs: JACM'95

Defining it for other modalities or acceptance conditions might lead to unexpected side effects and is problematic if someone comes up with a good generalization of the concept, since it could be incompatible with the one HOA defines. Thus I would leave it undefined until someone has a proper use-case for an extended definition.

I assume from your remarks that it is important to know where the transition from non-deterministic from deterministic-from-now-on occurs, but it seems very strange to me that one would encode that using the acceptance condition just because it can be done with Büchi (yet imprecisely). Does it need to be encoded at all?

For the implementation of ATVA'16 we would like to encode this in ε-transitions that are non-standard (#61). Thus instead of serializing to HOA (the parser chokes on that) we passed the automaton within memory. Further for our TACAS'17 LDBA to DPA translation the split into initial and accepting component is information we make use of. Of course in this case we could just recompute it.

Unfortunately a naming-schism occurred in the community: On the hand there is a group preferring limit-deterministic (or deterministic-in-the-limit): JACM'95, TACAS'15, CAV'16, ATVA'16 and two papers at TACAS'17. On the other there is a group preferring semi-deterministic: TACAS'16 and ProbDiVinE.

I still advocate naming it limit-deterministic: an accepting run can use non-determinism for an arbitrary long prefix, but in the limit is has to be deterministic. But I guess this is already settled and nobody wants to change it.

As a last side note: JACM'95 consistently uses deterministic-in-the-limit to describe the automaton. However, they call the translation semi-determinisation, which is a bit unfortunate.

adl commented 7 years ago

@sickert the way I read your definition of limit-deterministic (a name I would also favor over deter-ministic-in-the-limit just for conciseness), it is compatible with what Jan, Fanda, and Joachim suggested.

Unless I'm mistaken, limit-deterministic as you and they defined, implies that accepting SCCs have to be deterministic (so the example Büchi automaton I gave one year ago is limit-deterministic), while semi-deterministic implies that the subautomaton reachable from any accepting SCCs is deterministic. So semi-deterministic would be stronger than limit-deterministic.

Also we have been discussing (by email) the even stronger notion of cut-deterministic: an automaton is cut-deterministic if there exists a set of transitions (the cut) that, once removed, partitions the automaton in two deterministic parts such that the initial state is on one side of the cut, and the accepting SCCs are on the other side of the cut.

sickert commented 7 years ago

@sickert the way I read your definition of limit-deterministic (a name I would also favor over deter-ministic-in-the-limit just for conciseness), it is compatible with what Jan, Fanda, and Joachim suggested.

Yes, it is compatible in the sense that every LDBA using the semantic definition (accepting SCC is deterministic) can be translated to an automata using the syntactic definition (all transitions after accepting states or transitions are deterministic). E.g. the TACAS'17 translation assumes the syntactic definition, but it should be feasible to fine-tune the algorithm to support the semantic definition.

To address this we could also introduce the -syntactic and -semantic suffix as suggested by previous comments.

Unless I'm mistaken, limit-deterministic as you and they defined, implies that accepting SCCs have to be deterministic (so the example Büchi automaton I gave one year ago is limit-deterministic), while semi-deterministic implies that the subautomaton reachable from any accepting SCCs is deterministic. So semi-deterministic would be stronger than limit-deterministic.

I might misunderstood or confused the definitions here: what do you exactly mean by stronger? The expressivity is the same, there are just different syntactic and semantic criteria, right? On the other hand LICS'86 defines a weaker version: everything is deterministic and the initial state is chosen non-deterministically. But I agree with the other comments that this class might be uninteresting since we already have DBAs and the expressivity is the same as DBAs.

Also we have been discussing (by email) the even stronger notion of cut-deterministic: an automaton is cut-deterministic if there exists a set of transitions (the cut) that, once removed, partitions the automaton in two deterministic parts such that the initial state is on one side of the cut, and the accepting SCCs are on the other side of the cut.

From the expressivity cut-deterministic automata are as good as limit-deterministic automata. However, limit-deterministic automata might be exponential more succinct than cut-deterministic: The translation from limit-deterministic to cut-deterministic automata was briefly mentioned in JACM'95 and it is done by applying the subset construction to the initial (or non-deterministic) part of the LDBA.

@adl I've forwarded you the e-mail correspondence with Fanda in which we discussed cut-deterministic automata and the applicability to quantitative probabilistic model checking. You might be interested in that. :)

adl commented 7 years ago

what do you exactly mean by stronger?

I mean that being semi-deterministic is a stronger syntactic constraint than being just limit-deterministic, as all semi-deterministic automata (in the sense given above, not the one of LICS'86) are also limit-deterministic. Similarly cut-deterministic is an even stronger syntactic constraint. I was not talking about expressiveness: I see all of those are different constraints to express an arbitrary omega-regular property.

sickert commented 7 years ago

I mean that being semi-deterministic is a stronger syntactic constraint than being just limit-deterministic, as all semi-deterministic automata (in the sense given above, not the one of LICS'86) are also limit-deterministic. Similarly cut-deterministic is an even stronger syntactic constraint. I was not talking about expressiveness: I see all of those are different constraints to express an arbitrary omega-regular property.

Ok, I see.

What about adding the following two new properties:

adl commented 7 years ago

I'm afraid I'm lost. Your definition of limit-deterministic-semantic (all accepting SCCs have to be deterministic) is the one that was suggested for limit-deterministc. Can you provide a precise definition for limit-deterministic-syntactic?

sickert commented 7 years ago

Updated my comment from before accordingly. :)