Design of identType - Githubissues

anton-trunov commented 3 years ago

So far we have an idea that identType is a type T with a function fresh : seq T -> T with the property (fresh xs) \notin xs.

This, however, presents an issue: the fresh function needs to traverse its input list to generate a fresh value but if we need to call it repeatedly to generate a sequence of fresh identifiers, this resulting algorithm is going to be quadratic.

Instead, we could change the type of fresh to T -> T, i.e. we can generate a new fresh ident given the previous one. (So, for example, an instance of identType for nat would have succn as its fresh function.)

Let me outline several property statements that seem to be useful.

One of the most general ones (it only requires T to be eqType) could look like this (traject lives in Mathcomp's path module):
```
forall (n : nat) (x : T), unique (traject fresh x n)
```
Basically, this says that all powers of fresh function produce distinct elements.
Another option (this would require T to be ordType) is to say
```
forall x : T, x < fresh x
```
And yet another option would be to make fresh of type nat -> seq T generating an n-tuple of unique identifiers:
- forall n, size (fresh n) = n;
- forall n, unique (fresh n).

anton-trunov commented 3 years ago

I had forgotten to mention this design makes it necessary to have an identifier to start from. This can be seen as an inhabited type but I'm not sure if identType should inherit from ihnType because

extending the depth of a hierarchy involves some typechecking performance penalty;
an inhabited type might have a default element which is not necessarily a good starting identifier. But I'm open to suggestions here :)

Btw, the previous version with fresh : seq T -> T kind of had this default element: fresh [::].

volodeyka commented 3 years ago

When we need to use the fresh element? --- at the moment when we try to add a new element in our exec_event_structure, when we want to expand it's domain. Since we don't want to use tuples for modelling finite functions, the domain of exec_event_structure won't be a tuple of sequence, but finSet. That's why if what to have a function that take not one element it must have type {fset K} -> K. Also, If we have function of type fresh_motone : {fset K} -> K with property forall s (x : s), x < fresh_motone s we can easily obtain from it function of type K -> K with property formal x, x < f x, and vice versa for total orders (but not so easily). Moreover if we have well-founded order we can obtain form fresh : {fset K} -> K with property forall s (x : s), f x \notin fresh s, fresh_motone : {fset K} -> K with property forall s (x : s), x < fresh_motone s . So I think we need to ask for fresh_motone : {fset K} -> K because in concrete instances we often have total or even well-founded order

anton-trunov commented 3 years ago

We should be able to recover a function of type {fset K} -> K from the design I outlined (modulo some tweaks with regard to hierarchy). Looks like it's going to be quadratic, unless we ask identType to also be a join-semilattice, for instance in which case can fold the input container (a set, a list, ...) with join-operator and then get fresh on the accumulated result.

eupp commented 3 years ago

The whole point was to get rid of quadratic complexity. I think even the linear complexity is bad if we really care about the computational behavior :) Thus, fresh needs to be constant time, and has to have type T -> T (i.e. it cannot consult the whole trajectory list, as it would imply the linear time or even worse).

anton-trunov commented 3 years ago

Right, but linear complexity is not enforced in the signature, that would be very cumbersome to do. The implementation proposed in PR #22 allows one to instantiate fresh with succn, for example.

anton-trunov commented 3 years ago

Btw, after sleeping on it, I propose to rename identType to freshType. For instance, this would be more in line with the corresponding Fresh Monad in Haskell: https://hackage.haskell.org/package/tamarin-prover-utils-0.8.5.1/docs/Control-Monad-Fresh.html.

eupp commented 3 years ago

Btw, after sleeping on it, I propose to rename identType to freshType. For instance, this would be more in line with the corresponding Fresh Monad in Haskell: https://hackage.haskell.org/package/tamarin-prover-utils-0.8.5.1/docs/Control-Monad-Fresh.html.

I kinda like identType more than freshType. But I agree we might need freshMonad in the future. I'll try to explain my preferences for naming here.

identType is short for identifier type. It denotes that the type T : identType represents type that can be seen as a type of identifiers, i.e. collection of entities with certain properties.

freshMonad is the name of the monad. Generally, monads are used to model effectfull computation, and thus the name of the monad reflects the type of effects it can produce. freshMonad can produce fresh effect, i.e. it can allocate fresh identifier.

In short, the name identType better describes the entities (i.e. identifiers) that inhabit the type, while freshMonad denotes the type of the effect that monad produces.

eupp commented 3 years ago

I had forgotten to mention this design makes it necessary to have an identifier to start from. This can be seen as an inhabited type but I'm not sure if identType should inherit from ihnType because
* extending the depth of a hierarchy involves some typechecking performance penalty;

* an inhabited type might have a default element which is not necessarily a good starting identifier.
  But I'm open to suggestions here :)
Btw, the previous version with fresh : seq T -> T kind of had this default element: fresh [::].

BTW, I also was thinking about the possible instances of identType. I can imagine two main instances.

nat with fresh n := n.+1
string with fresh producing some lexicographically larger string.

Now, for the second case, if we abstract string to just finite sequences of letters of some alphabet T, \i.e. the take the type seq T, we also have problems with seq T being an instance of inhType in the case when T is the empty alphabet. The obvious solution is to consider empty string/sequence [ ] as a valid identifier, but I do not know would it be convinient for an end-user.

anton-trunov commented 3 years ago

In short, the name identType better describes the entities (i.e. identifiers) that inhabit the type, while freshMonad denotes the type of the effect that monad produces.

This makes sense, let's keep the name. Maybe just rename the file identtype.v to ident.v? (or identifier.v)

nat with fresh n := n.+1

Yep, this is the instance I provided in the PR.

we also have problems with seq T being an instance of inhType in the case when T is the empty alphabet.

I think this is fine, one wouldn't be able to introduce a proper fresh function for this singleton type.

eupp commented 3 years ago

ident.v is fine.

Looks like the only remaining problem is whether identType should inherit from inhType. @anton-trunov how significant is the performance penalty. I thought that the packed classes idiom was invented specifically to fight this problem?

I agree that it might be too restrictive to assume that the default element from inhType is the starting identifier. On the other hand, for two known insances, i.e. nat and seq T the 0 and [ ] are reasonable starting identifiers. I think we need to think about other possible instances and use-cases before we made the final decision.

anton-trunov commented 3 years ago

I thought that the packed classes idiom was invented specifically to fight this problem?

Exactly. But it's no magic. The Mathcomp devs managed to curb the depth of their hierarchy at around 9. Proving usually requires deeper hierarchies compared to programming.

On the other hand, for two known insances, i.e. nat and seq T the 0 and [ ] are reasonable starting identifiers.

Indeed, sounds reasonable but if you aim at a reusable library I'm sure there will be a use case where is too restrictive for some one.

I think we need to think about other possible instances and use-cases before we made the final decision.

+1

eupp commented 3 years ago

A minor stylistic suggestion. Can we agree to name the variables of type identType as id, id', id1, id2 etc, instead of ident and its variations? The id is shorter and still has clear semantics. If we will choose this convention, I would propose to rename ident0 into id0.

Other options for ident0 (if we don't want to have char 0 in the name) are first or start.

@anton-trunov @volodeyka @dmitromikh

volodeyka commented 3 years ago

A minor stylistic suggestion. Can we agree to name the variables of type identType as id, id', id1, id2 etc, instead of ident and its variations? The id is shorter and still has clear semantics.

id conflicts with identity function, unfortunately. And in most cases, we are looking at elements of identType as on events, and name them e, e1, e2... So maybe ident is better. But if we want to somehow name a fresh identifier we can use something like fresh_id (in transitionsystem.v for example)

eupp commented 3 years ago

id conflicts with identity function, unfortunately.

Fair enough. Let's stick with ident and its derivatives.

But if we want to somehow name a fresh identifier we can use something like fresh_id (in transitionsystem.v for example)

In that context, fresh_id looks to verbose, comparing to other event identifiers like e, e1, e2, etc. So we need to invent a better convention. We can just name the fresh event simply as e (not so good), en (n here to reflect it is n-th event added on the n-th step), or e_fr (short for e_fresh). I personally like en. Other suggestions are welcome.

volodeyka commented 3 years ago

In that context, fresh_id looks to verbose, comparing to other event identifiers like e, e1, e2, etc. So we need to invent a better convention. We can just name the fresh event simply as e (not so good), en (n here to reflect it is n-th event added on the n-th step), or e_fr (short for e_fresh). I personally like en. Other suggestions are welcome.

I vote for e_fr or efr

anton-trunov commented 3 years ago

In that context we never use fresh. How about using it? It's just one symbol longer than e_fr but actually readable.

It was my first solution :)

eupp commented 3 years ago

In that context we never use fresh. How about using it? It's just one symbol longer than e_fr but actually readable.

It was my first solution :)

Did you proposed it in some PR? Sorry if I missed that and then have repeated it here.

It looks like we still haven't reached consensus. I vote for en, @volodeyka for e_fr/efr, @dmitromikh is silent. Thus @anton-trunov has casting vote :)

anton-trunov commented 3 years ago

I'd rather have fresh spelled out fully (as most papers seem to do), but if need be then something that has at least fr in it should be acceptable. Let me refresh the context of fresh_id usage and I'll get back to to this issue.

eupp commented 3 years ago

I'd rather have fresh spelled out fully (as most papers seem to do), but if need be then something that has at least fr in it should be acceptable. Let me refresh the context of fresh_id usage and I'll get back to to this issue.

Ok, thanks!

Just in case: we speak about the particular context, inside the construction of event structures. There, we primarly use e to denote events, and thus fresh_id stands out. An example of the problem https://github.com/volodeyka/event-struct/blob/294771c1e504d73e05a4ab29ad8a966db5d57ae5/transitionsystem.v#L176

In general I'm ok with fresh_id, it's just this particular use case that bothers me.

volodeyka commented 3 years ago

I vote for en, @volodeyka for e_fr/efr, @dmitromikh is silent.

I vote for fresh

eupp commented 3 years ago

With @volodeyka we have recently faced a new challenge related to identType. Let E : identType, f and g are two finitely supported functions f g : { fsfun E -> D }, an assume that the support of f is less than that of g, i.e. supp f <= supp g. The let h : supp f -> E be a function that is injective on the support of f. We want to have a canonical way to extend h to the whole type E so that the resulting function is still injective. An intuitive way to do that is to map all elements outside of the support of f to elements outside of the support of g. However, it is not possible to do so with the current design.

Alternatively, assume E == nat. Then there is an easy way to do what we want. Let f be defined on [0 ... n] and let g be defined on [0 .. n + k]. Then we can map n+i to n + i + k.

In order to abstract the described approach, we need to require more structure on E : identType. A new proposal is to require that identType be a countable infinite type. Countability would give us pickle : E -> nat and unpickle : nat -> E. Moreover, we will get ident0, fresh and <= for free:

ident0 := unpickle 0
fresh x := unpickle (1 + pickle x)
x <= y := (pickle x <= pickle y)

Or else, if pickle and unpickle are computationally intensive functions, we can just require that fresh x = unpickle (1 + pickle x).

This new encoding of identType seems to be compatible with 2 most important instances: nat and seq A (i.e. strings over an alphabet of A).

cc @anton-trunov @volodeyka

eupp commented 3 years ago

Another issue related to identType. We currently require identType to be instance of porderType. This gives as order notations: x <= y. However, in porf_eventstruct.v we have an event structure over events of type E : identType. We then declare another instance of porderType over events where partial order is causality order. Currently, it enforces us to use disp machinery and re-declare new notations to distinguish two partial order. Together with @dmitromikh for now we decided to use e1 <=^c e2 notation for causality.

We expect the causality order to be use more often than the order induced by identType. Thus, it would be nice in future to swap the two, i.e to have notation e1 <= e2 for causality order and notation e1 <=^i e2 for the partial order on identifiers.

cc @anton-trunov @volodeyka

Event-Structures / event-struct

Design of identType #21