JavaModelingLanguage / RefMan

4 stars 0 forks source link

\union #13

Closed davidcok closed 2 years ago

davidcok commented 2 years ago

Aargh. I set about implementing \union for locusts this morning, only to discover that as \u is the prefix for a unicode character, a java parser does not like \union. It appears I have not had any backslash-keywords starting with u before. It is a nuisance because the unicode processing happens at an early stage in tokenizing Java source and I'd rather not alter that for the sake of JML.

I'll power through this (I hope), as there does not appear to be a reasonable alternative to \union as a name.

leavens commented 2 years ago

Hi David and all,

Yes, the \u start of a unicode character code is an issue, so we shouldn't have keywords starting with \u. I might suggest using \cup for the union operator (or maybe \set_union).

    Regards,

    Gary T. Leavens
    329 L3Harris Center
    Computer Science, University of Central Florida
    4000 Central Florida Blvd., Orlando, FL 32816-2362 USA
    http://www.cs.ucf.edu/~leavens phone: +1-407-823-4758
    ***@***.******@***.***>

From: David Cok @.> Sent: Wednesday, December 8, 2021 11:14 AM To: JavaModelingLanguage/RefMan @.> Cc: Subscribed @.***> Subject: [JavaModelingLanguage/RefMan] \union (Issue #13)

Aargh. I set about implementing \union for locusts this morning, only to discover that as \u is the prefix for a unicode character, a java parser does not like \union. It appears I have not had any backslash-keywords starting with u before. It is a nuisance because the unicode processing happens at an early stage in tokenizing Java source and I'd rather not alter that for the sake of JML.

I'll power through this (I hope), as there does not appear to be a reasonable alternative to \union as a name.

- You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHubhttps://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FJavaModelingLanguage%2FRefMan%2Fissues%2F13&data=04%7C01%7CLeavens%40ucf.edu%7Ccc86ab94259740ef1ba708d9ba65ae34%7Cbb932f15ef3842ba91fcf3c59d5dd1f1%7C0%7C0%7C637745768144383876%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=4pm7uY9ZAIurF436z2q%2BgKP%2F1Qkp227LRCLpY2VwjKs%3D&reserved=0, or unsubscribehttps://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAC53NPL4KMXY7BWKNBEI73DUP572XANCNFSM5JUGAXSA&data=04%7C01%7CLeavens%40ucf.edu%7Ccc86ab94259740ef1ba708d9ba65ae34%7Cbb932f15ef3842ba91fcf3c59d5dd1f1%7C0%7C0%7C637745768144393851%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=z9%2FVH2jR2U1oycc8OLf9oLSRUADfmAThRJYEL6ei3LE%3D&reserved=0. Triage notifications on the go with GitHub Mobile for iOShttps://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fapps.apple.com%2Fapp%2Fapple-store%2Fid1477376905%3Fct%3Dnotification-email%26mt%3D8%26pt%3D524675&data=04%7C01%7CLeavens%40ucf.edu%7Ccc86ab94259740ef1ba708d9ba65ae34%7Cbb932f15ef3842ba91fcf3c59d5dd1f1%7C0%7C0%7C637745768144403857%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=rgbUUVOhpNDuA5B0fSHoe4C8Ghq2HbuRAt7w3ow%2BpCg%3D&reserved=0 or Androidhttps://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fplay.google.com%2Fstore%2Fapps%2Fdetails%3Fid%3Dcom.github.android%26referrer%3Dutm_campaign%253Dnotification-email%2526utm_medium%253Demail%2526utm_source%253Dgithub&data=04%7C01%7CLeavens%40ucf.edu%7Ccc86ab94259740ef1ba708d9ba65ae34%7Cbb932f15ef3842ba91fcf3c59d5dd1f1%7C0%7C0%7C637745768144413848%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=M76ZqeTAXtiBffYoc8W%2B6ojcWkA0ws2bQDvtqYuqI44%3D&reserved=0.

davidcok commented 2 years ago

How about \Union -- works great. I think KeY uses \union. Does KeY use a unicode-aware tokenizer, per Java?

wadoon commented 2 years ago

KeY uses \set_union.

We are not very consistent in our names. For example, the infinite union is: (\infinite_union Object o; p(o); o.*) and subset is \subset, but set difference is \set_minus.

davidcok commented 2 years ago

Oh. My mistake. Do you use \set_union for other kinds of sets as well? And what else would one union besides sets?

I would prefer a shorter name, like \Union.

DO you have \in as well as \subset or do you use \subset for both?

leavens commented 2 years ago

Hi David, Mattias, and all,

I would prefer to have consistent naming of these operators, as that would help beginners learn the language. So I could see \set_union, \set_intersection, etc. as a naming convention, or something like \cup and \cap would work as well.

I think that \in and \subset should have different types, and this different names. To my mind it adds to the confusion of students to confuse sets and elements.

    Regards,

    Gary T. Leavens
    329 L3Harris Center
    Computer Science, University of Central Florida
    4000 Central Florida Blvd., Orlando, FL 32816-2362 USA
    http://www.cs.ucf.edu/~leavens phone: +1-407-823-4758
    ***@***.******@***.***>
mattulbrich commented 2 years ago

Do you have \in as well as \subset or do you use \subset for both?

The concept of a location has not been made visible on the JML level in KeY. We can talk about \locsets, but not locations. In this idea, we also use \subset for in. I've just tried this:

//@ invariant \subset(this.intField, this.footprintLocsetField);

translates to the "element-of" expression in the underlying logic.

However,

//@ invariant \subset(this.footprintLocsetField, this.footprintLocsetField);

is the proper subset relation not the element-of relation.

So, I can understand Gary's point of not confusing elements and sets.

wadoon commented 2 years ago

DO you have \in as well as \subset or do you use \subset for both?

We do not have a general set type in KeY; we only have LocSet. Hence, the set operation are reserved for them.

On the other side, we have plans to bring more ADTs into JML and KeY.

davidcok commented 2 years ago

I agree that it is simpler to always have (loc)sets, not elements, using singleton (loc)sets where needed. Then \subset is always appropriate. After all, single expressions like a[1..2] or a[ ] or o. are really sets of locations.

leavens commented 2 years ago

We seem to have gotten away from the main topic of the name for the union operator on locsets (or mathematical sets). An alternative that is should would be to use + for set union and * for set intersection. This would be an overload, but it has the advantage of being very short.

If you don't like + and *, then I still prefer \cup and \cap (to be short) or \set_union and \set_interesection if we are okay with longer names (which I prefer).

davidcok commented 2 years ago

\cup and \cap are totally non-intuitive I think (I know the logical reference). \set_union and \set_intersection are overly long for my taste, though I'll live with them On my own I'd probably use them but overload + and * (and - for set-difference)

leavens commented 2 years ago

I'm happy with using +, *, and - for these set operations.

davidcok commented 2 years ago

I would say that this is then buying into these operators for other primitive types we might add, as they are meaningful (e.g. general sets, sequences, etc.)

wadoon commented 2 years ago

I'm happy with using +, *, and - for these set operations.

Does this mean, that

//@ set footprint += o[*]; 

becomes also valid?

Are the precedence suitable for set operations?


I would see speaking named operators (e.g., \subset, \cup, etc ...) as mandatory, and the support for overloaded Java operators as optional.

I also would first define the JML data types and their operations, before dive into the discussion which syntax should represent them.

davidcok commented 2 years ago

Aside from +=, I would prefer that that had to be written set footprint = footprint + \locset(o[*]); but that is perhaps a separate discussion.

mattulbrich commented 2 years ago

I am happy with +, -, * for location sets. And for sequences, bags, ...

Experiences with Dafny show that they are quite nice to read. I think they are also understandable for java engineers.

mattulbrich commented 2 years ago

This looks like a consensus to me: +, -, * for location sets. No += and similar for now.

leavens commented 2 years ago

I agree with this._

davidcok commented 2 years ago

OK with me, but I also had the question: can \set_union take 0 arguments: means empty set 1 argument: means identify 2 arguments: of course more than 2 arguments: obvious meaning

I would advocate for this. Certainly internally within OpenJML it is a simplification to do so

mattulbrich commented 2 years ago

OK with me. Good suggestion. KeY can't do that currently, but this is annoying, and it should not be a big deal in the JML->JavaDL translation (I hope).

leavens commented 2 years ago

Are we using both + and \set_union in JML? I thought we had settled on +...

If it's +, then I don't see how we can have anything other than 2 arguments.

davidcok commented 2 years ago

I thought it was both.

leavens commented 2 years ago

Ah, if we have \set_union also, then I think it would be best to only have 2 or more arguments. I can see having one argument, but why 0 arguments (statically)? I would be better to use something else for the empty set, say \emptyset.

davidcok commented 2 years ago

Well, internally at least, it is often the case that I have a list of \locsets and need the union of them. It is a real nuisance to have to special-case the situation where the list has 0 or 1 argument.

At the program level, sometimes one wants to comment in or out arguments. Its is like using Arrays.asList() but not being allowed to have an empty list.

leavens commented 2 years ago

I think it's fine if the internal (or SMT) operator for set-union can deal with 0 or 1 argument, especially if you have a way to pass a list's elements as arguments to such an operator.

However, that flexibility does not need to be reflected to JML users who have no way to apply such an operator to the elements of a list directly. I think it's sensible to keep the language as simple as it can be in this area.

mattulbrich commented 2 years ago

If we only have \set_union, then having to nest that is a real nuisance. If we have +, then the issue does not arise. I am happy with and without multi-ary \set_union. I do not even think that it needs much explanation since it is a typical generalisation of the lattice operator.

I think allowing 3 or more is very clear and self-explaining, allowing 1 is canonical, and 0 is still sensible albeit perhaps not as obvious to everybody.

leavens commented 2 years ago

It still seems that we are agreed on + for set union in JML's user-visible syntax, - for set difference, and * for set intersection. The rest of the discussion seems to be about issues that are not visible in the syntax....

davidcok commented 2 years ago

Just to be clear (@mattulbrich ?) this means we have operator+, *, - in the JML syntax, but not \set_union, \set_intersection, \set_difference (though those may be used internally in a tool).

wadoon commented 2 years ago

@davidcok I would allow both.

The function-style has advantages:

The infix notation is just a mnemonic.


But we should also define the data types of \set_union, \set_intersect, \set_difference. In KeY the case is easy: we only have \locset.

In JML:

leavens commented 2 years ago

I agree that it's sensible to have both the function-style and the infix operator style for these set operations.

Since \locset is a mathematical value type, it should not be considered the same as any Java type, so it should not be considered the same as set.

wadoon commented 2 years ago

@leavens I did not meant java.util.Set, I thought on a built-in JML type set<T> (or better \set<T>). Hence \locset could be represented as \set<\location>, where \location would be a tuple of Object and \field.

mattulbrich commented 2 years ago

Just to be clear (@mattulbrich ?) this means we have operator+, *, - in the JML syntax, but not \set_union, \set_intersection, \set_difference (though those may be used internally in a tool).

I'd say the official standard would be + etc. only. Internally we would keep the others for backward compatibility for a while.

wadoon commented 2 years ago

What is the consensus?

davidcok commented 2 years ago

What MU reiterated: standard has operator+*-, but not \set_union etc. However, I think both KeY and OpenJLM will retain those functional forms internally and likely expose them to users.