Open cxw42 opened 3 years ago
After some reflection on the matter, I've come to the conclusion that a smart-match operator would be a powerful idea and, although you'd need to remember how the built-in classes would behave, the behavior is intuitive anyway and shouldn't be difficult to grok.
I also agree that this would be very useful from a pattern matching perspective if `switch` is introduced.
The only thing I'm not fond of is the `~~` operator itself, which looks odd to me.
I wonder if we could get away with just using a single `~`, as the existing use of the tilde is as a unary rather than a binary operator and we have the precedent of doing the same for the `-` operator (and, if #986 is accepted, the `+` operator) without apparently anyone being too confused.
An alternative would be to use some other symbol such as `@` or `$`, which are unused at present, though we might want to keep these in reserve for possible future uses.
There is also some precedent somewhere with the operator `~=`, at least in Lua and probably other languages. But considering the wanted usage, it looks odd...
I still have some reservations, I need to see it in action and its implementation.
`~=` in Lua appears to be the equivalent of `!=` in Wren. See here.
That would fit in with your own proposal #985 to allow `~` as an alternative to `!` for Bool operations.
`~=` does not really make any sense as an assignment operator, because `~` has no meaning as a binary operator (as for `!`), and because of the nature of it I don't think it is a good idea at all to allow it...
Well, if `~` were allowed as an alternative to `!`, then it would make sense to allow `~=` as an alternative to `!=`.
But you're right that this has nothing to do with compound assignment operators so I've edited my previous post accordingly.
Both `~~` and `~=` are fine for me; I only prefer the second one because of the symmetry with the other equality operators.
The biggest reservation I have is about how you declare such a method: because of the inversion, I don't find a practical way to express it properly inside the class.
Well, I think if we used `~=` as the smart-match operator (and I'd be happy with that), then it would be better to forget using `~` as an alternative to `!` and restrict #985 to just implementing `&`, `|` and `^` on the Bool class.
Hmmm, don't know what to think. `a ~= b` would only have some meaning as `~(a == b)`, as per symmetry with `!=`, which should make it strictly equivalent to `a != b`. So the trivial implementation does not really have a real meaning/benefit.
I'm not very comfortable with the definition of the rules in general and the `Object` one in particular. It has too many potential meanings, which depend only on the right-hand side of the operator, contrary to `in`, and can be a source of error/confusion.
Well, if we do introduce compound assignment operators, then `~=` is not going to be one of them because the bit-wise complement operator `~` is unary.
So, I think it would be reasonable to use `~=` for smart-matching, which you said you preferred to `~~` yourself.
However, to avoid overloading `~` too much, I'd drop the idea of using it as a Bool operator as we don't need it for that purpose anyway.
Binary `~`, `~=` are fine with me, or `=~` for another option.
I thought about `=~` but I discarded it because `foo=~bar` is ambiguous. It can be:
- `foo =~ bar`: a match of `foo` on `bar`
- `foo = ~bar`: the result of `~` on `bar` assigned to `foo`
@mhermier good point about the possible parse ambiguity.
Re. `~=` vs `~(==)` https://github.com/wren-lang/wren/issues/989#issuecomment-826318241 --- it's a fair point. `~` is not logical negation, so I don't think the analogy with `!=` necessarily holds.
- `~`. Tilde is nice because it connotes "like". However, if it's too problematic, my next choice would be `@`, `@=`, or `::`.
- `@`, e.g., `if(foo @ [1,2,3])`: Is the LHS "at" (in the same region as) the RHS? And it's a symbol that's not currently used.
- `::`, e.g., `if(foo :: [1,2,3])`: `:` generally expresses a relationship. The colon can be very valuable, though, so I wouldn't just use a single colon here.
  - `::` isn't used much in other languages outside of BNF and namespacing. Wren doesn't have namespaces, and if it did, I would recommend using `.` rather than introducing a separate scope-resolution operator.

There might also be a case for using `$`, also currently unused, which is like an `S` with a vertical bar through it.
The `S` is suggestive of 'smart' and `|` is used in some languages as a delimiter in `match` statements.
TBH, I don't know which I like best.
@cxw42 As it's your idea, I think you should choose :)
To me, because of Smalltalk, `@` is the coordinate operator: when put between 2 numbers it produces a `Point`.
`::` is problematic because of `C++`, which makes it more like variable lookup...
Side note: this is the reason I use logical unary left `.` to access top level scope on my personal branch ^^
@PureFox48 thanks :) . I did some typing tests to check ergonomics, and I thought of one other option: `~:` (the "parrot" operator? :D ). That has the advantage over `~=` that (on my keyboard) I don't have to lift the Shift key in the middle of the operator.
My preference would be `~~` first, then `~=`, `~:`, `::`, `@=`, `@`, `$`. I have strong personal associations between `$` and variables (e.g., shell vars), which is the only reason I would prefer it least.
@cxw42
Well, as @mhermier doesn't like `@` or `::` and `~~` is your first preference, let's go with that.
It doesn't really have any technical problems, it will be familiar to those who know Raku and there are plenty of precedents for using a doubled symbol as an operator.
Although it looked a bit odd to me at first, I think I'm beginning to warm to it :)
It is not that I don't like it; it is just that there are strong connotations that would make for a hard learning curve.
As I don't know Smalltalk, the only meaning `@` has for me is 'at'.
I agree though that `::` wouldn't be a good idea as it will have strong connotations as a scope resolution operator to many people.
I take it you're on board with using `~~`, as originally proposed?
A further thought.
Would it make sense to have a second operator `!~` to mean not a match?
I support that, and I doubt much existing code logically negates the result of a bitwise complement :D
I hadn't even realized that something like `!~42` was legal before but apparently it is (it returns false) because the Num class is inheriting the `!` operator from Object.
I don't think this means that `!~` (and for that matter `~~`) wouldn't be viable as we'd be using it as a binary operator rather than two successive unary operators.
Incidentally, having a negative match operator would further enhance the attraction which the smart-match operator has compared to `in` for expressing containment.
Instead of `!(x in [1, 2, 3])` we could simply write `x !~ [1, 2, 3]`.
Also being able to write something like `x !~ Num` when checking `x`'s type would compensate for not having a negative `is` operator.
`!~~` or `!~=`: not the most elegant, but can do if needed.
Off topic: I suspect that this is a sign that the real equality operator is `=` and not `==`, as `!=` shows, following that logic... Even more proof with `>=`, `<=`... I understand the motivation of C for requiring a short assignment operator, but you rediscover the inconsistencies by trying to follow the same logic and it fails...
The way I'm seeing this right now is that `~~` and `!~` would be analogous to `==` and `!=`.
So the `=` symbol would be replaced by `~` to reflect the fact that the operator is smart-matching (which may test for containment etc) rather than always testing for equality.
I've gone off using `~=` altogether. Even though it can't be, it still looks like it's a compound assignment operator. Also a negative version would need to be something like `!~=`, which is very ugly.
I have a strawman implementation at https://github.com/cxw42/wren/tree/smartmatch if anyone wants to try it! I implemented it using a new `SWAP` opcode for simplicity.
Example:
```wren
class Test {
  construct new() {}
  ~~(needle) { 42 } // Note: `needle ~~ haystack` calls haystack.~~(needle)
}
var test = Test.new()
System.print(1 ~~ test) //> 42
```
I have not yet added any default implementations but will be working on those.
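To make the inversion concrete without the patched VM, here is a minimal sketch in current Wren. The `Between` class and its `matches(_)` method are hypothetical stand-ins for a user class implementing the proposed `~~(_)`; they are not part of the strawman branch.

```wren
// Hypothetical stand-in for a user-defined `~~(_)`, written for unpatched Wren.
class Between {
  construct new(lo, hi) {
    _lo = lo
    _hi = hi
  }
  // Plays the role of `~~(needle)`: the receiver is the right-hand operand.
  matches(needle) { needle >= _lo && needle <= _hi }
}

var teens = Between.new(13, 19)
// Under the proposal these would read `15 ~~ teens` and `42 ~~ teens`.
System.print(teens.matches(15)) //> true
System.print(teens.matches(42)) //> false
```

The point of the swap is only that the value being tested appears on the left in source while the object that decides the match is the method receiver.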
I proposed at the top for `String` that `x ~~ str` be `str.contains(x)` (substring test). I just realized that won't work well with a `switch` statement: `switch("a") { case "bar"... }` shouldn't match just because there is an `a` in `bar`. I looked back at the Raku docs, and Raku's string smartmatch is equality rather than substring. For those two reasons, I have modified my top post (https://github.com/wren-lang/wren/issues/989#issue-866834010) to suggest string equality.
While it makes a good start to toy with, I still don't like the syntax of the declaration in the class. The only way of writing it that I see for now would be something like:
`(needle)~~(this) {...}`
But that would require changing all unary operators...
I suspect this is because you want to test for equality more than for substring (and this should be the same for every container/collection).
I think the `String` problem is just a symptom that this operator is problematic: it serves too many purposes. If it's the "contained in" operator, then it should refer to `String.contains()`, and not have any implementation for `Num`, for example. If it's the `switch` operator, it should perform an equality comparison for strings and be implemented for almost all primitives. The fact that the symbol `~~` has no meaning in math (nor in mainstream languages) also indicates that this is an overly-used operator, so you can't give it a proper name.
Instead, I think we should think about splitting the roles. We can have an `in` operator, and a `case` or whatever-called `switch` match operator. They're similar in the fact that they're both inverted (relative to the other operators), and thus need a `CODE_SWAP`, but different in purpose.
I find https://github.com/wren-lang/wren/issues/989#issuecomment-830723064 particularly disappointing as I felt that sub-string matching was an important part of this proposal.
I don't think it's necessarily fatal to the original proposal as `switch("bar") { case "bar"... }` would still have matched even then.
However, @ChayimFriedman2 may be right that it's best to split the roles though, if we split off containment, I'm not sure that this leaves much of a role for smart-matching as we can already do type-checking with `is` and equality with `==`.
As far as containment is concerned, although it was my idea to reuse `in` and despite objections I still think it's a plausible proposal, I wonder whether it would be better to come up with a new operator instead? I suggest the at symbol, `@`, might be the best choice of those still available. An advantage of using a symbol rather than a word is that we could then use `!@` to mean not contained. Some examples to see how this would look:
```wren
var a = 2
var b = a @ [1, 2, 3]  // true
var c = a !@ [4, 5, 6] // true
var d = 3
var e = d @ 4..8       // false
var f = d !@ 0..2      // true
var g = "a"
var h = g @ "bar"      // true
var i = g !@ "baz"     // false
```
I'm not sure whether I like this or not but I think it's worth considering.
Python uses `in`, and probably other languages too.
Do you have an example of a language that uses an operator (preferably mainstream)? If not, the cognitive overhead will probably be too much.
Can't think of a mainstream language which uses a symbol operator for containment, TBH.
Apart from Python, Kotlin uses `in` and `!in` for containment though it's been a while since I used that language and I can't remember now whether those operators can be used with strings or not.
EDIT: They can in fact be used with strings. See here.
Although previously I'd shied away from suggesting a mixed symbol/word operator, `!in` doesn't look too bad in actual use and, if we had that, we could also introduce an analogous `!is`.
It's somewhat hard to lex. However, if `in` becomes a keyword, we can do it. I prefer `is not` and `not in`, however.
Although I like `is not` and `not in` myself, they would require a new keyword which might rule them out.
Are you going to name your variable `not`? I hope not (pun intended).
Might have used it as a method name but probably won't have been used much in the past.
I can't think of any meaning to this as a method name that is not covered by overloading `!`.
I was thinking of a static method rather than an instance method though, having said that, I recall someone pointing out (possibly yourself?) that you can have static operators as well.
I did: #797.
Yeah, that's it. Strange but true :)

> Might have used it as a method name but probably won't have been used much in the past.

I just added `not` to my vendored `Assert` the other day:
`Assert.not(b.isEmpty)`
I think that is more readable than using an operator... I don't think this alone is a good reason not to add it as a keyword though if that would have real overall benefits.
Good one 😃 Though I would use `Assert.assert(!b.isEmpty)`. In BDD, though, this is useful: `x.is.not.something`.
We certainly could split containment and switching, and there may certainly be a more Wren-esque way to do something like smartmatch. Some questions:
- If we have separate containment and `switch` match operators, I think there will be a fair amount of overlap between them. How much mental effort will it be for users to remember the differences?
- If we do add `switch`, do we want user-defined classes to be usable as cases? If so, how would we support them?

> The fact that the symbol ~~ has no meaning in math (nor in mainstream languages)

It is true that not many languages have something like smartmatch yet. And mathematics has no need for a "do what I mean" operator :) . I personally think smartmatch is a very efficient way (only one new operator) to support a wide variety of use cases, and to give users more flexibility. It does take some getting used to, but once you do, it saves you (as a person writing in Wren) from having to remember when to use `.contains()`, when to use `in`, when to use `.match()`, ... .
> Though I would use Assert.assert(!b.isEmpty).

I come from a Ruby background where we have `not` built into the core language, so it's stuck in my head. :-) I don't find `!` difficult to read (as long as there isn't also double negation (`!unlocked`) or "flipped concepts", i.e. `!open?` vs `closed?`), but I definitely prefer `not`. Ruby also has `unless` so we can often avoid the need for negation at the operator level entirely.
So adding `not` as a keyword would get no objection from me. Then I assume we could write something like:
`Assert[not b.isEmpty]`
`Assert.assert(not b.isEmpty)`
> If we do add switch, do we want user-defined classes to be usable as cases?

I feel switching only on built-in Core classes seems useful (clean input processing, etc), but ultimately quite limited.
I may be wrong but when @ChayimFriedman2 suggested adding a `not` keyword, I don't think he had in mind using it as a general replacement for `!` but just to negate `is` and (if we add it) `in`, given that they're words rather than symbols.
Well we certainly don't have to make it more generic... to me being able to express `a not in b` but not being allowed to express `not file.closed()` feels a little inconsistent... Why not:
`a ! in b`
i.e., `!` means not, period.
I guess I'm personally unpersuaded by the "they're words [already] rather than symbols" line of thinking, unless you're saying "words are better" or "words belong with words"... but then I'd point out `file` and `closed` are words also. :)
...but I'm not making a strong argument here, just providing my thoughts.
Although `not file.closed` isn't a great example because I'd write `file.open` instead of that, but I already lost that discussion elsewhere. :)
Well, for better or worse, Wren follows C in preferring to use symbols rather than words for operators. The only exception to this is `is` and, if we re-designate `in` as an operator, that would make two.
I actually agree with you that it would be inconsistent to introduce `not` just to negate these operators and another aspect I don't like is that `not` would follow `is` but precede `in`.
For these reasons I personally would prefer to use `!is` and `!in` as the negations of these operators even if they're a mixture of a symbol and a word. As you say yourself, we often use `!` with an identifier so we're used to this sort of thing anyway.
@PureFox48 re. https://github.com/wren-lang/wren/issues/989#issuecomment-830781833 --- the plot thickens! I didn't realize that `String` IS-A `Sequence`. When I implemented `~~` for `Sequence`, suddenly `String` went back to substring matching :D .
Another way to handle the exact-match case would be with a temporary list: `"a" ~~ "abc"` (substring), but `"a" !~ ["abc"]` (exact match because it's list containment). That seems a bit too subtle to me, but it is an option.
I opened a draft PR with the full proposal from the top post, as edited, in case you'd be willing to give it a try and see how it works in practice! `Str.~~(_)` does implement equality, not substring, in the current version of the PR.

> I didn't realize that String IS-A Sequence. When I implemented ~~ for Sequence, suddenly String went back to substring matching :D .

Although I knew `String` inherited from `Sequence`, I also knew that (unlike `List` and `Range`) it has its own override of the `contains` method which uses sub-string matching rather than testing that the string contains a single character.
I'd therefore assumed that this is what smart-matching would do as far as strings were concerned and that would include the possibility of an exact match.
However, what hadn't dawned on me is that we need to distinguish between sub-string and exact matching and that the Raku folks had concluded (as you just have) that the latter must win!
This inevitably means that this proposal has no easy way to do sub-string matching because, if you convert the string to a list, then it would only be able to match a single character. That would leave us with having to use a predicate function to provide this functionality.
So, whilst I think this proposal can still work (thanks for the draft PR), it's lost some of the attraction it had to me :(
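For readers following along, the three behaviours being contrasted here can be tried in current Wren without the PR. This is only an illustrative sketch; the variable names are made up, and the comments about `~~` describe the proposal rather than anything the released VM does.

```wren
var haystack = "abc"

// Exact match: what the revised proposal uses for `x ~~ str`.
System.print("a" == haystack)                //> false

// Sub-string match: String's own override of contains(_).
System.print(haystack.contains("ab"))        //> true

// Converting to a list only allows single-character matches.
System.print(haystack.toList.contains("a"))  //> true
System.print(haystack.toList.contains("ab")) //> false

// A predicate function recovers sub-string matching; under the proposal
// `"ab" ~~ inHaystack` would call it via the Fn rule.
var inHaystack = Fn.new {|needle| haystack.contains(needle) }
System.print(inHaystack.call("ab"))          //> true
System.print(inHaystack.call("x"))           //> false
```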
(Background: I have been thinking about smartmatch a lot recently. e.g., https://github.com/wren-lang/wren/issues/956#issuecomment-817233866 and https://github.com/wren-lang/wren/issues/968#issuecomment-819148504 . I realized I should actually propose it for independent discussion! This builds on my https://github.com/wren-lang/wren/issues/968#issuecomment-819954076 in the discussion of `x in y` as an operator. Thanks to everyone participating in #956 and #968 for thought-provoking discussion! Thanks also to all the folks who have worked on smartmatch in Raku over the years.)
Many programs at some point ask "does X have property Y?" or "is X part of collection Y?". For example:
I propose taking a page from the Raku programming language. Raku unifies these tests under a binary operator called "smartmatch".
Overview
Smartmatch is a binary operator, spelled `~~` in Raku. `val ~~ thing` checks whether `val` matches `thing`. What "match" means depends on `thing`. For example:
- `1 ~~ 1.0` since `thing` `1.0` considers numerically equal values to match
- `1 ~~ 1..10` since `thing` `1..10` considers all numbers in the range to match.

`~~` has the same precedence and associativity as `==`.
Every object has a special method that says whether a value matches that object. In Wren, I would use `~~(_)`, so `foo ~~ bar` would be exactly `bar.~~(foo)`. Note: this may require a new opcode --- https://github.com/wren-lang/wren/issues/968#issuecomment-819762998
`~~(_)` does not have to return a boolean since all Wren values can be tested for truthiness.
Advantages
- Handles multiple use cases without having to change the syntax.
  - `~~` could check for intersection.
  - `str ~~ regex` could check for regex match as a shorthand for `regex.match(str)`.
- Permits users to customize behaviour for their specific programs from pure Wren code
- Can test for list/sequence membership (#968) without risking confusion with `for`
  - `x ~~ [1,2,5]` is visibly different from `for(x in [1, 2, 5])`
Advantages when used with a `switch` statement
Smartmatch provides a very clean way to express `switch` cases (#956). Each case can be the right-hand side of a smartmatch. That way you can have any case expression you want without having to special-case syntax to support complex conditionals. For example, in `switch(val)`:
- `case 3: ...` would test `val ~~ 3`, which I suggest be implemented as `val == 3`.
- `case [1,2,5]: ...` would test `val ~~ [1,2,5]`, which I suggest test whether `val` occurs in list `[1,2,5]`.

Switch+smartmatch can support arbitrarily complex conditions using only Wren code. Programmers can define classes that implement `~~(_)` and encapsulate the conditions into those classes.
Smartmatch is a great complement to `switch` statements. I think it would be useful even if `switch` were not added to Wren. However, if you disagree, I certainly understand.
Suggested implementations for `~~`
A starting point for discussion.
- `Object.==(_)`. This also serves for `Bool`, `Fiber`, `Null`, `Num`, (edit) `String`, and `System`, and for optional `Meta` and `Random`.
- `x ~~ SomeClass` === `x is SomeClass`
- `x ~~ fn` === `fn.call(x)`. This allows functions to be used for complex tests.
  - e.g., `val ~~ Fn.new {|v| v>0 && v<=100 && v%2 == 1}` to test if `val` is an odd number between 1 and 99 (the explicit `== 1` is needed because `0` is truthy in Wren)
- `x ~~ map` === `map.containsKey(x)`
- `x ~~ seq` === `seq.contains(x)` (element membership). This also serves for `List` and `Range`.
- String: originally `x ~~ str` === `str.contains(x)` (substring test); changed to equality, see https://github.com/wren-lang/wren/issues/989#issuecomment-830723064

Edit per discussion below, adding `!~`, which is just like `~~`, but with the opposite result. I recommend that `Fn.!~(_)` throw, since I don't know right now what that would mean.
Implementation in the VM
I would add a `CALL_SWAPPED` opcode per https://github.com/wren-lang/wren/issues/968#issuecomment-820196805 . The same as regular `CALL`, but it takes the arguments in the opposite order. That would permit `~~` to be implemented without having to juggle the stack. However, that is only one of many possible options.
Thank you for reading all the way to the bottom :D .
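As a rough way to experiment with the suggested default rules before any VM change, they can be emulated in current Wren with an ordinary static method. This is a sketch under assumptions: `SmartMatch.matches(_,_)` is a hypothetical helper, the order of the type checks is my own choice, and `String` uses equality as in the edited proposal.

```wren
// Hypothetical helper: approximates what `value ~~ thing` would return
// under the suggested default rules, using only existing Wren features.
class SmartMatch {
  static matches(value, thing) {
    if (thing is Class) return value is thing
    if (thing is Fn) return thing.call(value)
    if (thing is Map) return thing.containsKey(value)
    // String is checked before Sequence because String inherits from Sequence.
    if (thing is String) return value == thing
    if (thing is Sequence) return thing.contains(value)
    return value == thing
  }
}

System.print(SmartMatch.matches(1, 1.0))          //> true
System.print(SmartMatch.matches(3, Num))          //> true
System.print(SmartMatch.matches(2, [1, 2, 5]))    //> true
System.print(SmartMatch.matches(5, 1..10))        //> true
System.print(SmartMatch.matches("k", {"k": 1}))   //> true
System.print(SmartMatch.matches("a", "abc"))      //> false
System.print(SmartMatch.matches(7, Fn.new {|v| v > 0 && v <= 100 && v % 2 == 1 })) //> true
```

Unlike a real `~~(_)` method, this central dispatcher cannot be extended by user classes, which is exactly the flexibility the operator is meant to add.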