ozra / onyx-lang

The Onyx Programming Language

Rename of `===` to `~~`, Extensions and Relation to `=~`-Operator #67

Closed ozra closed 8 years ago

ozra commented 8 years ago

This stems from ideas that have been floating around for a while; some prior discussion began here: https://github.com/ozra/onyx-lang/issues/24#issuecomment-197855835.

The final rundown as of now is:

I think === is a confusing symbol, especially for JS coders (and basically everyone has to write JS at some point today, for their own web pages if nothing else). Curlies (~) bring your thoughts to "fuzzier ways", so let's try it out.

In short, the proposition currently is: simply rename === to ~~.

The obvious downside of using ~= (the reverse of the pattern-match operator =~) is that all _= operators mean x _= y => x = x _ y, where the underscore stands for any operator. Therefore ~~ will be the primary suggestion. It also causes less confusion with =~.

Prior Art

Here are some snippets from the Wikipedia article on mathematical symbols and their meanings:

Given the above, there is some merit to the === op, but to me all variations of "approximate" symbols are more along the lines of "fuzziness" and convey the relation test more clearly.

The practical definition is: anything you expect to match in a branch/switch/match construct is matched with this operator (it is in fact what these constructs use). The peculiar case to remember is that, for instance, "fooo" ~= String is true, and likewise 47 ~= Int; that is, a value categorically matches its type. Of course, as mentioned, /foo/ ~= "has foo, ok!" holds as well, and so does the reverse.
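As a rough illustration of that definition, here is a minimal sketch in Ruby (used as a stand-in, since Ruby's `===` plays the same role in `case`/`when`). The helper name `classify` is hypothetical. Note one difference from the notation above: in Ruby the pattern is the receiver, so the order is `pattern === value` rather than `value ~= pattern`.

```ruby
# Hypothetical helper: classify a value the way a case/when branch would.
# Each `when` clause implicitly calls `pattern === value`.
def classify(value)
  case value
  when /foo/   then :has_foo   # Regexp === value  (true if the string contains "foo")
  when String  then :string    # String === value  (a value matches its type)
  when Integer then :integer   # Integer === value
  else :other
  end
end

type_match  = (String === "fooo")          # type categorically matches its instance
int_match   = (Integer === 47)
regex_match = (/foo/ === "has foo, ok!")   # Regexp#=== answers with true/false only
```

The same operator drives all three branches, which is why its return type matters so much for the rest of this discussion.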

The "are in the same equivalence class" meaning touches closely enough upon the current use (~), but a lone ~ already has uses, and further or modified uses are planned where its terseness will be needed.

I keep gravitating back to ~= and !~= for the negated version. But there are those snags as mentioned. The primary alternative then: ~~ and !~~.

All in all, this is really nitpicking: you almost never use === (or ~~) in code. But the idea here is to also leverage it for regex comparisons when results are not needed, etc., which can be utilized for optimizations in the future. If it turns out to be a bad idea (some ideas must! So far things have gone almost too well.. hehe) we'll just revert.

I'm counting on some debate about the choice here so that it lands well.

stugol commented 8 years ago

Shortly, the proposition currently is Simply rename === to ~~

Since when has this been the proposition?

Also, less confusion with =~

Why not simply promote =~ to also do ==='s job?

a = "1"

a =~ /1/        -- 0    (normal regex result)
a =~ 1          -- true
a =~ String     -- true
a =~ true       -- true
a =~ 2          -- false

Abolish ===, and use =~ and !~ for all purposes.

We could use ~~, but there are issues with that. Firstly, we have to keep =~ alongside it, or consume it into ~~, in which case why not just keep the =~ syntax? Secondly, !~~ is a bit verbose, and breaks the "replace first = with a !" syntax we're all used to.

Whichever symbology we choose, it should be all-encompassing. If we end up going with ~~, for example, then ~=, =~ and === should all cease to exist.

ozra commented 8 years ago

The problem, as I elaborated on (albeit at some length), is that =~ is widely recognized as "regexp-match"; it also, as you show, produces match results and gives the position as its return value. The point of the wider "categorically, or definitionally, equivalent" operator is to always return either true|false, nothing else. That's why my initial spur-of-the-moment idea of marrying them fell apart when I studied the use cases.

I agree !~~ is a bit off-beat with the common pattern of negation. However, this one could double for both use cases, provided only true|false is needed when a regex match is negated? I would assume that would be the expected result. If so, I'll change !~~ to !~ and there won't be a need for a specific one mirroring =~, since the mirror function would return the exact same result. I haven't implemented negated regexp match yet, but it would be quick then (just one line in stdlib).

stugol commented 8 years ago

What's wrong with =~ returning true/false in some cases, and a regex-match in others? 0 is truthy, after all.
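For readers unfamiliar with the behaviour being referred to, here is a small sketch in Ruby (a stand-in; the thread states Onyx/Crystal's `=~` likewise yields a position). The helper name `match_position` is hypothetical.

```ruby
# Hypothetical helper: `=~` yields the index of the first match, or nil.
def match_position(text, pattern)
  text =~ pattern
end

at_start = match_position("1", /1/)   # match found at index 0
no_match = match_position("1", /2/)   # no match, so nil

# 0 is truthy in Ruby (only nil and false are falsy),
# so a match at index 0 still takes the "matched" branch.
branch = at_start ? :matched : :not_matched
```

So the result works as a condition, but it is an Integer-or-nil rather than a Bool, which is the distinction the next replies argue about.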

ozra commented 8 years ago

So the reasons are practical rather than failing on logic. I'll leave it open for re-consideration though.

I'm changing the !~~ to !~ in PoC-implementation to begin with.

stugol commented 8 years ago

What's a "PoC"?

I really don't see the problem. We're not changing the meaning of =~ when passed a Regex; so interoperability should be unaffected. No Crystal code should pass non-regexes to =~, so adding === functionality to it shouldn't cause any compatibility problems.

ozra commented 8 years ago

PoC == proof of concept.

The problem is for the ~~ case (===) since the operator is implicitly applied to all switchish compares (in Crystal and Onyx).

And, there's a perk I've forgotten to mention, complete side note, but related to the interoperability!

onyx, the compiler, supports compiling a Crystal program, allowing you to write Crystal as usual while also using Onyx libs/code. This allows a Crystal lover to code as usual, just require-ing Onyx code like any other code. Should someone later favour going Onyx for their main code, I'll hopefully have my "onyxify" converter done soon (I'm working on it in a branch now and then; it shouldn't be too hard to get done).

Essentially this means, if you wanna start to play, you can continue on your crystal code as is, using onyx as the compiler, and then write small parts in onyx where wanted (in their own files of course).

Onyx's "Crystal knowledge" is usually not more than about two weeks behind Crystal's main development.

stugol commented 8 years ago

the operator is implicitly applied to all switchish compares

I still don't understand the problem. Would it help to simply declare an implicit === function on all Onyx classes that don't have their own, and then have all case-type constructs call ===?

class Whatever
   ===(arg) -> =~(arg)
   !==(arg) -> !~(arg)

Perhaps you could explain the problem to me a bit more clearly?

ozra commented 8 years ago

It all has to do with the reliability of the "contract": the comparator promises a Bool at all times. Consistency matters in such a delicate area as "implicit invocation for truthiness". Murphy's Law can be counted on to predict that havoc will be wreaked otherwise ;-)
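To make the "contract" concrete, here is a minimal sketch in Ruby, under the assumption that the proposed `~~` behaves like a case-equality check coerced to a strict Boolean. The names `fuzzy_match?`, `fuzzy_mismatch?` and `LeakyPattern` are hypothetical and are not taken from the Onyx sources.

```ruby
# Hypothetical stand-in for the proposed `~~`: always true or false.
def fuzzy_match?(value, pattern)
  !!(pattern === value)   # `!!` coerces any truthy/falsy result to a strict Boolean
end

# Hypothetical stand-in for `!~`: the plain negation of the above.
def fuzzy_mismatch?(value, pattern)
  !fuzzy_match?(value, pattern)
end

# Hypothetical pattern class whose === "leaks" a match position instead of a Bool.
# Assumes `value` is a String.
class LeakyPattern
  def initialize(regex)
    @regex = regex
  end

  def ===(value)
    value =~ @regex       # Integer index or nil, not true/false
  end
end

leaky = LeakyPattern.new(/1/)
raw   = (leaky === "1")             # the leaked Integer position
safe  = fuzzy_match?("1", leaky)    # coerced to a Boolean
```

With the coercion in one place, every implicit caller sees the same type regardless of what an individual class returns from its comparator.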

stugol commented 8 years ago

Hm. I'm not convinced, but then again, I'm not qualified to write a compiler.

So you reckon ~~ and !~, then? And presumably, for classes that don't declare them:

~~(v) ->
   ===(v) || !!(v && v.is_a?(Regex) && =~(v))
!~(v) ->
   !(~~(v))

Does ~~ simply replace ===, conceptually? As in, no Onyx class should ever define ===?

stugol commented 8 years ago

Thing is though....if ~~ replaces === and =~, and returns true|false|int...and all conditionals use it....how is that any safer than usurping =~?

Or do you intend ~~ to always return true|false, and leave regexes out of it? In which case, aren't you simply using a different name for ===?

ozra commented 8 years ago

Haha, yes. That's where the "nitpicking" came in, since the final result of it all is pretty much as you say (as seen in the issue OP; maybe I was unclear? Not too uncommon :-/ ).

As to the definitions you showcased above, it's pretty much like that: https://github.com/ozra/onyx-lang/blob/ca7a61a34f7fd8d410fd8d9f28fd068cc616f476/src/onyx_object_additions.ox#L3

https://github.com/ozra/onyx-lang/blob/490cb6f6556523db698af0e73ec63e5adb7c94c4/src/onyx_object_additions.ox#L4

And: https://github.com/ozra/onyx-lang/blob/ca7a61a34f7fd8d410fd8d9f28fd068cc616f476/src/onyx_regex_additions.ox#L5

These are in a "PoC"-added state, meaning they're awaiting refactoring to get better file placement etc. All PoC implementations in this changes-all-the-time phase are done rather swiftly. When something gets more of a "will likely stay in" status, I clean it up.

[ed: edited title to reflect issue better]

stugol commented 8 years ago

!~ is added

But !~ already existed. You mean you've extended its functionality, not added it.

You seem to have added an is keyword. How does that work?

$~ = the-match -- TODO this should be expected to not exist for future optimization purposes

Huh? You're abolishing $~? Or....just not setting it in ~~, maybe? I disagree with that decision, as I often do stuff like:

if text =~ /pattern/ && [ possible values ].include?($1)
  ...

So, given that the result of =~ is not needed, I'd now do:

if text ~~ /pattern/ && [ possible values ].include?($1)
  ...

So either ~~ should set $~, or I shouldn't use ~~ for regex matching.
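The trade-off has a close parallel in Ruby (2.4 or later), sketched here as a stand-in: `=~` sets `$~` and `$1` as a side effect, while `Regexp#match?` returns only true/false and leaves them untouched. The constant `ALLOWED`, the pattern, and the helper names are hypothetical.

```ruby
ALLOWED = %w[red green].freeze   # hypothetical stand-in for "possible values"

# Uses =~, which sets $~ (and therefore $1) as a side effect.
def allowed_with_globals?(text)
  !!(text =~ /color=(\w+)/ && ALLOWED.include?($1))
end

# Regexp#match? answers true/false without updating $~,
# so $1 is still nil after this call.
def capture_after_boolean_match(text)
  /color=(\w+)/.match?(text)
  $1
end
```

A Boolean-only operator that skips the match bookkeeping therefore cannot be combined with `$1` in the same condition, which is the point being made above.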

ozra commented 8 years ago

But !~ already existed. You mean you've extended its functionality, not added it.

It wasn't in the lexer in Onyx, hence I added it, but I might have taken it out at some earlier point; dev has been too fast, so I can hardly remember myself :-/. In any event: it's the opposite of ~~ (like != to ==), not the negation of =~, but since ~~ does the Boolean-only check of a regex match, the negation has the same functionality as if it were for regexps.

You seem to have added an is keyword. How does that work?

I cut n' pasted this from #5:

and, or, is, isnt, not are available in addition to &&, ||, ==, !=, ! - they behave exactly the same as their symbolic counterparts; it's merely a matter of lexical choice. ed: Aim for clarity of intent.

Huh? You're abolishing $~? Or....just not setting it in ~~, maybe? I disagree with that decision, as I often do stuff like:

In a way I would like to abolish it entirely; I don't find it to be a good coding practice. I've survived without it in very regexp-heavy programs in many languages without ever wishing for such a horror ;-).

The way it "should be", as far as I'm concerned, is like the return value of anything else in the world (in this case: the match-op returns the match object, or nil for no match):

if (m = text =~ /pattern/) && [ possible values ].include?(m.1)
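The same style can be sketched in Ruby (a stand-in; there the match object comes from `Regexp#match` rather than from `=~`, and captures are read with `m[1]` rather than `m.1`). The constant, pattern, and helper name are hypothetical.

```ruby
ALLOWED_VALUES = %w[red green].freeze   # hypothetical stand-in for "possible values"

# Keep the match result in a local variable instead of relying on $~ / $1.
def allowed?(text)
  m = /color=(\w+)/.match(text)         # MatchData, or nil when there is no match
  !m.nil? && ALLOWED_VALUES.include?(m[1])
end
```

Here the capture travels through an ordinary return value, so nothing depends on implicit per-call state.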

But, it's not super important to me, so if it's liked, it's better it stays in. If I don't use it, I don't have to care B-) So that's that.

And you're right: you shouldn't use ~~ in your example, since you are using the result! The result is accessed through the implicit $1.

The $_ would be a better candidate for implicit soft-lambda parameters (the _1/%1/@1 discussion), then at least there's a defined context for the implicits, but, anyway, that's that.

stugol commented 8 years ago

I like $~. Leave it alone! ;)

ozra commented 8 years ago

Haha, ok. I think it actually has its uses in "script-like" programs, and this is really one of those "don't use it, then don't worry" kind of things that there is no real gain, holistically, from removing. So, keep enjoying it! :-)