ozra / onyx-lang

The Onyx Programming Language

Feature suggestion: brittle tuples #24

Open stugol opened 8 years ago

stugol commented 8 years ago

I'm not sure what this should be called, but I think it's important.

Consider the following code:

the-function ->
    try
        do-something
        {true, [""]}
    catch e
        {false, e}

success, data = the-function()  -- great :)
success2 = the-function()       -- not so great :(

if the-function()               -- this is really bad!
    ...

In this example, success2 would contain a tuple, of type {Bool, Array|Exception}. But that's not very useful. Ideally, we want some way to return a tuple from a function, such that any values not assigned would be lost:

success = the-function          -- second value is lost
success, data = the-function    -- second value is kept

In both cases, the type of success would be Bool, which is what we want.

Maybe some kind of "brittle tuple" class, that inherits from Tuple, and cannot be assigned wholesale to a variable or passed to another function or control structure. And a special syntax:

the-function -> return {{ true, "" }}       -- returns a Tuple that must be destructured
thing = the-function()                      -- automatic destructuring: only gets first value

[edit / ozra: latest distilled conclusion of issue in https://github.com/ozra/onyx-lang/issues/24#issuecomment-250832621]

ozra commented 8 years ago

I'm not completely following here, but a primary theme as to mechanism seems to be "auto-discarded unused values in an implicitly induced multi-assign", did that make sense? And if so am I understanding correctly?

Otherwise, as to the examples, I can think of a lot of ways of making the uses smoother without that.

Would you care to explain it, or show it in a few more ways, so I get a grip on its value?

stugol commented 8 years ago

Pretty much, yes. We're talking about functions that return a status value, but may also return additional information that, quite often, we won't care about.

Consider a function that executes a shell command and returns a tuple, where the first value is a boolean ("did the command succeed?") and the other values are the textual output of the command (stdout, stderr output). For example:

puts execute("ls")                  -- {true, [".", "..", "file1", "file2"], []}
puts execute("no-such-command")     -- {false}

If we don't care about the output, and just want to know if it worked, we currently need to do something like if execute(cmd)[0] or maybe a destructure:

{success, _, _} = execute(cmd)
if success
    ...

But that's a pain. More importantly, suppose I forget? The following if is always true:

if execute(cmd)         -- bug!
    ...
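The same pitfall can be reproduced in Python, used here purely as an illustration because its non-empty tuples are likewise always truthy. This is a minimal sketch: `execute` is a hypothetical stand-in that fakes the shell call, and `ran_ok_buggy` / `ran_ok_fixed` are names invented for the example.

```python
def execute(cmd):
    """Hypothetical stand-in for the `execute` function discussed above.

    Returns (success, stdout_lines, stderr_lines) on success and
    (False,) on failure, mirroring the tuples in the example.
    """
    known = {"ls": [".", "..", "file1", "file2"]}  # faked command output
    if cmd in known:
        return (True, known[cmd], [])
    return (False,)


def ran_ok_buggy(cmd):
    # Bug: any non-empty tuple is truthy, even (False,).
    return bool(execute(cmd))


def ran_ok_fixed(cmd):
    # Fix: inspect the status flag explicitly.
    return execute(cmd)[0]
```

With these definitions, `ran_ok_buggy("no-such-command")` reports success while `ran_ok_fixed("no-such-command")` does not, which is exactly the mistake a brittle tuple is meant to rule out.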

It would be nice to be able to ignore the extra values if all we care about is some kind of success or status value. Similarly, we could use this for HTTP requests:

do-http(url) ->
    [some http query code]
    if [status-code is some kind of success value]
        {status-code, body}
    else
        {status-code, status-message}

if do-http(url)
    ...

Allowing a function to return a brittle tuple - one that must be destructured if you care about the other values - would [a] avoid bugs and [b] make calling them a lot simpler if you don't care about the extra results.

I suggest a {{ }} syntax for brittle tuples.
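As a rough approximation of the idea in Python (an illustration only, not Onyx semantics), a tuple subclass can at least make conditionals follow the first element. Python has no way to make a plain assignment destructure implicitly, so this sketch covers only the "if is always true" half of the proposal. `BrittleTuple` is a hypothetical name.

```python
class BrittleTuple(tuple):
    """Sketch: a tuple whose truthiness follows its first element."""

    def __bool__(self):
        # An empty brittle tuple counts as falsy; otherwise defer to
        # the status value stored in the first slot.
        return len(self) > 0 and bool(self[0])


ok = BrittleTuple((True, [".", ".."], []))
failed = BrittleTuple((False,))

# Destructuring still works as for any tuple.
success, out, err = ok
```

Here `if failed:` takes the else branch, while callers that want the extra values can still unpack them.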

ozra commented 8 years ago

I really have to dive in to bed now, but can't resist quickly commenting: the concept is growing on me - interesting!

ozra commented 8 years ago

In response to https://github.com/ozra/onyx-lang/issues/56#issuecomment-196022346:

My first thoughts regarding the brittle tuple-example was the one you mention in above linked comment: Maybes. And then I thought, "why not just Value|Nil?". Well, because maybe (as per earlier example) you want some value with the "false result" too (would not be possible with either Maybe or Value|Nil).

I think the best approach would be to settle the "nil-helper-sugar", perhaps have a specific ?? ("truthy?"), just an idea:

if (x = return-something-or-maybe-some-expected-fault)??
   use-the-bona-fide-val x
else
   say "Didn't work out {x.message}"

Note that the actual returned value is assigned, then the "is it truthy" check is done on that for the condition.

?? operator would thus be (with specific user type as last example):

type Object: truthy() -> true
type Bool: truthy() -> self == true  -- true check redundant - just for clarity
type Nil: truthy() -> false
type MySpecialType: truthy() -> @value-is-ok

Note, just as a reminder of how expression blocks in Onyx are formed, these are equivalent:

type Object => truthy() -> true
type Bool => truthy() -> self == true
type Nil => truthy() -> false
type MySpecialType => truthy() -> @value-is-ok

type Object
   truthy() -> true

type Bool
   truthy() -> self == true

type Nil
   truthy() -> false

-- elaborates a bit more in this example this time
--  - just one arbitrary way of utilizing it...
type MySpecialType
   @value-is-ok = false
   @some-value = Whatever()  'get
   @message = ""

   init(@value-is-ok, @some-value-or-message) ->
      if @value-is-ok
         @some-value = @some-value-or-message
      else
         @message = @some-value-or-message

   init(@value Whatever) ->
      @value-is-ok = true
      @some-value = @value

   init(_ Nil) ->
      @value-is-ok = false   -- redundant, just for clarity
      @message = "Some error"

   init(@message String) ->
      @value-is-ok = false   -- redundant, just for clarity

   truthy() -> @value-is-ok

Thus:

is-nil = nil
if is-nil?? => say "Won't happen"

is-a-val = MySpecialType Whatever 47
is-not-ok-val = MySpecialType "Shit happened"

if is-a-val?? => say "Yeay we got a val here: {is-a-val.some-value}"
if is-not-ok-val??
   say "Won't happen: {is-not-ok-val.some-value}"
else
   say "As expected: {is-not-ok-val.message}"

I like this approach.

Thoughts?

stugol commented 8 years ago

The inline ?? operator is fine, because it requires spacing; but the syntax you propose would conflict with functions that end with ?.

valid? ->

if valid???      -- wtf?

Why not simply allow a class to define whether it's considered truthy or not, and have that affect conditionals?

type MyType: truthy() -> false
fn() -> MyType()

if value = fn()
   say "this line will not be called"
else
   say value
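For comparison, Python already works this way: a class decides its own truthiness through `__bool__`, and conditionals honour it. A minimal sketch mirroring the snippet above, with `describe` added as a hypothetical wrapper so the result can be inspected:

```python
class MyType:
    def __bool__(self):
        # This type always counts as falsy in conditionals.
        return False

    def __str__(self):
        return "MyType (falsy)"


def fn():
    return MyType()


def describe():
    # Assign first, then test the assigned value's truthiness.
    if value := fn():
        return "this line will not be reached"
    else:
        return str(value)
```

The assignment expression requires Python 3.8 or later.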

I still think the <- operator is a good solution here:

type MyType: truthy() -> false
if MyType()                   -- false
   ...

value = Control<String>(:blah, "message")
case content <- value         -- matches against the first member, and assigns the second member
   when :blah
      say content             -- "message"

if content <- value           -- control value is truthy, so proceeds as above
   say content                -- "message"

Alternative syntax:

case value                     -- matches against the first member
   when :blah -> content
      say content              -- "message"

ozra commented 8 years ago

Yes the function? is a problem, this goes for syntax for the nil-sugar-notation also, as mentioned in #21. Still thinking about that one.

Making it implicit (for if-conditionals) is good idea. What directly comes to mind then is that if adds a "comparable" operation with true: if x => if x === true. Then it can be implemented via the === operator. case/switch/what-you-call-it already uses this operator implicitly.

So: type MyType: ===(val Bool) -> if whatever-condition-the-type-requires-is-true ? val == true : val == false

This reminds me, I think the === is confuzing, it should be ~== (also the "regexp compare", =~, should be ~= or ditched in favour of using that for === and implementing an alias to former =~ from ~= (former ===), since that's what you'd expect anyway if making a case/match on a regexp).

Thoughts?

Regarding the <- operator suggestion: what's your definition of its operation in full, so I'm not mistaken?

stugol commented 8 years ago

=~ exists because !~ exists. It's a duality that I frequently make use of.

I agree, however, that ~== makes a bit more sense than ===. So....what you propose is to unify === with =~?

type MyType:
   ~==(pattern Regex) -> pattern.match(self)      -- or whatever code is correct for a regex test
   ~==(b Bool) -> !(some-condition ^ b)

I'm loath to break the =~/!~ duality, however. Perhaps simply extend =~ and !~ to encompass === functionality, and ditch ===?

type MyType:
   =~(pattern Regex) -> pattern.match(self)      -- or whatever code is correct for a regex test
   =~(b Bool) -> !(some-condition ^ b)
   !~(arg) -> !(self =~ arg)

Also, you should make !~ implicit if not defined, implementing it precisely as given above.
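The "derive !~ from =~ when it is not defined" rule can be sketched in Python with a small mixin. Everything here is illustrative: `Matchable`, `matches`, `not_matches` and `Word` are hypothetical names standing in for `=~`, `!~` and a user type, and "non-empty text" is an arbitrary stand-in for `some-condition`.

```python
import re


class Matchable:
    """Mixin: subclasses define matches(); not_matches() comes for free."""

    def matches(self, arg):
        raise NotImplementedError

    def not_matches(self, arg):
        # Implicit negation, i.e. `!~(arg) -> !(self =~ arg)`.
        return not self.matches(arg)


class Word(Matchable):
    def __init__(self, text):
        self.text = text

    def matches(self, arg):
        if isinstance(arg, re.Pattern):
            # Regex test against the wrapped text.
            return arg.search(self.text) is not None
        if isinstance(arg, bool):
            # Same truth table as `!(some-condition ^ b)`.
            return bool(self.text) == arg
        raise TypeError("unsupported pattern type")
```

A type only has to supply `matches`; the duality is preserved without writing the negated form by hand.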

stugol commented 8 years ago

Regarding the <- operator suggestion: what's your definition of its operation in full, so I'm not mistaken?

First, the Control<C, T> type:

type Control<C, T>
   @control C
   @value T

And the usage:

if a <- b
   ...

Where b is a value of type Control<C, T>. The expression assigns the T value of the Control to a, but evaluates to the C value. For example:

fn(result) ->
   case value <- result      -- binds value to the T part, matches on the C part
      when :continue
         say value
      else
         say "Stopping because " + value

fn Control(:continue, "Hi!")
fn Control(:stop, "I'm tired")

Output:

Hi!
Stopping because I'm tired

So nothing to do with brittle tuples, really. It's a different issue.
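A rough Python rendering of the Control idea, offered as a sketch only. Python has no `<-` operator, so binding the payload and branching on the control part are written as two separate steps; the function returns its message instead of printing it, and strings stand in for the `:continue` / `:stop` symbols.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

C = TypeVar("C")
T = TypeVar("T")


@dataclass
class Control(Generic[C, T]):
    control: C  # the part a `case` or `if` would look at
    value: T    # the payload that `<-` would bind


def fn(result):
    # Emulates `case value <- result`: bind the payload first,
    # then branch on the control part.
    value = result.value
    if result.control == "continue":
        return value
    return "Stopping because " + value
```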

stugol commented 8 years ago

Brittle tuples are for allowing a function to return multiple values whilst allowing the caller to safely pretend it only returns a single value (the first value of the tuple).

Control values are for flow control; to replace break and next keywords in blocks or visitors.

ozra commented 8 years ago

With regards to the brittle tuple issue then, I think if implicitly calling === true (whatever the operator would be called if changed) on the supplied expression should suffice, so this could be closed, no? I foresee exactly zero problems with such a construct, and LLVM will optimize unnecessary parts (only gotcha is that it will increase compile time slightly).

For the other issue it would be better in a new GH-issue to keep things focused (and me not confused [got the spelling right ;-)])

But quickly put: It would be reasonable for the flow-control use case to simply return a desired value and otherwise, for flow control, to return for example Stop "message". This could of course be a lot smoother to handle with a true match construct, but is easy enough now too:

match x
   Stop => say x.message
   Break => say "{x}"  -- provided `to-s` is defined for Break [ed: corrected typo here]
   Warp => say x.to-s  --  -""-
   else => say "got a usable value"

Onyx still follows the pattern of Crystal that when a type is used as when-condition, it is assumed that one wants to compare it against the type of the matched expression, and not the value (also this is in fact implemented through the === operator).
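The "match against the type, not the value" behaviour can be sketched in Python with `isinstance` checks. `Stop` and `Break` are arbitrary user-defined types, as in the example above, and `handle` is a hypothetical name; it returns the message rather than printing it.

```python
class Stop:
    def __init__(self, message):
        self.message = message


class Break:
    def __str__(self):
        return "Break"


def handle(x):
    # Each branch tests the type of x, which is what a type used
    # as a match condition means in the snippet above.
    if isinstance(x, Stop):
        return x.message
    if isinstance(x, Break):
        return str(x)
    return "got a usable value"
```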

stugol commented 8 years ago

You don't think actually implementing brittle tuples is a good idea, then? My reasoning is as follows:

fn ->
   <<#success, some-information-we-don't-usually-care-about>>    -- brittle tuple

result = fn()

We want result to equal #success, not a tuple; because remembering to destructure every damn call is unnecessarily wordy and easy to screw up. Essentially, what we want is optional return values, just as you can have optional arguments.

stugol commented 8 years ago

provided to-s is defined for Break

That would be to_s.

Warp

Wtf is Warp?

Break => say "{message}" -- provided to-s is defined for Break

How the hell is that supposed to work? Where is message coming from?

match x

I guess it's good enough that we don't need an explicit Control<C,T> class or the <- operator. We should improve the match support instead, then:

match x
   Stop message => say message
   Break message => say "{message}"  -- provided `to-s` is defined for Break

We'd need some way to define "tagged unions" - which is a good idea in and of itself.

stugol commented 8 years ago

Um...how come the match doesn't require when keywords? I don't recall this syntax being discussed. I mean, it's a great idea, but it's news to me ;)

ozra commented 8 years ago

You don't think actually implementing brittle tuples is a good idea, then? My reasoning is as follows:

I'll leave this open for now, for further discussion and to not dismiss it prematurely. The idea of "multiple return values", rather than a specific type per se, requiring destructuring as you mentioned earlier could be quite the idea. Then single assign for ret-val would not work, and would cause an error so that one is reminded of handling all the return values, which of course would be simple as indexable-destructuring (as if it was a tuple, even though it's considered "multiple return values" by the language):

-- like this
result = fn()[0]

-- or (equal)
result = fn().0

-- or...
[result, ..._] = fn()  -- a bit wordier and uglier

-- the following will not compile - all return values are not handled:
result = fn()

Maybe though, multiple return values and obligation to handle could/should be separated. I've had this idea for "code-modes" (lenient, strict, etc.) where one can increase the demands. One of the possible optional demands I've imagined would be to always handle return values. Non-use of a value is made explicit via, say, a discard keyword (inspired by Nim), or _ = the-fn() (terser and goes with throw-away syntax in other places). Another alternative then, would be to add a pragma for func-defs to demand handling of the ret value(s), or not. In that case x = y() should return only the first value - as you wrote it in your example. I think this needs some thinking still, there are great qualities in the ideas, it just has to click. A literal for "multiple return values" which is somewhat a "control flow" construction rather than a type then could still be tuple-like, like in your example: <<my, ret, vals>>. The return-keyword would not require such notation: return my, ret, vals

That would be to_s.

Nope. It would be to_s, to-s or to–s (or even toS, but there are many edges to that case, so it will be reduced to be available only for transpiling from other code with the humps to hump-free Onyx source)

Wtf is Warp?

Some arbitrarily chosen "control code", just to clarify that these are in fact arbitrarily chosen types, and not part of the language.

How the hell is that supposed to work? Where is message coming from?

Sorry about that, a sneaky typo. Should have been x, not message. Corrected now.

We should improve the match support instead, then:

A definite yes, a specific match-construct with destructuring.

Revisit #13 and search for "Switch" for a run-down of the multitude of keywords and styles of formatting switch/when/case/select/branch/code/whatever constructs :-D

I'll post a follow-up reply in #13 regarding this question.

We'd need some way to define "tagged unions" - which is a good idea in and of itself.

Given the way one can easily work with union types (sum types) for variables, this isn't really that necessary. It's "kind of part of the type system".

Um...how come the match doesn't require when keywords? I don't recall this syntax being discussed. I mean, it's a great idea, but it's news to me ;)

Also, see #13 :-)

ozra commented 8 years ago

Regarding the =~ vs === etc., I'm opening a new issue.

stugol commented 8 years ago

single assign for ret-val would not work, and would cause an error so that one is reminded of handling all the return values

I disagree. I think it should be acceptable to only handle as many return values as you want. The errors creep in when you - currently - end up grabbing all the values as a tuple, rather than simply getting the first value:

fn ->
   <<1, 2, 3>>

say fn()           -- should print "1", not "<1, 2, 3>".

You should be allowed to ignore return values; but avoid getting an unintended tuple value.

It would be to_s, to-s or to–s

Then x-y matches x_y and x-y, and also sometimes xY? That's useful. Although I think it should only match these other things when calling into Crystal code. If you define a function in Onyx called x_y, it shouldn't be callable as x-y, I reckon.

stugol commented 8 years ago

To clarify, a brittle tuple causes an implicit destructure. The following calls are equivalent:

fn ->
   <<1, 2, 3>>

a = fn()
[a] = fn()      -- if this is valid Onyx?
a, _ = fn()

There is no way to receive the tuple by calling the function. It is "brittle", and falls apart in transit ;)

stugol commented 8 years ago

Alternatively, if you want to preserve the tuple - say, if you're passing it on:

values = &fn()
say values     -- <<1, 2, 3>>

The & holds the tuple together and prevents it breaking up. It's explicit, and can't happen by accident. & makes sense because you're kinda getting a "reference" to the tuple, rather than spilling its contents all over the floor.
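One way to approximate both halves in Python (first value by default, whole tuple only on explicit request) is a decorator. This is a sketch under stated assumptions: `brittle` and the `.whole` attribute are hypothetical names, with `.whole` playing the role of the `&` prefix.

```python
import functools


def brittle(fn):
    """Make fn() yield only its first return value by default."""

    @functools.wraps(fn)
    def first_only(*args, **kwargs):
        return fn(*args, **kwargs)[0]

    # Explicit escape hatch: the undecorated function keeps the tuple intact.
    first_only.whole = fn
    return first_only


@brittle
def fn():
    return (1, 2, 3)


a = fn()             # only the first value
values = fn.whole()  # the complete tuple, requested explicitly
```

The tuple can never be received by accident, which is the property the proposal is after.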

ozra commented 8 years ago

Then x-y matches x_y and x-y, and also sometimes xY? That's useful. Although I think it should...

Delimiting style is one of the most controversial topics among coders, and one for which it is still hard to find good studies. I therefore believe it's beneficial to allow the styles interchangeably. You can write a lib with function do-magic-stuff and someone hell-bent on archaic underscores can still use it in their code with their coding style intact as do_magic_stuff. I have a hard time imagining the need for identifiers x_y and x-y to not refer to the same thing. Read more on the details in #9.
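The equivalence described here amounts to comparing identifiers after normalising their delimiters. The following Python sketch is an assumption about how such a rule could look, not the Onyx compiler's actual algorithm; `normalize` is a hypothetical helper, and the hump handling is deliberately naive.

```python
import re


def normalize(identifier):
    """Map hyphen, en dash and hump styles onto one underscore form."""
    # Hyphen and en dash are treated like underscore.
    name = identifier.replace("-", "_").replace("\u2013", "_")
    # Assumption: a lowercase letter or digit followed by an
    # uppercase letter marks a "hump" boundary.
    name = re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name)
    return name.lower()


def same_identifier(a, b):
    return normalize(a) == normalize(b)
```

Under this rule `do-magic-stuff`, `do_magic_stuff` and `doMagicStuff` all compare equal.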

To clarify, a brittle tuple causes an implicit destructure. The following calls are equivalent:

Yes, I fully grasp the concept, I'm just uncertain that it will be helpful reducing unintended bugs, instead of causing more bugs, which would be swallowing camels and all that.

[a] = fn() -- if this is valid Onyx?

Correct, this is the destructuring syntax for indexables in Onyx. a, b, _ = x is a syntax error (should be [a, b, _] = x).

stugol commented 8 years ago

Swallowing....camels?

ozra commented 8 years ago

Well, you know the saying, "sifting out the mosquitoes just to swallow the camels", or something - I translated from Swedish, maybe it's not a saying in the anglo-linguistic sphere? 8-/

stugol commented 8 years ago

Nope, we don't have that one over here. Not many camels in the UK.

ozra commented 7 years ago

Re-reading this again after some time.

The "multiple return values" concept (I prefer that over brittle tuple, indicating that it is indeed not really a type but a language construct), allowing just assigning one var (without multi-assign) is starting to make sense to me. If indeed all values were wanted, one will probably use the ret as if it was a tuple - in which case it will error immediately. And the "always true, although error" problem will of course not manifest, which was the target in the first place. What I'm still not certain about is explicit wanted "tupleization" of the return values. Perhaps simply re-use multi-assign with one splat-item: [...my-tup] = some-fun(). Also, I think return should be mandatory if returning multiple values: foo() -> return 1, "two", #three - instead of allowing a type-literal-ish construct - to clearly convey that it's a linguistic construct that has other ways of handing over the data rather than one return value of some magic type. Of course, under the hood, this will just be a tuple, but that's an implementation detail.

Needs further thinking yet. But as said earlier, there are definitely qualities in these ideas that should be made use of.

It's not open-and-shut easy to implement, because it affects code in call-locations depending on which function overload is chosen at inference time - but is fully possible to get done.

[ed: clarified some wording for better reminder-to-self value of content]

Sod-Almighty commented 7 years ago

Sounds about right.

ozra commented 7 years ago

Follow up to https://github.com/ozra/onyx-lang/issues/97#issuecomment-299814332

Follow up

(multiple-value-bind (success output errors return-code) (execute ('ifconfig'))
  (if (success)
...

Isn't exactly what I find terse though ;-) (Incidentally, btw, you've unbalanced the program by three parentheses in total, spread out; that's why I find significant space superior: that error is very unlikely to take place in Onyx.)

Parenthesis aside (no pun...).

Reflection

The non-brittle-tuple solutions we've discussed earlier are in the right direction of what is needed, imo: Maybe, Option, Control, <- operator, if true ~~ cond and using ~~ overloading, etc. With all these different variations on the maybe-a-result-or-an-error concept we're nearing a clean solution.

C++ Example

A summing up taking earlier discussed ideas and my current mindset into account:

Proposal for Onyx

We can do a lot better in Onyx, via for instance ~~ as discussed in the earlier comments in this issue (but a truthy?()/falsey?() system will be better), and one additional rule in type inference.

Let's begin with a good to have stdlib addition:

type Err‹T›
   @error T
   Self.truthy?() -> false  -- important! type inference will use this!
   init(@error T) ->
   ~~(v Bool) -> not v     -- `(Err(47) ~~ true) is false`
   to-s() -> @error.to-s
   get() -> @error

Then some user lib:

type Foo
   do-stuff() -> say "Yeay! I'm a Foo!"

maybe-true?() -> Random().next-bool

foo() ->
   if maybe-true?
      Foo()
   else
      Err "Shit happened!"

And finally, let's use it, mirroring the Cxx example:

if x = foo
   x.do-stuff
else
   say x
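The same shape can be sketched in Python, where `__bool__` plays the role of `truthy?()`. This is an illustration under assumptions, not the proposed Onyx implementation: the random choice in `foo` is replaced by a `succeed` parameter so the sketch is deterministic, methods return strings instead of printing, and `use` is a hypothetical wrapper around the final conditional.

```python
class Err:
    """Error wrapper that always counts as falsy."""

    def __init__(self, error):
        self.error = error

    def __bool__(self):
        return False

    def __str__(self):
        return str(self.error)

    def get(self):
        return self.error


class Foo:
    def do_stuff(self):
        return "Yeay! I'm a Foo!"


def foo(succeed):
    # Returns either a usable Foo or a falsy Err.
    return Foo() if succeed else Err("it failed")


def use(succeed):
    if x := foo(succeed):
        return x.do_stuff()  # x is a Foo on this branch
    return str(x)            # x is an Err on this branch
```

Python does no static narrowing here; the branch is safe only because `Foo` instances are truthy by default and `Err` is always falsy.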

Sod-Almighty commented 7 years ago

you've unbalanced the program by three parentheses in total

Bah. This is what annoys me about Lisp. My editor does paren-balancing for me, but even it screws up occasionally. I've had to cut-and-paste entire code files more than once, to fix the parens.

And of course Github doesn't give me any help with Lisp.

*foo.do_stuff();

Nah, you mean foo->do_stuff(). What you have there is essentially *(foo.do_stuff()), due to precedence.

if x = foo

I don't like assignments in a conditional. It causes subtle bugs. What about some kind of special switch syntax? This way you don't even need the falsey? trickery:

try x = foo
   x.do-stuff
fail e
   say e

Or alternatively try foo => x.

(Incidentally, the "special switch syntax" for channels is the only good thing about Google Go. IIRC it lacks threads, overloading and generics, and is therefore utterly useless.)

Sod-Almighty commented 7 years ago

Thinking about it, it makes a lot of sense this way. It's exception handling syntax without the overhead of real exceptions.

ozra commented 7 years ago

occasionally. I've had to cut-and-paste entire code files more than once, to fix the parens.

Haha, yes, if I may quote myself from the README: "Optimize for human readability (and writability) - not computers parsing (not lisp syntax uniformity). The compiler should work hard - not you!"

In all signal conveyance one adds protocols with bit-negation-doubling, check-sums or other means, in order to ensure the information was received, and understood, correctly. Whether one-wire bus protocol, RS485 over a km long cable, natural languages (human, animal and plant), or — computer languages! Lack of beacons and syntax redundancy, makes it next to impossible to say where the message went wrong (or worse: even not realizing that it did go wrong at all). Being able to pinpoint mistakes deterministically is important to keep silly mistakes... silly — and not anguishly painful to correct.

Nah, you mean foo->do_stuff(). What you have there is essentially *(foo.do_stuff()), due to precedence.

Good catch. Or equally fugly (*x).do-stuff().

I don't like assignments in a conditional. It causes subtle bugs. What about some kind of special switch

Me neither. The idea that arose from the prime-in-identifiers issue (#95) is because of that dangerousness of possible operator confusion, and I've come to use it all the time in paper code reasoning now. So I think it has potential as a goto feature in the lang: if (foo)' => use foo'. That is (some-expr.here.foo 47)' => ( some-expr-here-foo' = (some-expr.here.foo 47) )

The syntax is one of a few variations idea'd and that one has stuck the most for me.

Agree on golang! Throw those millions of backing this way instead!

exception handling...

Exactly! I find it very important to be able to write an exception free program, should one want to! There are places for exceptions, but those are really... exceptional. I feel the "return way" is to prefer for most situations - with clean semantics as those above described (or similar).

As an aside, on a technical level, exceptions don't necessarily have the highest overhead. In fact they have none - as long as nothing goes wrong. That's one of the pros of them. But on the other hand, might-throw code-paths give the compiler a worse context for certain optimizations. The return-style has a constant overhead at each invocation site: returning an additional value (tag for polymorphically dispatching the right type) and a compare. In reality the final performance can sway either way.

I find it to be a better solution because of clarity. With branch-prediction, and the fact that the additional value is just pushed/popped on the stack, it likely makes for better optimization in the long run though. Touching cold memory is the most sluggish operation to consider - and that will happen when the exceptions throw - and when they're used as "substitutes" for a "check return value"-style (as Ruby, Crystal) then the hits start adding up.

Good support for both gives most choices to the programmer. That's the most important thing imo. If the hammer is your only tool, everything looks like a nail - and all that.

When it comes to the Onyx-CX implementation, the ? operator will unfortunately likely have to be used to achieve type narrowing: if (x = foo)? => x.do-stuff. In the longer term XL-solution that can be omitted — well: when it's not vaporware anymore ;-). Of course the dereference-operator as in the C++ example can be used instead. A hint is needed in one of the places for type-specification. Alternatively, also, with the auto-temporary syntax then: if (foo)'? => foo.do-stuff. It looks a bit weird when the idioms aren't part of the blood stream yet; but an even better syntax might pop up.

Most ideas so far have gone in the direction of becoming better and better, so :-)

Sod-Almighty commented 7 years ago

I still reckon my exception-handler syntax is the best option.

ozra commented 7 years ago

try is for catching exceptions. Having try also double as an alternative keyword for if just doesn't work out, and is rather confusing.

Sod-Almighty commented 7 years ago

It was an example. It wasn't meant to be taken literally.

ozra commented 7 years ago

Ok, I'm sorry, I might have misunderstood the idea. Care to post side by sides comparing so I get the essence?

Sod-Almighty commented 7 years ago

I'm not sure what you mean by "side by sides comparing", but what I mean is we should have a special keyword - similar to try - which implements the calling of a brittle-tuple-returning function and makes use of its result. A bit like Go's select statement for channels.

For example:

fn ->
  if some-condition
    true
  else
    (| false, some-error, some-context |)

say fn                  -- "true" or "false"

if fn
  say "I'm currently undecided whether `fn` returns a tuple or a boolean in this case"

trying fn => result                    -- similar to `if result = fn`, but discarding the first value
   say result           -- "nil"
fail the-output                          -- similar to `else`, but capturing all but the first value
   say "Failed: The output was: " + the-output

trying !fn => result
   say result           -- A tuple containing an error and a context
fail the-output
   say the-output       -- "nil"

result = fn             -- result is a tuple
(a, b, c) = fn          -- destructuring capture

It's essentially the same as if thing = function-call except with special keywords to [a] indicate multiple values are expected, and [b] draw attention to the variable assignment. When I see if a = b, I automatically see an expression, not an assignment. Using a special keyword in this case will avoid such ambiguity.

In fact, I propose banning assignments in expressions - or, at least, using alternative syntax, such as:

if a <- b == c
  say a         -- true

Sod-Almighty commented 7 years ago

Alternatively:

brittle-case fn
   when true => other-values
      ...
   when false => other-values
      ...