crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0
19.45k stars 1.62k forks source link

[rfc] Import Object#yield_self from Ruby 2.5 #5110

Closed makenowjust closed 7 years ago

makenowjust commented 7 years ago

Ruby 2.5 decide to introduce Object#yield_self (see here). This method yields self to the block and then returns its result. The implementation is very simple:

class Object
  def yield_self
    yield self
  end
end

Should we import this method into Crystal? What do you think?

straight-shoota commented 7 years ago

It's basically Object#try which also works for Nil with a descriptive yet still uninviting long name.

RX14 commented 7 years ago

I don't really see the usecase..?

makenowjust commented 7 years ago

@RX14 Yes, I'd like to ask Ruby committer about usecase.

lbguilherme commented 7 years ago

Not that I defend it, but could be used to create a variable, without creating a variable. To help keep calculations inline:

result = long.chained.computation(12).foo(1, 2, 3).yield_self {|list| list.sum * list.product }.to_s.size

RethinkDB (a database whose query language is mostly functional) has a similar operation, called do: https://rethinkdb.com/api/ruby/do/.

It is like a map, but for the entire thing.

ysbaddaden commented 7 years ago

It's similar to .tap in behavior and intent, expect it returns the block's value (not the tapped value).

I don't see much usage. Naming is verbose, and assigning to a value feels better:

list = long.chained.computation(12).foo(1, 2, 3)
result = (list.sum * list.product).to_s.size
makenowjust commented 7 years ago

@RX14 Why do you close this issue? (I can guess but...) If you could explain this, you should do it, or if you couldn't explain this, you shouldn't close this.

RX14 commented 7 years ago

The consensus was that it wasn't really very useful. If anyone has a counterexample to discuss we can always reopen.

asterite commented 7 years ago

yield_self is for those that want a "pipe operator" in Ruby: http://mlomnicki.com/yield-self-in-ruby-25/

Like others here, I don't think it looks nice, nor I think it's readable. And since we have try (and you will probably never want to have a nil in a chain), it can work as good as yield_self.

makenowjust commented 7 years ago

I found very useful usecase of Object#yield_self! See this simple fizz buzz example:

def fizz?(n)
  n % 3 == 0
end

def buzz?(n)
  n % 5 == 0
end

(1..100).each do |i|
  if fizz?(i) && buzz?(i)
    puts :FizzBuzz
  elsif fizz?(i)
    puts :Fizz
  elsif buzz?(i)
    puts :Buzz
  else
    puts i
  end
end

When we have Object#yield_self, we can rewrite this:

# `def fizz?` and `def buzz` is skipped

(1..100).each do |i|
  case i
  when .yield_self { |i| fizz?(i) && buzz?(i) }
    puts :FizzBuzz
  when .yield_self { |i| fizz?(i) }
    puts :Fizz
  when .yield_self { |i| buzz?(i) }
    puts :Buzz
  else
    puts i
  end
end

In other words, Object#yield_self provides a potential to call any method in when condition against case value.

However, yield_self is tooooooooo long to type. So, I suggest another name Object#let (derived from Kotlin). What do you think?

@RX14 Please re-open this. I believe this feature makes Crystal more useful and we can discuss about this more.

RX14 commented 7 years ago

But why would you want to do that? That's uglier than the if version by far. If we just want a way to use arbitrary booleans in case then we should discuss that without resorting to hacks.

makenowjust commented 7 years ago

Real world example, src/compiler/crystal/tools/doc/highlighter.cr:

      case token.type
      when :NEWLINE
        io.puts

      # ...snip...

      when :IDENT
        if last_is_def
          last_is_def = false
          highlight token, "m", io
        else
          case token.value
          when :def, :if, :else, :elsif, :end,
               :class, :module, :include, :extend,
               :while, :until, :do, :yield, :return, :unless, :next, :break, :begin,
               :lib, :fun, :type, :struct, :union, :enum, :macro, :out, :require,
               :case, :when, :then, :of, :abstract, :rescue, :ensure, :is_a?,
               :alias, :pointerof, :sizeof, :instance_sizeof, :as, :typeof, :for, :in,
               :undef, :with, :self, :super, :private, :protected, "new"
            highlight token, "k", io
          when :true, :false, :nil
            highlight token, "n", io
          else
            io << token
          end
        end
      when :"+", :"-", :"*", :"/", :"=", :"==", :"<", :"<=", :">", :">=", :"!", :"!=", :"=~", :"!~", :"&", :"|", :"^", :"~", :"**", :">>", :"<<", :"%", :"[]", :"[]?", :"[]=", :"<=>", :"==="
        highlight token, "o", io
      when :"}"
        if break_on_rcurly
          break
        else
          io << token
        end
      else
        io << token
      end

when condition for keywords and operators are too long, so I want to refactor this with such constants:

KEYWORDS = Set{
  :def, :if, :else, :elsif, :end,
  :class, :module, :include, :extend,
  :while, :until, :do, :yield, :return, :unless, :next, :break, :begin,
  :lib, :fun, :type, :struct, :union, :enum, :macro, :out, :require,
  :case, :when, :then, :of, :abstract, :rescue, :ensure, :is_a?,
  :alias, :pointerof, :sizeof, :instance_sizeof, :as, :typeof, :for, :in,
  :undef, :with, :self, :super, :private, :protected, "new"
}

OPERATORS = Set{
  :"+", :"-", :"*", :"/", :"=", :"==", :"<", :"<=", :">", :">=", :"!",
  :"!=", :"=~", :"!~", :"&", :"|", :"^", :"~", :"**", :">>", :"<<",
  :"%", :"[]", :"[]?", :"[]=", :"<=>", :"==="
}

But I can't use these constants in when condition because Set#=== is not specialized currently.

When there is Object#let, this code can be rewritten:

      case token.type
      when :NEWLINE
        io.puts

      # ...snip...

      when :IDENT
        if last_is_def
          last_is_def = false
          highlight token, "m", io
        else
          case token.value
          when .let { |v| KEYWORDS.includes? v }
            highlight token, "k", io
          when :true, :false, :nil
            highlight token, "n", io
          else
            io << token
          end
        end
      when .let { |t| OPERATORS.includes? t }
        highlight token, "o", io
      when :"}"
        if break_on_rcurly
          break
        else
          io << token
        end
      else
        io << token
      end

Of course Object#let is not needed if #5269 is merged. However it is actual thing that Object#let is the most generic way to call any method against case value.

@RX14 Please imagine. Why do you hate this issue?

asterite commented 7 years ago

I think it makes sense. It would be nice to see which is faster, but I guess Set#includes? is faster than a huge when (I think this was concluded in a recent issue).

makenowjust commented 7 years ago

Benchmark is here and result is this:

old 358.88  (  2.79ms) (± 2.53%)       fastest
new 337.75  (  2.96ms) (± 1.87%)  1.06× slower

Object#let version is slow, but I feel it is a bit, also fast. It looks no problem to me.

/cc @asterite


And the most important point I think:

it is actual thing that Object#let is the most generic way to call any method against case value.

RX14 commented 7 years ago

It feels like an ugly hack, like there should be a concrete, real, syntax change for supporting this, not hacking it into the stdlib. Perhaps this could work:


case foo
when { func(foo) }
end
bcardiff commented 7 years ago

@RX14 that would require some lookahead in the parser to disambiguate from tuples I think. Besides the .foo { } is using already existing constructs.

Maybe Object#bind, Object#apply or Object#itself(&block) are short enough ?

The let looks weird to me. expr.let { |var| S } instead of let var = expr in S everything seems twisted.

asterite commented 7 years ago

Another proposal which doesn't require a change:

if func(foo)
  # do something
end
RX14 commented 7 years ago

Well, {func(foo)} is a valid tuple so clearly my syntax idea isn't sound. However, I maintain that if the only point of adding this method is for case then we'd be better off fixing the root of the problem - case. Can't think up of a good syntax though.

makenowjust commented 7 years ago

Object#into is possible candidate. (or maybe Object#in, but it is too short.) I think #apply or #itself is too long.

Sija commented 7 years ago

IMHO Object#yield_self isn't bad, but perhaps #tap! could be used?

makenowjust commented 7 years ago

#tap! does not make sense. I think ! means mutable or danger, but this method is not mutable normally and safe. Additionally #tap is used with mutable method sometimes.

makenowjust commented 7 years ago

Replacing with if is good sometimes, on the other hand, it is not so good sometimes. I think highlighter.cr is such an example. When normal when value condition and other method call is mixed, this method is really useful.

And another benefit of this method is to call any method in method chaining. Any Crystal users dislike this because they are genius, so they can think the best variable name every time. However I can't do it every time. Naming is important, but writing executable program is more important, so I like this.

ronen commented 6 years ago

Re Object#let, FYI there's a ruby gem with exactly that name, see rubygems object-let.

Disclaimer: I'm the author. Not that there's much code in there :)

sudo-nice commented 4 years ago

I abuse #try for that too. Most time it serves well, but sometimes feels like a hack: some_method.not_nil!.try....

elebow commented 2 years ago

It seems that, other than a caveat with Nil#try, Crystal's Object#try already has the same semantics as Ruby's Object#yield_self/Object#then, but the method names and documentation suggest different intended uses.

Therefore, I propose two things:

1. I encourage the team to rethink the decision against Object#yield_self and Object#then in the standard library. The previous decision in this thread was made several years ago, when yield_self and then were still new and strange in Ruby. Now in 2022, they have become well-established parts of the language, and there have been several independent proposals to add these to Crystal referenced in this thread. See the two examples below of the expressiveness that these methods would allow.

2. If the first proposal is not accepted (or even if it is), the Object#try documentation should be amended to reassure users that it is an acceptable substitute for #yield_self/#then, and not hacky. The documentation could explicitly state that this is the Crystal equivalent of those Ruby methods, to improve discoverability of the feature. I can open a PR with suggested wording if this is indeed true.


Here are two real-world use cases for Object#yield_self and Object#then. The code has been genericized from a proprietary repo.

Suppose we have a web request handler that enqueues a background job and notifies the user of the result.

def enqueue_job
  # enqueues a background job
end

def notify_user
  # sends some JSON to the user's browser
end

def handle_request
  # ...

  enqueue_job.then do |result|
    case result
    when :accepted
      notify_user(:accepted)
    when :rejected
      notify_user(:rejected)
    else
      log(result)
      notify_user(:error)
    end
  end

  # ...
end

I find this construction in the functional-programming style nicer than storing result as a local variable. The block parameter is visually associated with enqueue_job, and the method's namespace is not crowded with an unnecessary variable.


A similar example, where yield_self is used to create an anonymous hash. This is neater than having an intermediate variable for the redacted attributes, and this also demonstrates a case where yield_self sounds better than then.

def handle_request(attrs)
  log_request attrs.redact_private_data
                   .yield_self do |redacted_attrs|
                     {
                       name: redacted_attrs["name"],
                       email: redacted_attrs["email"],
                       ip: request.remote_ip,
                       admin: current_user.admin?
                     }
                   end

  save_to_database(attrs)
end
straight-shoota commented 2 years ago

Disclaimer: The following is just my personal opinion on code style.

Looking at the examples by @elebow, I don't buy that this is a good way to express the code's intention.

The first example looks much better readable to me when you assign the value in the case expression: This makes it very clear that the value of enque_job is the target of the branching expressions and can furthermore be referred to as a local variable. There's IMO no need for a nested block. It just adds noise in my opinion.

def handle_request
  # ...

    case result = enqueue_job
    when :accepted
      notify_user(:accepted)
    when :rejected
      notify_user(:rejected)
    else
      log(result)
      notify_user(:error)
    end
  end

  # ...
end

For the second example, without looking into the exact semantics of the methods being called, just considering the structural shape, I think it does a bad job explaining what's going on. It's a chain of methods being called, with the last call receiving a block that constructs a hash. It's not very obvious that this hash is actually the return value of the entire call chain.

Consider that alternative implementation instead. It does exactly the same thing, with a lot less visual clutter and much clearer in expressing how the log request data is created.

def handle_request(attrs)
  redacted_attrs = attrs.redact_private_data
  log_request({
    name: redacted_attrs["name"],
    email: redacted_attrs["email"],
    ip: request.remote_ip,
    admin: current_user.admin?
  })

  save_to_database(attrs)
end

I think the major point is that I don't agree with this sentiment:

The block parameter is visually associated with enqueue_job, and the method's namespace is not crowded with an unnecessary variable.

IMO it's not worth going lengths just to avoid introducing additional local variables in the method namespace. Introducing an additional scope just for the sake of that doesn't pay out.

Maybe it could be an idea to think about finding a middle ground for limiting the scope of local variables in some situations. For example, a local variable declared in a conditional expression (such as case result = enqueue_job ...) could be scoped to the conditional clause. If you want the variable to be visible beyond the corresponding end, it can simply be declared outside the conditional expression: result = enqueue_job; case result .... I could see that this would make kind of sense. Not sure it's really a good idea, though. It could be surprising behaviour, definitely a breaking change and has limited benefit (wether the local var is visible beyond the conditional is not really a practical issue). So it's probably not worth the effort.

elebow commented 2 years ago

@straight-shoota, I agree your suggested code is neater than my examples. Still, I think the style has merit in some situations. See these other real-world examples:

As for a language change regarding the scope of local variables, I agree that would be detrimental. I would argue only for adding the methods to Object. Especially since the existing Object#try is so close! And, it is unlikely that the names Object#yield_self/#then would ever be wanted in the future for some other non-Ruby-compatible behavior.


(mentioning https://github.com/crystal-lang/crystal/issues/1388#issuecomment-977696004 to link a related discussion that is not yet referenced in this thread).

straight-shoota commented 2 years ago

I'm pretty sure that most of the examples you posted would actually be easier to read if they were written as independent expressions instead of as a chain with yield_self. IMO it introduces avoidable cognitive overload when the primary expression is transformed to something else in the manner of x = fetch_a.yield_self { its_actually_b }. 🧠 💣

#then would likely collide with promise pattern (e.g. https://github.com/spider-gazelle/promise/blob/0b2e527c0c9c4207456cad64bc374aa2c6a793fa/src/promise/resolved_promise.cr#L31).

Sure, #yield_self would be cheap to add and there's little chance of conflict. But at least I would like to be convinced that it is actually useful for code improvement. So far I have the expression that it's mostly used for making code worse to read. 🤷‍♂️ Of course, in the end it's still a decision of the developer whether to use it or not. But when it has such a bad impact as I'm observing, then I'd rather prefer to not give the decision (at least not from stdlib; ofc anyone can add that method in user code if they wish). Sorry for making this hard on you ;) I think this discussion is very good though, and I'm happy to approve this with convincing argument.

ysbaddaden commented 2 years ago

I had to search, but I finally found one case where it may be nice (maybe):

def key
  @key ||= begin
    cipher = OpenSSL::Cipher.new(DEFAULT_CIPHER)
    cipher.random_key.hexstring
  end
end

# self_yield allows to skip the begin/end block:
def key
  @key ||= OpenSSL::Cipher.new(DEFAULT_CIPHER).yield_self do |cipher|
    cipher.random_key.hexstring
  end
end
asterite commented 2 years ago

I still we should do it like Ruby and add a then method: https://ruby-doc.org/core-3.1.0/Kernel.html#method-i-then

Implementing that method is trivial! And it would solve all these scenarios:

straight-shoota commented 2 years ago

@ysbaddaden I don't quite understand. Why would you use either of those forms instead of just a call chain?

def key
  @key ||= OpenSSL::Cipher.new(DEFAULT_CIPHER).random_key.hexstring
end
Sija commented 2 years ago

Related #7188 #11487

ysbaddaden commented 2 years ago

@straight-shoota :facepalm: ...

Allright, let's imagine there is some other cipher.something calls with no chaining involved? I usually solve it by extracting a method, but it may be nice to not have to create yet-another-method?

def key
  @key ||= generate_key
end

private def generate_key
  # ...
end

@asterite As said above #then may collide when designing futures, but maybe that's not a problem, since a hypothetical Future#then would just override Object#then and the original #then becomes kinda moot, so :shrug:

grepsedawk commented 2 years ago

I still we should do it like Ruby and add a then method: https://ruby-doc.org/core-3.1.0/Kernel.html#method-i-then

Implementing that method is trivial! And it would solve all these scenarios:

  • People using try instead of then
  • People who want to have something like a pipe operator
  • Discussing these things over and over :-)

I agree, #then seems super idiomatic and explains the operation (and matches ruby, which is pretty normal for crystal)

In terms of examples, the reasons I love .then/.try is to build out composable conditionals like such: https://thoughtbot.com/blog/using-yieldself-for-composable-activerecord-relations