crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0

[RFC] Pipe Operator #1388

Closed: felixbuenemann closed this issue 8 years ago

felixbuenemann commented 8 years ago

It would be great if Crystal had a construct similar to Elixir's pipe operator |>:

class Importer
  def import(io)
    xml = parse_xml(io)
    root = find_root_node(xml)
    tree = build_tree(root)
  end

  def import2(io)
    io |> parse_xml |> find_root_node |> build_tree
  end
end

I think that the latter example is much easier to read and requires far less eye tracking to comprehend.

The implementation in Elixir is simply syntactic sugar using a macro, not sure if Crystal's macros could do this transform as well.

This has been briefly discussed in #1099, but I don't think the arguments against it were valid. Crystal is, just like Ruby, a hybrid between an OO and a functional language, and I really like the functional core, imperative shell pattern, which leads to well-testable units by avoiding shared/hidden state.

ShalokShalom commented 3 years ago

I did that.

frnco commented 2 years ago

> To really comprehend what this means, it would be helpful to show an ideally small example of code with pipe operator, explain what it does and what the equivalent implementation with current Crystal looks like.

This was done in the OP and in numerous other comments on this and related topics.

> Without explicit code, it's impossible to reason about drawbacks and benefits. Especially for people not familiar with fsharp.

I do agree that examples are important. The biggest drawback is, quite obviously, language complexity. Still, macros are unquestionably harder to grasp than pipes, so it makes no sense to reject pipes for their complexity and instead tell people to use macros.

As for benefits, they should be obvious in a language that encourages chaining. Pipes are just like chaining, with the twist that you can pipe the return value to any function that can take it as an argument, instead of only being able to call methods that the value responds to. To give a small example, imagine two scenarios. In the first, numbers respond to .add and .subtract. In the second, there is an Arithmethic module that can't be instantiated (and thus can't be chained) and that has the same methods (add and subtract), each taking two arguments:

module Arithmethic
  def self.add(a, b)
    b + a
  end

  def self.subtract(a, b)
    b - a
  end
end

# Scenario 1 (hypothetical): numbers respond to .add and .subtract:
5.add(2).subtract(1) == 6
# Arithmethic without pipes, giving the same result (6):
number = 5
addition = Arithmethic.add(2, number)
subtraction = Arithmethic.subtract(1, addition)
# As a one-liner:
Arithmethic.subtract(1, Arithmethic.add(2, 5))
# Arithmethic, now using the proposed pipes (not valid Crystal today), again giving the same result (6):
5
|> Arithmethic.add(2)
|> Arithmethic.subtract(1)
# And this as a one-liner:
5 |> Arithmethic.add(2) |> Arithmethic.subtract(1)

In this example I chose to have the pipe populate the last parameter in the list of arguments passed to the function. That's an arbitrary choice, which I prefer but which not everyone will agree with, but it works for the purpose of showing one of the ways piping differs from chaining.

Piping is not a one-size-fits-all solution, and the same is true for chaining, and this is the reason why it makes sense to have both in a language. Piping also feels more readable to me than chaining when it's spread through multiple lines, but that's quite irrelevant.

Pipes are useful for building chains that cross boundaries between different object types, which becomes ridiculously obvious if you try to do things in a functional way, where you don't mess with return values but instead simply forward them. If you define your classes wisely you can completely do away with pipes in your code, it will be heavily object-oriented, and that's ok if that's what you want, but it's completely unreasonable to expect functional programming to work the same way, if you're doing functional, pipes are the norm. The maintainers can decide whether they want it in the language or not, but it's undeniable that there ARE benefits to having pipes in a language, and that it's something that programmers used to FP would like to have.

I just think everyone should be aware that, even though pipes are similar to method chaining, the two aren't interchangeable, and depending on who's writing the code and how they organize their code, not having pipes can be a pain. I do like FP but I'm actually quite used to OO after over a decade working with Ruby, and I have no problems with OO, so I'm just refactoring stuff whenever I realize something's getting "too functional" and the lack of pipes could turn into a problem later on, but some people may fail to realize it or simply find it disagreeable to do things this way.

Crystal is multi-paradigm, and many multi-paradigm languages include pipes. Yes, it's syntactic sugar, mainly appealing to functional programmers, but in my experience it goes really well with type-checking and macros, and it seems really weird to me that a language that has macros, like Crystal, doesn't have some stuff that is way more basic, like Algebraic Data Types and Pipes. I can actually understand Algebraic Data Types, but Pipes? Seriously?!?

Well, as I said, it's up to the maintainers, but it's ludicrous that this even needs discussing. When it's ok to have something as complex as macros, opposing the addition of pipes on the grounds of language complexity is so nonsensical that I can't describe it as anything other than (I'll apologize in advance for the words, as I believe no one involved realizes they were making excuses) a bullsh*t excuse. "Because I don't want it" is a much better reason, and it's perfectly ok for the maintainers to make decisions like that. I do believe this is the truth at some level, but if that's the case I urge you guys to check whether everyone agrees and, if so, just state that "it's not gonna happen on grounds that we don't want it". There are people who would benefit from pipes, and the added complexity is insignificant, even more so when compared to macros, the type system and many other features that Crystal already has. If the actual reasoning is "don't wanna" or "we wanna encourage using objects and chaining methods", developers should be given that information so they can decide how they wanna deal with it.

asterite commented 2 years ago

I think your argument is fine except that your examples are made up. In fact, I couldn't find any real compelling examples in this entire thread.

asterite commented 2 years ago

There's a bit more to what I said above (which might have sounded a bit rude)

In languages where the pipe operator exists, it exists because the language and standard library exist in a way that nicely integrates with it, mainly because it's the only way you can use it.

In Elixir, Haskell, Elm, etc., all functions are global: unlike in OOP languages, there's no "receiver". That means that to use map from List you do List.map list func, to use sum from Enum you do Enum.sum enum, etc. (I know you can write some imports to avoid the prefix, but it's still a function that receives the object)

In Crystal there are very few such functions. For instance, maybe we can do:

"filename" |> File.open

That's nice!

But when you want to iterate over the file's lines you need to do...

("filename" |> File.open).each_line do |line|
  puts line
end

there's no nice way to further use the pipe operator.

Well, we could introduce a global method IO.each_line:

class IO
  # Class-level wrapper so it can be called as IO.each_line(io)
  def self.each_line(io, &)
    io.each_line do |line|
      yield line
    end
  end
end

and then use it like this:

"filename"
|> File.open
|> IO.each_line do |line|
  puts line
end

But, you see, we had to introduce an extra method that didn't exist before to be able to further use the pipeline.

If you code a bit with Crystal you will find that there are very few cases where the pipe operator can be used more than once in a single expression. Maybe you can design your entire shard's API to fit the pipe operator, but it would not end up being very idiomatic... why use all global methods when you can have instance methods?

Even the Arithmetic example is pretty contrived! Why would you write:

5 |> Arithmethic.add(2) |> Arithmethic.subtract(1)

when you can write:

5.add(2).subtract(1)

or even:

5 + 2 - 1

There's also something to be said about how short the last expression is compared to the one that uses pipes.

Fryguy commented 2 years ago

I think the utility of the pipe operator doesn't come from these "global function" examples, but from when you are coding things up in a script or within a class and trying to break up a problem into smaller units.

For example, let's say I have a class that opens an XML file, parses it, transforms the results in multiple ways, then spits out some json. I might write it like this:

class Foo
  def initialize(@file)
  end

  def process
    xml = parse_file
    processed = manipulate_xml(xml)
    processed = manipulate_some_more(processed)
    processed.to_json
  end

  # ... 
end

I write it this way because each of those methods are nice and small. Collapsing that process method, some would do:

  def process
    manipulate_some_more(manipulate_xml(parse_file)).to_json
  end

but that's hard to read because it reads from the inside out and doesn't really demonstrate the flow, however the following would be easier to read.

  def process
    (parse_file |> manipulate_xml |> manipulate_some_more).to_json
  end

Admittedly, it's a contrived example (and could likely be written differently maybe with some instance variables or some other construct) but hopefully it gives an idea where this type of operator would shine.

jgaskins commented 2 years ago

@Fryguy Is that meaningfully different from using a chainable object[1]? My assumption based on your example is that the manipulate_* methods are private, so I would probably end up doing an extract class refactoring and create a private struct XMLModifier that groups all those methods together into common subsets of functionality:

class Foo
  def initialize(@file : File)
  end

  def process
    XMLModifier.new(XML.parse(@file))
      .manipulate
      .manipulate_some_more
      .to_json
  end

  private struct XMLModifier
    def initialize(@xml : XML::Node)
    end

    def manipulate : self
      # ...
    end

    def manipulate_some_more : self
      # ...
    end
  end
end

[1] Basically, all the methods return self or another instance of self.class, the way Enumerable does, so you can call enumerable.select { ... }.map { ... }.etc

Fryguy commented 2 years ago

Sure, you could extract to an OO paradigm (and that was what I expected would be the criticism of my contrived example), but functional paradigms tend to work better for some domains, particularly when breaking up big chunks of code into smaller composable units. The overhead in creating a chainable API from those small functions is, IMO, generally not worth it for that kind of task.

I do like the way you created that inline struct though 😊

j8r commented 2 years ago

This works, using class methods: serialize_json more_manipulation manipulate XML.parse file

That's because the compiler parses method calls right to left, from my understanding. This example is essentially like pipes, but reversed and with spaces as separators. This can be done only when taking single arguments, otherwise parentheses are needed.
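As a small self-contained illustration of the point above, a chain of single-argument calls written without parentheses groups the same way as the fully parenthesized nesting. The helper names twice, inc and label are made up for this sketch:

```crystal
# Hypothetical single-argument helpers, invented for this sketch.
def twice(x)
  x * 2
end

def inc(x)
  x + 1
end

def label(x)
  "result: #{x}"
end

# Without parentheses, each call takes everything to its right as its argument,
# so the innermost call (twice 5) is evaluated first.
spaced = label inc twice 5

# The same expression with explicit nesting.
nested = label(inc(twice(5)))

puts spaced
puts nested
```

As soon as one of the calls needs a second argument, the parentheses have to come back.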

Of course we usually read most sentences left to right, this can feel less intuitive (or may not for those used to right-to-left languages). Having to switch is definitely not ideal.

Fryguy commented 2 years ago

Yeah that spaces version is very similar to using parens, and gets immediately harder as complexity changes even a little (specifically with more params)... picture something like the following:

reticulate_splines(calculate(manipulate(parse_file(file), flarp: true), max: true))
# vs 
parse_file(file) |> manipulate(flarp: true) |> calculate(max: true) |> reticulate_splines

The former is so much more complicated to understand than the latter, that I almost never code it that way and instead use lots of indenting or temporary variables for readability:

reticulate_splines(
  calculate(
    manipulate(
      parse_file(file),
      flarp: true
    ),
    max: true
  )
)
# or
parsed      = parse_file(file)
manipulated = manipulate(parsed, flarp: true)
calculated  = calculate(manipulated, max: true)
reticulated = reticulate_splines(calculated)
# or even
result = parse_file(file)
result = manipulate(result, flarp: true)
result = calculate(result, max: true)
result = reticulate_splines(result)

But even those are kind of hard to read, being so verbose, when compared to the pipe version. There are downsides to these too such as when you want to inject something into the middle, and you have to remember to rewire everything. Luckily that is usually easier in Crystal than Ruby, because the compiler has your back and sometimes does not let you pass in the wrong thing.

Admittedly, this example could also be converted to OO, but it's becoming more complex because the intermediate states are not necessarily the same data type (unlike the @xml example previously).


Also, I know this is an old issue, but I appreciate entertaining the debate again, with new people joining the community over the years, bringing fresh insights from other languages and experiences. This one, in particular, I've always wanted in both Crystal and Ruby. 😊

frnco commented 2 years ago

> Even the Arithmetic example is pretty contrived! Why would you write:
>
> 5 |> Arithmethic.add(2) |> Arithmethic.subtract(1)
>
> when you can write:
>
> 5.add(2).subtract(1)

Yes it is. The example is having Arithmethic as a module, and I intentionally showed the two different possible ways to do it, chaining makes a lot more sense from an OO-perspective, and I wanted to show what's the fundamental difference between the two approaches. As I said, for people used to OO chaining makes a lot more sense than pipes, and honestly at first I had a lot of trouble with Elm and Mint precisely because of Pipes. But once you get so accustomed to using pipes that they "click" in your head, this opens up a new approach to code structure, and even though I'm used to OO and chaining methods I have a couple times realized I put myself in a tricky situation because I wrote my crystal code in a slightly more functional way, and not having pipes meant I had to either embrace the messier code to use those modules, or rewrite them to make it sufficiently OO.

But it's also important to keep in mind that pipes are not a silver bullet. They're a tool, and a tool that appeals mostly to programmers writing code that favors the functional paradigm. I wouldn't expect a language like Python to include pipes because Pythonists believe in "one true way to write code", the so-called Pythonic way, and thus it makes sense to not have the option of using pipes. But Ruby usually makes it easier to write code in the way the dev feels most comfortable, and that means having pipes for people who like them. Isn't that the main reason we have block and no-block variations for a lot of methods? Pipes are all about allowing devs to use a different approach, a more functional one. And no, they're not magic, they're just something that makes sense if you're trying to build code that doesn't mutate values, code that doesn't rely on sending messages to objects but, instead, relies on passing values as arguments to one function, then passing the return value from that function to the next and so on.

> @Fryguy Is that meaningfully different from using a chainable object[1]?

@jgaskins the biggest difference I see is whether the approach is OO or functional, as I mentioned above.

Thing is, code can be structured to make chaining methods easier, creating and/or extending classes so that the methods return an object (self) which responds to further methods. But if you're trying to write some self-contained code that doesn't extend anything and modifies values from multiple different classes, it's a lot easier to write a module with a few methods and pipe between them. For a more reasonable example, imagine you want to fetch some URL using HTTP, then parse the body as HTML using something like Lexbor, then have a module that takes the Lexbor return value and gets the data, using different functions depending on the data you need. Currently this means something like:

response = HTTP::Client.get(url)
lexbor = Lexbor::Parser.new(response.body)
scrape_data(lexbor)

(I included the parentheses because Github doesn't feel as nice to read as my editor and it felt easier to read this way, but they could be removed) Or as a one-liner:

scrape_data(Lexbor::Parser.new(HTTP::Client.get(url).body))

Which seems quite convoluted, I believe. Using pipes this could be written as:

HTTP::Client.get(url).body
|> Lexbor::Parser.new
|> scrape_data

Which is still not ideal, ideally I would like to write something like:

url
|> HTTP::Client.get
|> get_body # Maybe could be something like HTTP::Client::Response.get_body
|> Lexbor::Parser.new
|> scrape_data

And this seems a lot easier to read and to understand, plus since I don't allocate variables this should be a bit easier for the compiler to optimize and might even result in some marginal performance improvement, which may be negligible here but depending on the situation could become relevant, especially if the same is true for many parts of the code. This also makes it easier to understand the order of all that's happening, I need a url to fetch, then I get it, which gives me a response, then I get the body of said response, then I parse that body, then I scrape the data.

But Crystal is NOT functional, so I'd have to write the get_body function myself to do that, possibly in my own module, or maybe extend Lexbor to take a response instead of just a body. I guess that'd be preferable.

Now because I just need to be the Devil's Advocate, this could also be written in an OO manner, which could allow me to write something like:

Scraper.new(url).fetch_data.get_body.parse_with_lexbor.scrape_data

but this means structuring my code in a completely different way, I'd need to instantiate objects, keep track of stuff using instance variables, possibly I'd have multiple classes for different steps of the process, meaning I'd have to allocate memory to instantiate all of those objects, and/or keep changing said object(s)... That's how OO works, I can accept that, but I can't see the harm in allowing programmers to use a functional approach, except for the maintainers not wanting to.

The main point here is that pipes' biggest advantage isn't modifying some object, it's working with boundaries, bridging the gaps between different modules. My first example, the Arithmethic one, is really a bad example of how useful pipes would be in Crystal, because it focuses on the differences between the two approaches and doesn't convey the part about the current API we have and how moving values between different classes can be messy. This example shows that other side, which seems more appropriate to convey how pipes improve real-life code when using the current API and the shards we have available. And sure, the shards and API could just be improved to make this example as insignificant as the Arithmethic one, but the point is precisely whether it's worth doing that, and whether that would result in something simpler than just adding pipes to the language.

In the end, preference plays a big role in this discussion. There's nothing wrong with fully embracing OO, and there's nothing wrong with Crystal choosing that path. I just think it doesn't seem like a good idea for Crystal to follow that path, considering its goal of providing a Ruby-like experience with more powerful features. On the contrary, pipes look like exactly the kind of thing that would benefit Crystal, making it easier for programmers used to functional code to work with Crystal code and, hopefully, getting more people to realize the awesomeness of Ruby syntax and adopt Crystal.

jgaskins commented 2 years ago

> I can't see the harm in allowing programmers to use a functional approach, except for the maintainers not wanting to.

@frnco "The maintainers not wanting to" is all that's needed. The core team is a small group and they need to guard their energy. Their continued support of the language depends on them saying no when they do not feel they have the time or energy to maintain something they didn't want in the first place. Everything they say yes to becomes a maintenance burden on them and none of us want them to burn out over it.

They say no to a lot of things. They've said no to me a lot, too. Hell, they say no to each other all the time. And I get it, it's not fun to be rejected, but that allows them to say yes to other things that they feel are more impactful and focus on those things. And ultimately, that has resulted in a more cohesive language than it would have been if they accepted everyone's ideas.

asterite commented 2 years ago

Yes, it's mainly what @jgaskins says. I actually code in Elm and Haskell in my workplace and I use the pipe operator, a lot. But as a language designer in Crystal, whenever you introduce a new feature or syntax you have to think about how it interacts with everything else. What's the precedence of this new operator? How does it combine with blocks? How does the formatter need to change? How do we document it? Where do we use it in code examples, and what are the recommended guidelines for using it?

This particular feature is just syntax sugar, allowing you to do things you can already do, just with a different syntax that, in my opinion, doesn't justify all the actual work needed to support it.

I think we should simply add the then method introduced by Ruby some time ago:

https://til.hashrocket.com/posts/f4agttd8si-chaining-then-in-ruby-26

It solves the same problem but the implementation is extremely simple, even more so in Crystal.

straight-shoota commented 2 years ago

Expanding on the previous comments: There is some cost associated with adding any new feature. Not just the plain initial implementation (which might be contributed by a proponent), but there's a lot to design about integration into the language, as well as long-term maintenance. Also not to forget, every feature of a language is something that its users need to learn or at least know about. Even if you wouldn't use a pipe operator yourself, you need to be prepared to see it in someone else's code. So that's more work for learners and teachers of the language.

I personally have little experience with functional languages. But I can definitely see the appeal of a pipe operator. I'm sure it would be useful.

Still, I'm not convinced it has such a big impact in Crystal (compared to other languages) that it's worth it. That's obviously just my subjective assessment (and it's not set in stone). But it means that I won't commit to this idea.

asterite commented 2 years ago

Here's how you would use then:

class Object
  def then
    yield self
  end
end

__FILE__
  .then { |filename| File.read(filename) }
  .then { |string| string.lines }
  .then { |lines| lines.first }
  .then { |line| puts line }

I know it doesn't read as nice as the pipe operator, but it's actually more general than the pipe operator, and there are no hidden arguments.

Or... let's try to translate the above to use the pipe operator:

__FILE__
  |> File.read # So far so good...
  |> # oops...

Or maybe we could combine the pipe and the dot:

__FILE__
  |> File.read
  .lines
  .first
  |> puts

I think that's starting to get quite confusing. Or maybe like this...

__FILE__
  |> File.read
  |> &.lines
  |> &.first
  |> puts

with |> &. meaning "invoke the given call on the receiver, don't pass it as a first argument"

but I don't know...

Another nice advantage of then is that you get to name the actual value/type of each intermediate step, something that you don't get with the pipe operator.

j8r commented 2 years ago

There are already try(&) and tap(&), so then(&) looks like a good idea!
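For reference, a minimal sketch of how the existing Object#try and Object#tap already chain in current Crystal. The sample string is made up. One difference from the proposed then is that try does not yield when the receiver is nil, so it only behaves like a plain pipeline step for non-nil values.

```crystal
# try yields self and returns the block's value.
# tap yields self and returns self unchanged, which is handy for logging.
first_line = "alpha\nbeta"
  .try { |text| text.lines }
  .tap { |lines| puts "got #{lines.size} lines" }
  .try { |lines| lines.first }

puts first_line
```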

Fryguy commented 2 years ago

@asterite I like the .then syntax. It's a little verbose but I agree with you that it prevents the hidden parameter, and can give a meaningful name if needed. Additionally it allows for placing the value into any position, and not just the first one.
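A minimal sketch of that last point, reusing the Object#then definition from asterite's comment above. The helpers scale and cap are hypothetical and exist only for this example. The block decides where the piped value goes, here as the second argument of scale and the first argument of cap:

```crystal
# Same definition as in the comment above; not part of Crystal's stdlib.
class Object
  def then
    yield self
  end
end

# Hypothetical helper: the piped value is expected as the SECOND argument.
def scale(factor, value)
  factor * value
end

# Hypothetical helper: the piped value is expected as the FIRST argument.
def cap(value, max)
  value > max ? max : value
end

result = 7
  .then { |n| scale(3, n) }
  .then { |n| cap(n, 20) }
  .then { |n| "result: #{n}" }

puts result
```

With a pipe operator that always fills one fixed position, one of these two helpers would need a wrapper or a reordered signature.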

If we ever adopt the Ruby numbered parameters syntax (https://github.com/crystal-lang/crystal/pull/9216), this could trim it up as well, e.g.:

__FILE__
  .then { File.read(_1) }
  .then { _1.lines }
  .then { _1.first }
  .then { puts _1 }

asterite commented 2 years ago

Yeah, instead of numbered parameters I proposed we use Elixir's way:

__FILE__
  .then &File.read(&1)
  .then &.lines
  .then &.first
  .then &puts(&1)

straight-shoota commented 2 years ago

Just thinking aloud: What if then was called, say |>? =)

__FILE__
  |> &File.read(&1)
  |> &.lines
  |> &.first
  |> &puts(&1)

Maybe it would be too confusing from a Haskell pov, but it would come pretty close to the original intention, without introducing a new disruptive concept. Forwarded parameters for short block syntax is mostly an evolution of an already existing feature.

asterite commented 2 years ago

Yes, I was actually going to suggest that, but I wasn't sure if it was too extreme :-D

Fryguy commented 2 years ago

I was thinking the exact same thing @straight-shoota . |> with numbered parameters would be awesome.

docelic commented 2 years ago

Wouldn't then be a more Ruby/Crystal-friendly way of naming things, compared to |>?

frnco commented 2 years ago

> @frnco "The maintainers not wanting to" is all that's needed. The core team is a small group and they need to guard their energy. Their continued support of the language depends on them saying no when they do not feel they have the time or energy to maintain something they didn't want in the first place. Everything they say yes to becomes a maintenance burden on them and none of us want them to burn out over it.

Absolutely agree. Which is why I keep repeating that this is the one reason that is absolutely unquestionable.

> This particular feature is just syntax sugar, allowing you to do things you can already do, just with a different syntax that, in my opinion, doesn't justify all the actual work needed to support it.

I strongly disagree with this argument. Pretty much everything else said by @asterite is absolutely spot-on, but reducing pipes to the level of "syntax sugar" is naive, to say the least. Pipes are a feature, a pretty niche one, sure, but still more than just sugar. That doesn't mean they're sufficiently relevant to justify the added maintenance burden, though.

This discussion did get me curious about how this could be implemented using macros, though. My experience with Macros is quite small and mostly on other languages, but if Crystal Macros are actually powerful enough to allow implementing the pipe operator, then that'd be a pretty good way to deal with this, possibly even using the proposed then method, meaning the macro would only need to implement the pipe operator, using then to make it work. Seems like a really elegant solution that would ultimately make everyone happy. I did take a look at the documentation for Crystal macros but didn't figure out how I'd go about implementing an operator, or even whether it'd be possible, any pointers on how one might go about implementing a new operator in Crystal would be greatly appreciated.

asterite commented 2 years ago

There's no way to define custom operators in Crystal. And then, operators can't be macros because operators are instance methods, and macros are global or class methods.
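To make the constraint concrete: a new operator token such as |> cannot be introduced, but the existing operators are ordinary instance methods and can be defined on your own types. The following is only a sketch of that mechanism under stated assumptions. Piped and its >> overload are hypothetical names, and the wrapper pipes into procs rather than into arbitrary method calls:

```crystal
# Hypothetical wrapper type, invented for this sketch; not part of Crystal's stdlib.
struct Piped(T)
  getter value : T

  def initialize(@value : T)
  end

  # >> is an existing operator, so it can be defined as an instance method.
  # It calls the given proc with the wrapped value and wraps the result again.
  def >>(fn)
    Piped.new(fn.call(@value))
  end
end

add_two = ->(n : Int32) { n + 2 }
subtract_one = ->(n : Int32) { n - 1 }

result = (Piped.new(5) >> add_two >> subtract_one).value
puts result
```

The need to wrap the initial value and unwrap the final one is part of why this does not read like a real pipe operator.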

When you mentioned macros I initially thought it was a good idea, but then I realized it's not possible.

frnco commented 2 years ago

> There's no way to define custom operators in Crystal. And then, operators can't be macros because operators are instance methods, and macros are global or class methods.
>
> When you mentioned macros I initially thought it was a good idea, but then I realized it's not possible.

Using LISP as a reference is definitely something I should be more careful about; most languages are nowhere near as flexible as LISP. Still, being able to implement operators would be a nice thing. Then again, having a more rigid syntax is pretty much the standard outside the LISP world, so it's not surprising that Crystal is like that, as sad as that may be.

Still, that's actually another argument for making the Pipe operator available in the standard Crystal syntax. Or maybe it'd be better to just make it so macros are able to introduce and/or modify operators. That's not something many people would think about, yeah, but again, considering Crystal aims at being an improved Ruby, and seeing as Ruby started off with the intention of bringing the power of LISP to the masses while also having Smalltalk's ability to model real-world data as objects, (Objects make a lot more sense than functions when modelling real-world data, after all), it just makes sense to make Crystal more flexible.

crysbot commented 3 months ago

This issue has been mentioned on Crystal Forum. There might be relevant details there:

https://forum.crystal-lang.org/t/is-there-anything-like-pipe-operator-in-crystal/6773/2