crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0
19.45k stars 1.62k forks source link

Reusing AST nodes in macro `#each` #15008

Open HertzDevil opened 1 month ago

HertzDevil commented 1 month ago

Some yielding methods in the macro language like RangeLiteral#each allocate a new node on each iteration:

https://github.com/crystal-lang/crystal/blob/f17b565cd5cd9bd96dbba85d1e94c7d2cacbf21b/src/compiler/crystal/macros/methods.cr#L1115-L1125

This means {% (0...10000).each { } %} allocates 10000 NumberLiteral nodes. However, since all these variables have the same node type, and since Crystal::NumberLiteral#value= exists, it is actually possible to use a single node for all iterations:

block_arg = block.args.first?
block_var = nil

interpret_to_range(interpreter).each do |num|
  if block_arg
    if block_var
      block_var.value = num.to_s
    else
      block_var = NumberLiteral.new(num)
    end
    interpreter.define_var(block_arg.name, block_var)
  end
  interpreter.accept block.body
end

Thus if we know that the iteration variable won't be used outside the block, there should be a way to opt in to this behavior:

# prints 1, 2, 3, 4
{% (1..4).each(reuse: true) { |i| p i } %}

# probably unintended behavior
{%
  arr = [] of _
  (1..4).each(reuse: true) { |i| arr << i }
  arr # => [4, 4, 4, 4]
  (1..4).each(reuse: false) { |i| arr << i }
  arr # => [4, 4, 4, 4, 1, 2, 3, 4]
%}

This only makes sense for RangeLiteral#each and technically the keys for NamedTupleLiteral#each, because other similar methods like ArrayLiteral#each are heterogeneous or do not allocate new nodes per iteration. The main benefit of this approach is being able to define StringLiteral#each_char in a similar manner, since such a method is presumbaly avoided for performance reasons; and BytesLiteral#each_byte, if a new syntax is invented for #2886.

straight-shoota commented 1 month ago

Are there any relevant real use cases for RangeLiteral#each in singnificant sizes where performance/memory usage has a noticable effect? I somewhat doubt that.

I can see that something like this could be useful for iterating StringLiteral (and BytesLiteral). But maybe we should focus on those use cases specifically and not so much on RangeLiteral.