What is the interface to get an AST representation for given crystal code?

mbj commented 10 years ago

I'd love to write something like https://github.com/mbj/mutant and https://github.com/mbj/unparser for crystal. Actually I expect I could reuse lots of code.

Both tools highly depend on the ability to do AST transformations.

asterite commented 10 years ago

Hi Markus, welcome to Crystal! :-)

I think this should be relatively simple to do in Crystal because the whole compiler (lexer, parser and the rest) is written in Crystal, so these pieces will always be in sync with the latest version of the compiler (compared to having this as a separate library solution).

Here's some code that does the following: it parses a string containing some code (here it's just two "def"s) and applies two transformations:

Change all number literals to strings containing "Number" followed by that number
Append "_changed" to all "def"s names

require "compiler/crystal/parser"
require "compiler/crystal/transformer"

include Crystal

parser = Parser.new "
  def foo
    1
  end

  def bar
    2
  end
"

nodes = parser.parse

puts "Before:"
puts nodes

class MyTransformer < Transformer
  def transform(node : NumberLiteral)
    StringLiteral.new("Number#{node.to_s}")
  end

  def transform(node : Def)
    # Make sure to call super so every node contained in this Def is transformed as well
    super

    node.name = "#{node.name}_changed"

    # Must return the transformed node
    node
  end
end

transformed = nodes.transform(MyTransformer.new)

puts "After:"
puts transformed

Using "to_s" is enough to get a textual representation of the AST nodes. Running it on my machine yields this:

asterite-manas @ ~/Projects/crystal (master) $ bin/crystal foo.cr --run
Before:
def foo
  1
end
def bar
  2
end

After:
def foo_changed
  "Number1"
end
def bar_changed
  "Number2"
end

The AST nodes are all here (undocumented, but most are self-explanatory).

You can find the source code for the parser, lexer and everything else in the repository (only a handful of methods don't have source code because they are primitives, like memory allocation, class instantiation and arithmetic operations). They are not documented, so don't hesitate to ask how to use them.

Just a small note: maybe it's better to ask this question in our Google Group because members might receive a notification for this and they can help us reply and contribute with ideas. But GitHub is also fine (especially because of the syntax highlighting in code blocks ^_^).

mbj commented 10 years ago

@asterite Thx for this information! I'll close this issue as you perfectly answered my question. I'm a GitHub workflow addict, not that good at communicating via mailing lists. Will stay here and keep you posted. I'm planning a semi-automatic 1:1 transform from mutant source code to Crystal, just to learn whether it is possible. Our YARD docs carry enough type hints to do this. //cc @dkubb

Edit: Your => Our (type hints).

asterite commented 10 years ago

@mbj @dkubb Wow, please keep us updated on your progress. I'm not sure a 1:1 transform will be possible, since Crystal is very different from Ruby (it lacks most of the magic you can do in Ruby, like define_method), but I'm sure something similar can be done.

mbj commented 10 years ago

@asterite The ruby code we tend to write uses "less magic" than the average ruby code. Most of our design could be easily moved to a statically typed language.

asterite commented 10 years ago

@mbj Excellent :-)

Just something I forgot to say: the Transformer doesn't clone nodes. So in the example I gave, if you print "nodes" it will be the same as "transformed" (in fact they are the same node). If you want to transform a copy, the simplest way would be to clone the nodes before the transformation:

transformed = nodes.clone.transform(MyTransformer.new)
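The difference between transforming in place and transforming a clone can be sketched as follows. This is a minimal example, not a definitive one: it assumes a recent Crystal in which the syntax classes are loaded with `require "compiler/crystal/syntax"` (an assumption, replacing the two older require paths used earlier in this thread), and `RenameDefs` is a hypothetical transformer written only for this illustration.

```crystal
# Assumption: a recent Crystal where this single require provides
# Crystal::Parser, Crystal::Transformer and the AST node classes.
require "compiler/crystal/syntax"

# Hypothetical transformer: appends "_changed" to every def name.
class RenameDefs < Crystal::Transformer
  def transform(node : Crystal::Def)
    super # also transform the nodes contained in this Def
    node.name = "#{node.name}_changed"
    node
  end
end

original = Crystal::Parser.new("def foo\n  1\nend").parse

# Clone first, so the transformer mutates the copy, not `original`.
copy = original.clone.transform(RenameDefs.new)

puts "Original:"
puts original
puts "Transformed copy:"
puts copy
```

Without the `clone` call, `original` and `copy` would refer to the same, already renamed, node.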

mbj commented 10 years ago

@asterite I generally prefer not to mutate my inputs, thx for the pointer. And I'll probably use another strategy for transforming than the one above. Is there the concept of Object#freeze in crystal?

asterite commented 10 years ago

@mbj No, there's no such concept. Unlike in Ruby, Strings are immutable in Crystal. I think immutability should be at the class level (this class is immutable, this one is not, etc.) instead of having it at the object level. Otherwise every access to your object has this performance hit, even if you don't use it (I personally never used freeze/frozen in my life, except once because I accidentally modified a string that was used as a constant, but that can't happen in Crystal, so...).
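One way to read "immutability at the class level" is a class that exposes only getters and returns new instances instead of modifying itself. The sketch below is only an illustration of that idea; the `Point` class and its `with_x` method are hypothetical names, not part of Crystal's standard library.

```crystal
# Hypothetical example class: no setters are defined, so callers
# cannot change an instance after it has been created.
class Point
  getter x : Int32
  getter y : Int32

  def initialize(@x : Int32, @y : Int32)
  end

  # Returns a new Point instead of modifying the receiver.
  def with_x(x : Int32) : Point
    Point.new(x, @y)
  end
end

a = Point.new(1, 2)
b = a.with_x(5)

puts a.x # the original instance keeps its value
puts b.x # the new instance carries the change
```

Because the guarantee comes from the class definition, no per-object frozen flag has to be checked at runtime.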

Why will you use another strategy?

mbj commented 10 years ago

No. I'd actually love it if a class were itself known to produce immutable instances. This would replace what we did in https://github.com/dkubb/adamantium.

No other strategies are needed. Is there a concept of immutable built-in collection types? Array / Hash?

asterite commented 10 years ago

In memoizable, where is the memoized value stored?

If the class doesn't have attr_writer (class "Example" in the github page of adamantium doesn't have attr_writer), why the need to say it's immutable? Is it because you have to memoize the values? If the memoized values are stored in the instance itself, then I would say it's not immutable. If it is stored somewhere else (like a big hash table), I would say that is not very performant (but that would make the class immutable, I guess).
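The point about memoized values stored in the instance can be made concrete with a small sketch. The `Circle` class and its `computations` counter are hypothetical and exist only to make the hidden state change observable; this is not how memoizable or adamantium are implemented.

```crystal
# Hypothetical class: it looks immutable from the outside (no setters),
# but the first call to `area` writes to an instance variable.
class Circle
  getter radius : Float64
  getter computations = 0 # counts how often the area was really computed

  @area : Float64?

  def initialize(@radius : Float64)
  end

  def area : Float64
    # Memoize in the instance: computed once, then reused.
    @area ||= begin
      @computations += 1
      Math::PI * @radius ** 2
    end
  end
end

circle = Circle.new(1.0)
circle.area
circle.area
puts circle.computations
```

The public interface stays the same between calls, yet the object's internal state changes on first use, which is the distinction the question above is getting at.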

There aren't built-in immutable collections. Maybe in the future there will be a tuple type that is immutable, and also a named tuple type that would work like an immutable hash (but more efficient). We are not sure. But immutable versions of Array and Hash can be easily implemented (except that method forwarding is cumbersome to do for the moment, there's no method_missing + send).
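As a rough illustration of how an immutable Array could be implemented by hand, here is a minimal sketch. `FrozenArray` is a hypothetical name, the forwarding methods are written out explicitly, and the code assumes a recent Crystal that accepts the anonymous block parameter syntax `& : T ->`.

```crystal
# Hypothetical read-only wrapper around Array. It copies the source
# array and exposes no mutating methods.
class FrozenArray(T)
  include Enumerable(T)

  @items : Array(T)

  def initialize(items : Array(T))
    # Copy, so later changes to the caller's array are not visible here.
    @items = items.dup
  end

  # Enumerable builds map, select, etc. on top of each.
  def each(& : T ->)
    @items.each { |item| yield item }
  end

  def [](index : Int) : T
    @items[index]
  end

  def size : Int32
    @items.size
  end
end

source = [1, 2, 3]
frozen = FrozenArray.new(source)
source << 4 # does not affect the wrapper

puts frozen.size
puts frozen.map { |x| x * 2 }
```

Only the read operations that are explicitly forwarded are available, which is the cumbersome part mentioned above. The copy is shallow, so the elements themselves are only as immutable as their own classes.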

When and why do you use immutable objects in your code?

(sorry for the many questions...)

mbj commented 10 years ago

@asterite Root reason: I use immutable objects to constrain my brain when it is designing software. I discovered over time that it yields better designs than mutating all over the place.

In my latest developments you see around 90% immutable code style and 10% mutable stuff.

I only use mutable data structures as a shortcut when I don't have a nice "immutable idea".

There are less esoteric reasons for using immutable constructs:

Easier to test, because no state mutations must be covered
Fewer mutations, because having no state mutation code yields fewer valid mutations to kill
Smaller classes, because a class instance does not hold mutation logic.
Most value transforming computation happens on class initialization. If you only have query methods it makes it easy to focus on reviews.

There are more reasons, but I've been up for too long. I'm happy to talk another time.

I'll answer the memoizable implementation details tomorrow. //cc @dkubb Feel free to jump in if you like.

crystal-lang/crystal, issue #65