Add crystal-expanded & crystal-normalized emit options

bew commented 6 years ago

From time to time I want to see the generated Crystal code at several specific stages of the Crystal compiler:

all the code after all top level macro expansions
the code of one class (or one file) after all top level macro expansions
the code after semantic pass
the code that will actually be compiled into binary (normalized & with all dead code removed)
...

For a start, I think it would be nice to have

--emit crystal-expanded: dump the crystal code with all macros expanded
--emit crystal-normalized: dump the crystal code after the normalization pass (which transforms all syntactic sugar into basic control structures, like unless -> if, until -> while, etc.). At this stage the literals would also already be expanded (e.g. string interpolation to String.build ...). I think this dump would be quite hard to read, but it would still be Crystal code (raw LLVM IR is also nice, but that's another, lower level).

(or with crystal => cr to reduce size of flags)

Those options could be used with --no-codegen, and the resulting files be compiled again into binary if needed (as they are valid Crystal code).
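
As a rough illustration of what the normalization pass described above does, the sketch below pairs sugared Crystal code with a hand-written approximation of its desugared form. The method names are made up for this example, and the second method is not actual compiler output; the real literal expansion may differ in detail.

```crystal
# Sugared version: uses `unless`, `until`, and string interpolation.
def countdown(from : Int32) : String
  unless from > 0
    return "nothing to count"
  end

  n = from
  steps = [] of Int32
  until n == 0
    steps << n
    n -= 1
  end
  "steps: #{steps.join(", ")}"
end

# Hand-written approximation of the normalized form (hypothetical, not real
# compiler output): `unless` becomes `if` with swapped branches, `until`
# becomes `while` with a negated condition, and the interpolation becomes an
# explicit String.build.
def countdown_normalized(from : Int32) : String
  if from > 0
  else
    return "nothing to count"
  end

  n = from
  steps = [] of Int32
  while !(n == 0)
    steps << n
    n -= 1
  end
  String.build do |io|
    io << "steps: "
    io << steps.join(", ")
  end
end

# Both versions should print the same line.
puts countdown(3)
puts countdown_normalized(3)
```

Both methods are ordinary Crystal, which is the point of the proposal: a normalized dump would be harder to read but could still be fed back to the compiler.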

These are just ideas on how to inspect what's going on in the compiler; there are many things I didn't talk about that would be cool (e.g. after the type inference pass, or simply the list of symbols/unique string literals, ...)

Also, in the long run it could help with creating a new stdlib (see what is needed), help during ports to new platforms (see what's used?), and in a way, inspecting the compiler (via emit or tools, or compiler plugins) could make #921 possible? (I think I'm going a bit too far for this issue, but I'd love all those things!!)

faustinoaq commented 6 years ago

This feature would be very useful, use-case here: https://github.com/crystal-lang-tools/vscode-crystal-lang/issues/4

faustinoaq commented 6 years ago

I just got another idea: I think printing typed code would be very useful, for example:

Original code: foo.cr

FOO_METHODS = %w(foo bar)

{% for method in FOO_METHODS %}
def {{method.id}}(baz)
end
{% end %}

foo("foo")
bar(42)

Macros expanded: foo.expanded.cr

def foo(baz)
end

def bar(baz)
end

foo("foo")
bar(42)

Typed code: foo.typed.cr

def foo(baz : String) : Nil
end

def bar(baz : Int32) : Nil
end

foo("foo")
bar(42)

This can be very useful for editor extensions and tools :+1:

Maybe a flag like --emit crystal-typed ? :sweat_smile:

faustinoaq commented 6 years ago

Another use case is tracking macro errors. For example, I'm implementing new error pages for the Amber framework, and sometimes I can't see the code snippet for an error because the original file is a macro. If I could save the expanded macro code somewhere, I think I could show better error snippets:

[screenshot: screenshot_20180414_210719]

asterite commented 6 years ago

The compiler shows macro source on errors...

faustinoaq commented 6 years ago

@asterite Yeah, but it would be nice to save expanded macros to a file, so I can read them and do nice things like debugging the expanded macro.

Another suggestion is to save intermediate code with inferred types to a file (a.k.a. --emit-typed-crystal), this would be very useful for dev tools and editor extensions, to get faster/better completion and debugging.

For example, to get types and do some type debugging today, I need to run crystal tool context for every location I care about (-c LOC, --cursor LOC), even if the code is unmodified. With something like --emit-typed-crystal I would run it just once to get type info for the whole project (until the code is changed).
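
The "run it once until the code changes" idea can be sketched as a small cache. Everything here is hypothetical: the class name, the keying by main file, and the use of modification times for invalidation are assumptions for illustration, and the block stands in for whatever would actually produce the typed dump.

```crystal
# Hypothetical in-memory cache for whole-project typed dumps, keyed by main file.
class TypedDumpCache
  # A stored dump plus the newest source modification time seen when it was built.
  record Entry, dump : String, stamp : Time

  def initialize
    @entries = {} of String => Entry
  end

  # Returns the cached dump while none of `sources` has changed since it was
  # stored; otherwise yields to regenerate it and stores the result.
  def fetch(main_file : String, sources : Array(String), &block : -> String) : String
    newest = sources.max_of { |path| File.info(path).modification_time }
    entry = @entries[main_file]?
    return entry.dump if entry && entry.stamp >= newest

    dump = yield
    @entries[main_file] = Entry.new(dump, newest)
    dump
  end
end

cache = TypedDumpCache.new
runs = 0

# The block is a placeholder: a real tool would invoke the compiler here.
2.times do
  cache.fetch(__FILE__, [__FILE__]) do
    runs += 1
    "typed dump placeholder"
  end
end

puts "dump generated #{runs} time(s)"
```

A real editor extension would also have to decide which files count as sources for a given main file, which this sketch leaves to the caller.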

So, WDYT? :sweat_smile:

faustinoaq commented 6 years ago

> The compiler shows macro source on errors...

@asterite About the new amber error pages, I guess I need to improve them :wink:

Currently I'm doing this:

Catches error output
Parses error output to get locations /file/path:line:column and some basic error info
Reads file locations to get code snippets (this is the tricky part, because expanded macros aren't saved to a file)
Renders an ECR template with some error data and code snippets

So, I can't see expanded macro for a macro error because the location doesn't exist. :sweat_smile:
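
A minimal sketch of the parsing and snippet steps above, assuming locations appear as an absolute path followed by :line:column. The record, the regular expression, and the sample message are made up for illustration and are not taken from Amber's code.

```crystal
# Hypothetical record for a parsed `/file/path:line:column` location.
record Location, path : String, line : Int32, column : Int32

# Assumed shape of a location: an absolute path, then line and column.
LOCATION_RE = /(\/[^\s:]+):(\d+):(\d+)/

# Collects every location mentioned in the captured error output.
def extract_locations(output : String) : Array(Location)
  output.scan(LOCATION_RE).map do |match|
    Location.new(match[1], match[2].to_i, match[3].to_i)
  end
end

# Returns the lines around a location, or nil when there is no readable file,
# which is what happens for code that only exists as an expanded macro.
def snippet(location : Location, context : Int32 = 2) : Array(String)?
  return nil unless File.file?(location.path)
  lines = File.read_lines(location.path)
  first = Math.max(location.line - 1 - context, 0)
  last = Math.min(location.line - 1 + context, lines.size - 1)
  return nil if first > last
  lines[first..last]
end

# Made-up error text, only to exercise the functions above.
sample = "Error in /home/user/Projects/example/foo.cr:7:1: undefined method 'baz'"
extract_locations(sample).each do |location|
  puts "#{location.path} line #{location.line}, column #{location.column}"
  puts snippet(location) || "(no source file for this location)"
end
```

The nil branch of snippet is exactly the gap described here: if expanded macros were saved to a file, the location could point at that file instead.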

asterite commented 6 years ago

If the error is inside a macro, you can show it in a popup. But you usually need the whole trace.

Maybe someone can implement the typed output, but what's the input for that? The problem with Crystal is that to type code you need a main file, and it can be a program, a spec file, many spec files, etc. Do you specify a single main file in your editor? Will you cache the info per main file? If you are viewing some random file, which main file will you use?

I've said this a couple of times lately: if you want the features of every other statically compiled language, Crystal should be compiled modularly, that is, compile a file independently from other files. But that will never happen. So I personally don't think Crystal is a language for which a good IDE will exist. But if you manage to do it, then it'll be quite a feat :-)

faustinoaq commented 6 years ago

@asterite Thank you for your response!

> But you usually need the whole trace.

Yeah, that's why I'm refactoring amber watch to make it quite configurable, so people can use the --error-trace flag if they want :wink:

Also, I added the Show raw message option, so, developers can always read the full raw error message.

[screenshot: screenshot_20180414_132630]

> The problem with Crystal is that to type code you need a main file

Hehe, in fact, I have a mainFile config for my editor :sweat_smile:

https://github.com/crystal-lang-tools/vscode-crystal-lang/wiki/Settings#mainfile

> So I personally don't think Crystal is a language for which a good IDE will exist. But if you manage to do it, then it'll be quite a feat :-)

No problem :sweat_smile: I still think the Crystal community can do some nice things to make scry and other Crystal tools better and faster :muscle:

For now, I just need something like what @bew suggested: during compilation, before the code generation phase, let us emit:

Expanded macro code: --emit-crystal-expanded
Normalized crystal code: --emit-crystal-normalized
Typed crystal code: --emit-crystal-typed

Parse:                             00:00:00.000102665 (   0.19MB)
Semantic (top level):              00:00:00.922344024 (  84.57MB)
Semantic (new):                    00:00:00.004823701 (  84.57MB)
Semantic (type declarations):      00:00:00.122542724 (  84.57MB)
Semantic (abstract def check):     00:00:00.021798212 (  84.57MB)
Semantic (ivars initializers):     00:00:00.461345153 ( 124.63MB)
Semantic (cvars initializers):     00:00:00.020537517 ( 124.63MB)
Semantic (main):                   00:00:03.310385641 ( 301.00MB)
Semantic (cleanup):                00:00:00.016460185 ( 301.00MB)
Semantic (recursive struct check): 00:00:00.002591033 ( 301.00MB)
>>>>>>>>>>>>> Emit code just right here I guess :-) <<<<<<<<<<<<
Codegen (crystal):                 00:00:03.523409110 ( 318.50MB)
Codegen (bc+obj):                  00:00:00.708628038 ( 326.50MB)
Codegen (linking):                 00:00:02.285434906 ( 326.50MB)

We already have the --emit flag. Perhaps we can add more options to it, like

--emit [asm|llvm-bc|llvm-ir|obj|cr-macros|cr-normalized|cr-typed]
Comma separated list of types of output for the compiler to emit

WDYT?

asterite commented 6 years ago

Could you show what output you expect from each, and how are you going to process it?

faustinoaq commented 6 years ago

Ok :+1: , @asterite, Given a simple file /home/user/Projects/example/foo.cr with the code:

FOO_METHODS = %w(foo bar)

{% for method in FOO_METHODS %}
def {{method.id}}(baz)
  p baz
end
{% end %}

foo("foo")
foo(3.141516)
bar(42)

and compiled with the following command:

crystal build foo.cr --no-codegen --emit cr-expanded --emit cr-normalized --emit cr-typed

The compiler can generate these files:

└── Projects
    └── example
        ├── foo.cr
        ├── foo.expanded.cr
        ├── foo.normalized.cr
        └── foo.typed.cr

NOTE: I know that generated files won't look as beautiful as I think, because they require the prelude and a lot of code from the stdlib, although, they are still pretty useful for debugging and developing other crystal tools. Perhaps, in the future we can do some enhancements. I do this because we need something to start with.

First, foo.expanded.cr would look like:

# ...
# ...A bunch of code above added by crystal compiler

def foo(baz)
  p baz
end

def bar(baz)
  p baz
end

foo("foo")
foo(3.141516)
bar(42)

# A bunch of code below added by crystal...
# ...

and foo.normalized.cr like:

# ...
# ...A bunch of code above added by crystal compiler
def foo(baz)
  PrettyPrint.format(baz, STDOUT, 79)
  STDOUT.puts
  baz
end

def bar(baz)
  PrettyPrint.format(baz, STDOUT, 79)
  STDOUT.puts
  baz
end

foo("foo")
foo(3.141516)
bar(42)

# A bunch of code below added by crystal...
# ...

and finally, foo.typed.cr would be generated like:

# ...
# ...A bunch of code above added by crystal compiler

def foo(baz : String) : String
  PrettyPrint.format(baz, STDOUT, 79)
  STDOUT.puts
  baz
end

def foo(baz : Float64) : Float64
  PrettyPrint.format(baz, STDOUT, 79)
  STDOUT.puts
  baz
end

def bar(baz : Int32) : Int32
  PrettyPrint.format(baz, STDOUT, 79)
  STDOUT.puts
  baz
end

foo("foo")
foo(3.141516)
bar(42)

# A bunch of code below added by crystal...
# ...

For a project with many files I think we can merge it in one big file depending on what file is being compiled, for example, for the following project:

└── example
    ├── spec
    │   ├── spec_helper.cr
    │   └── example_spec.cr
    └── src
        ├── example
        │   ├── version.cr
        │   ├── buzz.cr
        │   ├── fizz.cr
        │   ├── foo.cr
        │   └── bar.cr
        └── example.cr

we can do something like:

crystal build src/example.cr --emit cr-expanded --emit cr-normalized --emit cr-typed

and get something like:

└── example
    ├── spec
    │   └── ...
    ├── src
    │   ├── example
    │   │   └── ...
    │   └── example.cr
    ├── example               # => binary file
    ├── example.expanded.cr   # => macros expanded
    ├── example.normalized.cr # => code normalized
    └── example.typed.cr      # => code typed

or, if this is a bit noisy, I think we can use a tmp directory:

└── example
    ├── spec
    │   └── ...
    ├── src
    │   ├── example
    │   │   └── ...
    │   └── example.cr
    ├── tmp
    │   ├── example.expanded.cr   # => macros expanded
    │   ├── example.normalized.cr # => code normalized
    │   └── example.typed.cr      # => code typed
    └── example                   # => binary file

Maybe we can start saving expanded macros to a file using crystal tool expand src/example.cr and generating tmp/example.expanded.cr which can include source code with expanded macros for all required files inside example project.

Another option for generating tmp/example.typed.cr is saving typed source code when using crystal tool context src/example.cr which can include source code with typed stuff for all required files inside example project.

As I said before, I know this intermediate code can be very dirty/noisy, but IMO it is still very useful for debugging and other nice things, like improving analysis and auto-completion for a project.

And finally, I know this intermediate "generated" code can change every time I change some code. No problem with that, because it would still be very useful until I edit some file :wink:

So, WDYT? :sweat_smile:

asterite commented 6 years ago

Not sure. As I said many times, the compiler doesn't work file-by-file, it slurps all the files and works on that at once. So generating instantiated code per file maybe could work, but it's not trivial. And then there are macros, whose expansions don't belong to any file. And then, not sure why you would need the expansions... I mean, they are not typed, it seems they just have the method argument and return types.

Plus it's a lot of work (I don't have time right now). I think anyone could grab the compiler's source and do it if they wanted, it's similar to other late passes the compiler has, like the type hierarchy.

faustinoaq commented 6 years ago

> Not sure. As I said many times, the compiler doesn't work file-by-file, it slurps all the files and works on that at once. So generating instantiated code per file maybe could work, but it's not trivial. And then there are macros, whose expansions don't belong to any file

Perhaps we can output one big file with all the stuff (expanded macros, normalized and typed code) :smile:

> And then, not sure why you would need the expansions... I mean, they are not typed, it seems they just have the method argument and return types.

Debugging generated source code is the main use case, see: https://github.com/crystal-lang-tools/vscode-crystal-lang/issues/4

> Plus it's a lot of work (I don't have time right now). I think anyone could grab the compiler's source and do it if they wanted, it's similar to other late passes the compiler has, like the type hierarchy.

Don't worry, no problem about that, I'm glad you read my comments and sent us nice responses :sparkles: :tada:

I guess the crystal community (or even myself) could invest some time and understand the compiler enough to implement such features :wink:

faustinoaq commented 6 years ago

Oh, I just found LiteralExpander, so, now I understand the --emit normalized code proposal @bew :+1:

HertzDevil commented 3 years ago

Both the normalizer and the literal expander operate over AST nodes and should not require semantic analysis, so they could be options to crystal tool expand, or simply be exposed by a new tool. (expand was presumably added way after this issue was created.)

My use case is to add this and crystal tool format as Crystal tools for Compiler Explorer, simply because the output will be interesting to look at. (For C/C++ they already have clang-format support, so this makes sense.)
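
To illustrate the point that these stages work on AST nodes rather than on typed programs, here is a minimal sketch that parses a snippet with the compiler's syntax library and prints the AST back as Crystal code. It assumes the compiler sources are available on CRYSTAL_PATH so that the require resolves, and it only parses; it does not run the normalizer or the literal expander, which live in other parts of the compiler.

```crystal
# Assumes the compiler's own sources can be required (CRYSTAL_PATH includes them).
require "compiler/crystal/syntax"

source = <<-CRYSTAL
  unless ready
    puts "waiting"
  end
  CRYSTAL

# Parsing needs no semantic analysis: `ready` is never resolved or typed here.
ast = Crystal::Parser.parse(source)

# The AST node can be inspected and printed back as Crystal source.
puts ast.class
puts ast
```

An AST-level tool along these lines would apply a transformation to the parsed nodes before printing them, without needing a main file or type inference.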

HertzDevil commented 3 years ago

Also the environment variables AFTER=1 and DUMP=1 will print all the Crystal code after the clean-up phase and all the LLVM IR to the standard output respectively. (Of course these stages happen much later than normalization.)

crystal-lang / crystal

Add crystal-expanded & crystal-normalized emit options #5821