Araq commented 3 months ago

Summary: "ROD files without their bloat."

This is an evolution of "NIR", the "Nim intermediate representation". Currently NIR does not solve enough problems, especially when viewed through the lens of incremental compilation and language evolution. With NIR we cannot even share templates, iterators or generics between compilation units.

While NIR focuses on the backend, NIF focuses on the frontend. NIF does away with the trinity of (PNode, PSym, PType) and only uses a single tree structure. In the deserialization step one needs to recompute the PSym/PType structures as they don't exist in NIF. This might be tricky.

Types can naturally be connected to their "methods", much like in C++. The =hooks are first-class citizens, and things like ref T and array[65, T] must become nominal types in NIF so that we can attach =hooks to them easily.

File format

A NIF file is a binary file that consists of a fixed list of sections. A section can be a header, a description of its dependencies, an interface, a body of code, etc. There is also a text representation of a NIF file but that is only used for introspection and debugging purposes.

Header

The header starts with the cookie [byte(0), byte('N'), byte('I'), byte('F'), byte(sizeof(int)*8), byte(system.cpuEndian), byte(0), byte(NifVersion)].
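A minimal sketch of how the cookie could be produced and checked. The value of `NifVersion` and the names `nifCookie` and `checkCookie` are assumptions made for illustration; the RFC only specifies the eight bytes themselves.

```nim
const NifVersion = 1  # hypothetical value; the RFC does not state it

proc nifCookie(): array[8, byte] =
  ## The eight header bytes exactly as listed above.
  [byte(0), byte('N'), byte('I'), byte('F'),
   byte(sizeof(int) * 8), byte(system.cpuEndian), byte(0), byte(NifVersion)]

proc checkCookie(data: openArray[byte]): bool =
  ## True if `data` starts with the cookie for this platform and version.
  let expected = nifCookie()
  if data.len < expected.len:
    return false
  for i in 0 ..< expected.len:
    if data[i] != expected[i]:
      return false
  result = true
```

Because the word size and endianness are part of the cookie, a file written on a platform with a different word size or byte order fails this check instead of being misread.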

After the header a configuration footprint is stored. This footprint is a string that condenses the command line options and configuration settings in use. The footprint's format is not specified as it is supposed to change with every compiler release.

Dependencies

A list of (filename, checksum) pairs so that the IC mechanism knows whether one of its dependencies was changed and the corresponding .nim file needs to be recompiled.
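As a rough sketch, the staleness check could look like the following. The `Dependency` tuple, the `needsRecompile` name and the use of plain strings for checksums are assumptions; the RFC does not say which checksum algorithm is used or how current checksums are obtained, so they are passed in as a table here.

```nim
import std/tables

type
  Dependency = tuple[filename: string, checksum: string]

proc needsRecompile(deps: openArray[Dependency];
                    current: Table[string, string]): bool =
  ## `current` maps filename -> freshly computed checksum (hypothetical input).
  ## A file missing from `current` yields "" and therefore counts as changed.
  for dep in deps:
    if current.getOrDefault(dep.filename) != dep.checksum:
      return true
  result = false
```

A single mismatch is enough to invalidate the NIF file, so the loop returns on the first changed dependency.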

Interface

A list of (identifier, <index into the body>) pairs so that a symbol can be looked up by its identifier/name.
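One way to picture the interface section is as a table built from those pairs. This is only a sketch: `buildInterface` and `lookup` are hypothetical names, the index is modelled as a plain `int`, and overloaded identifiers (several symbols sharing one name) are not handled.

```nim
import std/tables

proc buildInterface(entries: openArray[(string, int)]): Table[string, int] =
  ## Turns the stored (identifier, body index) pairs into a lookup table.
  result = initTable[string, int]()
  for e in entries:
    result[e[0]] = e[1]

proc lookup(iface: Table[string, int]; name: string): int =
  ## Returns the index into the body, or -1 if `name` is not in the interface.
  iface.getOrDefault(name, -1)
```

With such a table an importing module can resolve a name without deserializing the whole body.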

Body

The body is a list of "packed AST" structures. The first entry in the list is the AST that corresponds to the full Nim module. The second entry is the AST of generated generic instantiations. Other entries are currently unspecified.

Reexports

A section dedicated to Nim's export delegation feature.

Converters

A section dedicated to Nim's converter feature. Converters are very special because they can be used implicitly without naming them.

Term rewriting macros

A section dedicated to Nim's term rewriting macros/templates. These are special because they can be used implicitly without naming them.

Replay instructions

These are instructions that are used to replay the side effects that the compilation of a module causes. For example: compile and link pragmas.
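A minimal sketch of what replaying could mean, assuming only the two pragma kinds mentioned above. All type and field names (`ReplayKind`, `ReplayInstr`, `BuildState`) are hypothetical and do not reflect the compiler's actual data structures.

```nim
type
  ReplayKind = enum
    rkCompile,  # side effect of a `compile` pragma
    rkLink      # side effect of a `link` pragma
  ReplayInstr = object
    kind: ReplayKind
    arg: string           # the file the pragma referred to
  BuildState = object
    toCompile, toLink: seq[string]

proc replay(state: var BuildState; instrs: openArray[ReplayInstr]) =
  ## Re-applies the recorded side effects to the global build state,
  ## as if the module had just been compiled from source.
  for i in instrs:
    case i.kind
    of rkCompile: state.toCompile.add i.arg
    of rkLink: state.toLink.add i.arg
```

When a module is loaded from its NIF file instead of being recompiled, running `replay` over the stored instructions leaves the build state as it would be after a fresh compilation.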

Hidden usages

Nimsuggest must be able to operate on a set of NIF files directly. "Find all usages" of a symbol has to work for templates too which are expanded inline. The compiler records these usages and stores them in the NIF file "hidden usages" section.

Type system

Types in NIF are just ASTs like any other. However, builtin atomic types like system.int etc. are mapped directly to a NodeKind. This makes it easier to work with these types.
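To illustrate the idea, here is a sketch with a deliberately tiny, made-up set of node kinds. The kind names, the `Node` layout and the helper procs are assumptions; the real NIF tree is a packed AST and will look different.

```nim
type
  NodeKind = enum
    nkIdent,      # a plain name
    nkIntType,    # system.int, mapped directly to a kind
    nkFloatType,  # system.float, mapped directly to a kind
    nkRefType,    # ref T: one child (T)
    nkArrayType   # array[N, T]: two children (N, T)
  Node = ref object
    kind: NodeKind
    ident: string     # only meaningful for nkIdent
    kids: seq[Node]

proc isAtomic(n: Node): bool =
  ## Atomic builtin types need no children and no symbol lookup.
  n.kind in {nkIntType, nkFloatType}

proc render(n: Node): string =
  ## Prints a type tree back as Nim-like source text.
  case n.kind
  of nkIdent: n.ident
  of nkIntType: "int"
  of nkFloatType: "float"
  of nkRefType: "ref " & render(n.kids[0])
  of nkArrayType: "array[" & render(n.kids[0]) & ", " & render(n.kids[1]) & "]"
```

Since an atomic type is identified by its kind alone, checking for `int` is a single enum comparison, while composite types such as `ref int` remain ordinary trees.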

To be continued...

hugosenari commented 3 months ago

Is this related to (or a solution for) #518?

juancarlospaco commented 3 months ago

So the file format is kind of like a CSV with ASTs (with a standard header)? 🤔

nim-lang / RFCs

NIF - Nim frontend #551