zah opened this issue 6 years ago
I can see one big problem with this approach: compiling bigger programs takes a long time (e.g. a hot recompilation of the Nim compiler, without any changes, takes 4.8 s on my laptop, but another macro-rich project of mine takes 15 seconds). Waiting that long makes live reload a lot less useful, so I guess a working compiler cache (issue nim-lang/RFCs#46) is a prerequisite to this.
Otherwise, I really like this proposal.
It was asked on IRC how this differs from the current REPL, started with `nim secret`.
The major difference is that the hot code-reloading discussed here will be able to reload code using arbitrary C dependencies, while preserving near native execution speed. This will make it well suitable for high-performance software such as games.
Very exciting!
The reference C++ Jupyter kernel is Xeus, which is maintained by QuantStack, who maintain a high-performance scientific C++ stack for finance. They are known for xtensor, a NumPy-like library for C++.
Xeus has been covered on the Jupyter blog, and it doesn't seem like they have compilation speed issues.
However they rely on cling, an LLVM/Clang-based C++ interpreter.
It might be that the best first step is getting NLVM in shape and reusing the Xeus-Cling infrastructure.
@mratsim, I'm confident that the approach discussed here is superior to Cling. A more compelling solution for C++ is the commercial tool Live++, described here: https://molecular-matters.com/products_livepp.html
@zielmicha, I agree that the hot code-reloading will be even more useful once we have quicker compilations, but these two efforts are orthogonal - you can develop them in any order. It's complicated to arrange for someone to work full-time on Nim for a long period of time and Viktor's opportunity window is limited to the suggested months.
I also share the same concern as @zielmicha and I have an alternative proposal: extend the Nim VM to support the FFI. This would not only make `nim secret` into a super fast REPL, but it would also mean we could get rid of all the VM-specific functions (the `nimscript` module).
Did you consider this? If so, what do you think are the advantages of implementing hot code reloading instead?
@dom96, of course. The benefit of having proper hot code reloading is that you can run at near native speeds. Otherwise, the performance penalty caused by interpretation will make the feature impractical for high-performance software such as simulations and games.
Also, @zielmicha is highlighting the problem that Nim needs a lot of time to figure out what changed in a large program, and this would be true regardless of which code execution mechanism you choose. We know from experience that the actual DLL reloading is quite fast, and if your program is small (which will be the case in a typical REPL or Jupyter session), the compile-reload cycle will be pretty fast. We'll try to do some experiments with reducing the size of the reloaded components, and the long-term goal here is to reach the level of interactivity shown in Bret Victor's video, where you can interactively use a slider to change the value of a constant appearing in the code and immediately see the results on your screen.
Nice! https://github.com/Lokathor/hotload-win32-rs does something similar.. technically, so do most plugin platforms that support loading/unloading so it's certainly a feasible approach.
@mratsim I'd tend to agree that this approach is probably easier to pull off than hot-reloading sections of code using nlvm / LLVM. LLVM will likely be able to do more granular patches more easily - i.e. recompile only changed functions and patch those in, leading overall to better perf (fewer steps from Nim code change to ready-to-run machine code) - but the product would be more complicated as well, and have a heavy dep on LLVM, limiting its usefulness compared to being compatible with any C compiler. In case you're interested in how a REPL is done in LLVM, their basic intro-to-the-compiler tutorial describes it pretty well: https://llvm.org/docs/tutorial/LangImpl04.html - that said, nlvm itself would need finishing off - the C backend is simply in much better shape.
I guess the combination of @dom96's FFI approach and this proposal could be called a JIT - it could simply invoke the more lengthy compilation when it's worth it..
This may be kinda relevant. Pretty similar to what you would do in C, but of course, your proposal is describing a more sophisticated system that can preserve global state without the programmer manually having to "hang on" to a state object between hotloads.
Sophisticated JITs might be a novelty, but I like @dom96 's idea about working on Nimscript. Right now embedding Nimscript in a Nim application and getting Nimscript to talk to compiled Nim code is kinda...cumbersome and needs work.
I agree with @mratsim on the whole thing that if we want to talk about JITs, we should renew interest in NLVM and integrate it with MCJIT. Of some particular interest may be how the Scopes language tackles live code generation, where a program can be partially statically compiled and partially compiled at runtime (via MCJIT).
Maybe that's irrelevant to this particular discussion, but I always wondered - why Nimscript? We can compile C code to asm with the 100 KB TinyCC - no need to have a separate VM just for scripting/CTFE.
This is exciting to see, and I'm glad that a respectable budget has been applied.
When looking at LISP, JIT compilation can be extremely effective after caching, although with Nim's architecture, I'm unsure if this is as effective.
We should take into account @Araq's comments here.
Consideration should be given to ensuring the full Nim language is accessible from compiled libraries, and not a static version of files lacking important feature sets.
Given the short timeline for a single developer, we should be especially careful not to leave a partial result:
The dll/so solution may be effective now, but we should ensure that it's both extensible and forward-facing. It would be a shame to split the community's contributions between those pursuing a (possibly) inextensible JIT and those pursuing more structured alternatives.
Perhaps first splitting and grouping macro definitions into separate sets would leave the pure functions more available to compilation, while leaving the macros available for JIT/REPL-style evaluation.
Perhaps evaluating some of the issues raised in this comment could create a hybrid approach, able to offer iterative results while allowing the proposal sustainable growth for the future (especially after Feb 2019).
> Maybe that's irrelevant to this particular discussion, but I always wondered - why Nimscript? We can compile C code to asm with the 100 KB TinyCC - no need to have a separate VM just for scripting/CTFE.
The VM is invoked all the time to compile tiny snippets of stuff - the ping-pong needed to compile these tiny snippets with an external compiler would be very slow indeed.
It has crossed my mind, however, as the next thing for nlvm, to rip out the VM and replace it with the LLVM one.. could be interesting to throw the LLVM JIT at it at the same time, in case there are any VM call sequences that actually would benefit from a round of "nativisation".. if this was to be done, the nlvm repl would be child's play after that, but it's a fairly substantial piece that's not well supported by the upstream compiler - it doesn't have the concept of swappable VMs (at least I didn't see it last I looked), only backends.
TinyCC is a library. Although I don't know, maybe it requires too much time to set up the environment.
LLVM has a size of 20-40 MB, I don't know exactly. I want to script my program with Nim, but cannot carry that weight with it. So, TinyCC looks like the best solution for me, although of course it's nowhere near as fast as a proper JIT compiler.
> The VM is invoked all the time to compile tiny snippets of stuff - the ping-pong needed to compile these tiny snippets with an external compiler would be very slow indeed.
It's not really about speed. Every VM you plug into Nim needs to be able to build Nim's ASTs. And not just ASTs, also the type graphs are exposed in the macro system. Things can be done differently with the incremental compilation cache as that also gives us a serialized AST/type representation that could be passed to external processes / native code / JIT compiled code.
Just to clarify a possible confusion here, this proposal is about giving you the ability to reload code and interact with running programs, it's not about changing how Nim executes macros at compile-time.
Very exciting! For recent similar projects in other languages (the sudden proliferation of this DLL/.so-based approach seems to me not accidental; I wonder who was the first to come up with the wonderful idea? :) ), which may hopefully be helpful to this project, see:
Any update on this? Has it been approved? Is there any work started on it? It has my vote. I think it's a cool feature, and has potential beyond just repls and jupyter notebooks.
Yes, it has been approved and somebody is working on it, we will announce this properly soon. :-)
As mentioned, the development is already underway. If you are curious, you can keep track of the progress here: https://github.com/onqtam/Nim/commits/hot-code-reloading
@krux02, the compiler will replace all global variables with a scheme that allocates the required memory for them dynamically. All such allocations will happen in the context of a single dynamic library implementing the hot code reloading run-time. In the final code, the use of globals looks like this:
```c
int* someGlobal_hash = (int*) runtime_registerGlobal("someGlobal_hash", sizeof(int));
// the call above will allocate memory the first time this particular name is encountered.
// follow-up reloads will return the same memory.

void someProcUsingGlobal() {
  *someGlobal_hash = ...
}
```
The scheme for procs is a bit more complicated and involves maintaining a dynamically populated table of trampoline functions that can be used to call the latest definition of a particular function. All the code in the program is compiled to use such trampolines everywhere.
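To make the trampoline idea more concrete, here is a rough Nim sketch of a dynamically populated dispatch table. This is purely illustrative - the names `registerProc` and `callLatest` are made up and this is not how the actual `nimhcr` runtime is implemented.

```nim
import std/tables

type GreetProc = proc (): string   # closure calling convention by default

var jumpTable = initTable[string, GreetProc]()

proc registerProc(name: string, p: GreetProc) =
  # each freshly loaded version of a library re-registers its procs here
  jumpTable[name] = p

proc callLatest(name: string): string =
  # the "trampoline": always dispatches to the newest registered definition
  jumpTable[name]()

# first "load"
registerProc("greet", proc (): string = "hello")
echo callLatest("greet")    # hello

# simulated reload with a new definition of the same proc
registerProc("greet", proc (): string = "hello, reloaded")
echo callLatest("greet")    # hello, reloaded
```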
A low-level proof of concept implementing this scheme for x86/x64 was already pushed to the linked repo above. More platforms are coming soon.
The execution of the top level code is described in Point 1 of the technical details provided at the top of this issue.
The type of everything is encoded in the `_hash` suffix appended to every function and global variable, so changing the type indeed creates a new global. Support for arbitrary type modifications is outside of the scope of the first release though. Let me quote the spec again:

> The initial implementation will signal any modification of a data type between two reloads as an error, but future versions may be able to support modifications to traced garbage collected objects by allocating new copies of these objects and assigning the newly added fields to a default value. Please note that the user may also use the `beforeCodeReload` and `afterCodeReload` event handlers to serialize the state of an arbitrary program to memory and then re-create it immediately after the reload.
I'm not sure I understand the last question correctly, but working with function pointers would still be fully supported. Everywhere you pass or receive a function pointer, you'll be working with the address of a trampoline. The address of the trampoline will remain the same even after the function behind it is reloaded.
The only category of functions that won't support arbitrary changes will be the closure iterators, and by extension all async code. In particular, you won't be able to introduce a new internal state of the closure iterator (by adding a yield statement, for example).
@krux02, you can study the low-level proof-of-concept run-time implemented here: https://github.com/onqtam/Nim/blob/hot-code-reloading/lib/nimhcr.nim
There are comments at the top explaining the mechanism. Example usage from a "slave" process is shown here: https://github.com/onqtam/Nim/blob/hot-code-reloading/tests/dll/nimhcr_usage.nim
Viktor's talk at CppCon 2018 is online: https://www.youtube.com/watch?v=UEuA0yuw_O0
He mentions he “may” do something similar for Nim. So, he hasn't accepted the grant proposal yet, has he?
@moigagoo The talk was held 50 days ago and me working on this has been accepted since then. I also felt weird talking too much about Nim at a C++ conference. I'm already working on it in my fork but it's still veeery early to even try it out - I'm still getting into the compiler and trying to figure out how things work, so it's mostly experimentation.
The main focus for now is the first milestone - the support for the `--hotCodeReloading` option. In terms of what the C code should look like - it is all figured out and should be possible - now "all" that is left is for the codegen of the compiler to emit the proper code.
I've been reluctant to write here until I had something to show... which is still not the case :D I've also been busy with other things - I presented at the code::dive conference a few days ago and tomorrow will be leaving for Meeting C++ as well, after which I should be working only on this.
I will let everyone know as soon as there is something that could be played with!
For the curious ones: you can check out the progress in the comments at the top of `nimhcr.nim` in my fork:
https://github.com/onqtam/Nim/blob/hcr2/lib/nimhcr.nim#L24
Currently simple examples compile and get hot-reloaded successfully, but will probably have issues with anything non-trivial. Perhaps the interesting thing is how it all works (see link above).
I'll be making a pull request soon with what is implemented so far. Took a bit more time than expected...
A little more info - an example such as this can be played with if you build the compiler from my branch:
```nim
# main.nim
import other

while true:
  echo readLine(stdin)
  performCodeReload()
  echo getStr()
```

```nim
# other.nim
proc getStr*(): string = return "hello!"
```
and when compiled with `--hotCodeReloading:on` you can:

- compile `lib/nimhcr.nim` and `lib/nimrtl.nim` as shared libraries (and make them discoverable, e.g. through `LD_LIBRARY_PATH`)
- edit and recompile the `other` module (you cannot edit the `main` one, since parts of it are in the active callstack when calling `performCodeReload()`)

You should be able to do/use almost anything in the `other` module - templates, macros, iterators, globals, import other files - as long as the changes don't trickle down to the `main` module in terms of codegen.
Unfortunately, currently most realistic software won't compile because `-d:useNimRtl` isn't tested a lot.
@onqtam Great to hear about the code reload progress! I have a few questions:
@andreaferretti regarding your questions:

- there are `.cfg` files next to hcr/rtl so there is nothing special to do on your part
- you cannot reload the `main` module
- globals can be changed with the `beforeCodeReload`/`afterCodeReload` handlers - like this:

```nim
var g = 42 # cannot change the value from here once loaded - use a handler for that
echo "hello - only the first time!"

afterCodeReload:
  echo "I get printed after each reload!"
  g = 666 # can change the global this way
```
This detailed description might be of interest :)
I am still extremely puzzled by the fact that LLVM is not being used. The way LLVM deals with translation units cannot be emulated by divide and conquer into `.so` files, imho - without even taking into account the amount of unknowns we introduce vs a stable library.
On the other hand, though, the bootstrapping of an LLVM JIT/reload would probably have had a much higher cost... yet in the long term a much higher chance of not becoming an unused/half-baked feature...
update: there are just a few problems that need to be sorted out before making the first pull request. Also I'm already scheduled to spread the word about Nim at ACCU 2019 !!!
> I am still extremely puzzled by the fact that LLVM is not being used. The way LLVM deals with translation units cannot be emulated by divide and conquer into `.so` files, imho.
My hope in general would be that the code developed here helps that cause by cleaning up the hairy parts of the compiler that right now make it harder than it needs to be, and provides some of the infrastructure needed. A secondary hope would be that this `.so` stuff is kept fairly isolated - those two together would make it fairly easy to plug it into https://github.com/arnetheduck/nlvm .
My talk from ACCU 2019 where I talk about the hot code-reloading is online - I'm not 100% happy with it but here it is anyway :)
https://www.youtube.com/watch?v=7WgCt0Wooeo
hackernews: https://news.ycombinator.com/item?id=19738572
reddit: https://www.reddit.com/r/programming/comments/bgvbym/nim_first_natively_compiled_language_with_hot/
lobste.rs: https://lobste.rs/s/e0skes/
Hi! I've tried compiling a Jester-based webapp recently with Nim devel and `--hotCodeReloading:on`. It crashed because the `net` module uses `epochTime` from `times`, which is not available with the `-d:useNimRtl` flag.
So, my question is: will hot code reloading not work with web apps, since they all rely on the `net` module?
Another question: if my app is a single file, will I be able to benefit from hot code reloading? I've watched @onqtam's presentation, and AFAIU only imported modules are reloaded, not the main one. So, what if I only have one module, which is the main one?
Thanks!
@moigagoo anything that doesn't work with `-d:useNimRtl` won't work with HCR either.
As for the single-module program - try to refactor it into 2 modules, where the main one has almost nothing but contains the main loop, does the reloading, and calls into the functionality in the other.
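For illustration, here is a minimal sketch of such a split. The module name `app` and the proc `tick` are made up; on current Nim `performCodeReload` comes from the `hotcodereloading` module, while the experimental branch discussed in this thread may not have required the import.

```nim
# main.nim -- kept as small as possible: only the loop and the reload call
import hotcodereloading   # provides performCodeReload on current Nim
import app

while true:
  discard readLine(stdin)  # press Enter after recompiling app.nim
  performCodeReload()      # swap in the freshly compiled code
  tick()                   # all the real logic lives in the reloadable module
```

```nim
# app.nim -- everything you want to be able to edit at runtime
proc tick*() =
  echo "current behaviour - edit me and recompile"
```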
@moigagoo Please report any problems with `-d:useNimRtl`.
Cling-based C++ as a scripting language / hot code reload. Why? You can run a C++ script at runtime or compile it for max speed (as in this example: https://github.com/derofim/cling-cmake ).
Possible approaches to hot code reload:

- store app state, fix cling undo for files: https://root-forum.cern.ch/t/loading-unloading-class-as-interpreted-macro-in-cling-multiple-times/32976/2
- execute cling code to change callbacks & variables
- nested cling::Interpreter with multiple cling::MetaProcessor - IDK how to do it, but you can create a child cling::Interpreter
Are there any updates or documentation on how to use this? I found https://nim-lang.github.io/Nim/hcr but it does not compile. I tried to run a simpler example without the broken sdl2 code, but it says `could not load: nimhcr.dll`:
```nim
# main.nim
import logic

while true:
  echo getStr()
```

```nim
# logic.nim
proc getStr*(): string = return "hello!"
```

```
nim c --hotCodeReloading:on main.nim
```
Can I only run this on @onqtam's linked branch or is this in Nim master? Because `--hotCodeReloading:on` does exist in the currently stable (1.0.6) compiler.
You need the nimrtl and nimhcr libraries. You can compile them from `lib/nimrtl.nim` and `lib/nimhcr.nim` in your Nim installation directory.
Hey @SolitudeSF, thank you for the hint! I was able to successfully compile the two libs using `nim c --app:lib nimhcr.nim` and `nim c --app:lib -d:createNimRtl nimrtl.nim`, which produced the two files `nimhcr.dll` and `nimrtl.dll`.
Unfortunately my `main.exe` still says `could not load: nimhcr.dll` even with the two .dll's and my .exe all being in the same directory. 😞
EDIT:
I was incorrect. This was the old .exe complaining. I deleted it but a new .exe does not seem to be generated.
```
> nim c --hotCodeReloading:on -d:useNimRtl -o:main.exe main.nim
Hint: used config file 'C:\Users\kerskuchen\.choosenim\toolchains\nim-1.0.6\config\nim.cfg' [Conf]
Hint: system [Processing]
Hint: widestrs [Processing]
Hint: io [Processing]
Hint: main [Processing]
Hint: logic [Processing]
Hint: operation successful (14488 lines compiled; 0.141 sec total; 10.625MiB peakmem; Debug Build) [SuccessX]

> main.exe
'main.exe' is not recognized as an internal or external command,
operable program or batch file.
```
EDIT2:
I had to delete my nimcache. The `main.exe` is being generated again but still complains `could not load: nimhcr.dll` even with the two .dll's and my .exe all being in the same directory. It seems I am still not quite understanding the whole system.
fwiw, I found some early jit code lying around in my nlvm folder - haven't touched it for a while, but it's able to run simple nim programs using the JIT that comes with LLVM (orc) - put it in a PR in case someone wants to take a look: https://github.com/arnetheduck/nlvm/pull/18
Basically, it'll compile the nim code to llvm IR, then launch the eager orc JIT - in terms of sophistication, it's somewhere around chapter 1 of the JIT tutorial - but it does already have access to all the code optimizers so the actual code runs pretty snappy (though compilation then takes more time).
Next steps would be to change it into on-demand compilation, then integrate it with the AST so that it can start executing before all nim modules have been processed - after that, it should be a simple matter to replace bits and pieces of the application like a JIT would. This would generally be done without the dll reloading that HCR tries to do - it's a different approach. Fun project if someone wants to play around.
This will be amazing ^^
@arnetheduck I'm pretty new to Nim and not super familiar with LLVM's JIT or the features your branch provides, but this direction seems like it would be ideal for REPL-driven development. Additionally, I imagine it would speed up Nim's macro system (though I'm not sure, just basing this off the initial comments on this thread). Overall I think this will blur the lines between the "compile-time" and "run-time" worlds.
I think what's more interesting is to allow extending your application via Nim code interactively, i.e. rather than using Lua/Python/JS as an embedded scripting language: you can just use Nim, with the option of "keeping" the new code for future versions of your software, i.e. you can let your end user write code to extend your application interactively with Nim code, with the option to promote this new code to native code.
I know you can partially do what I described above with nimscript right now, but I think what's more of a challenge is promoting nimscript to actually compile to the native backend (C, C++, JS, etc.) rather than requiring it to sit on top of a run-time for your application
Also, I found this talk about C++ with JIT support super interesting - it's a pretty nice display of what could be done with a JIT.
My questions to others:

- Is there another direction people are headed toward to implement REPL-driven development with Nim? i.e. is `inim` going to be the solution for the future?
- When (or is) the LLVM backend going to be merged into the main project?
I will give it another very serious look once IC is reasonably complete. Mostly because then we have the infrastructure ready for external code generators.
> Is there another direction people are headed toward to implement REPL-driven development with Nim? i.e. is inim going to be the solution for the future?
I have no idea and I don't use REPLs. ;-)
You can achieve HCR with Nim without the use of the language runtime feature / relying on something like NLVM. I have done it in my game engine; you just have to know how to use `std/dynlib` effectively, as well as know how to do this in C/C++. I personally just use this library to get the job done.
If you want to see how I've done this, you can find my game engine here.
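For readers curious what this manual approach roughly looks like, here is a hedged sketch using `std/dynlib` (the library and symbol names are made up, the shared library would be built separately and export `update` with `{.exportc, dynlib, cdecl.}`, and this is not the actual code from the engine linked above):

```nim
import std/dynlib

type UpdateProc = proc (dt: float) {.cdecl.}

const libName =
  when defined(windows): "logic.dll"
  elif defined(macosx): "liblogic.dylib"
  else: "liblogic.so"

var
  lib: LibHandle
  update: UpdateProc

proc reloadLogic() =
  ## Unloads the previous version (if any) and re-resolves the entry point.
  if lib != nil:
    unloadLib(lib)
  lib = loadLib(libName)
  doAssert lib != nil, "could not load " & libName
  update = cast[UpdateProc](lib.symAddr("update"))
  doAssert update != nil, "symbol 'update' not found in " & libName

reloadLogic()
while true:
  update(1.0 / 60.0)
  # call reloadLogic() again whenever a newer build of the library is detected
```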
Any progress on this?
Updates? Anyone get hot reload working on karax?
This is the first grant proposal intended for the grants program made possible by the partnership with Status to support the development of Nim.
The goal of this proposal is the implementation of hot code reloading capabilities for the native targets of Nim (C/C++/Obj-C) and consequently a REPL-like tool and a fully-functional Jupyter kernel based on these capabilities.
I've recruited a suitable candidate for the grant - Viktor Kirilov, known for the creation of doctest, a popular unit-testing library for C++, and RCRL, a REPL-like environment for C++ based on similar mechanics to the ones described here.
Viktor is also a blogger and an experienced conference presenter (he will be giving a talk about RCRL later this year at CppCon). After completing the technical aspects of the project, he will prepare a demo program, a blog post and a technical talk promoting Nim and the newly developed capabilities. The content will be optimized to target game developers. A different volunteer will be selected to create similar promotional content intended for data scientists.
The timeframe of the grant will be 4 to 5 months of full-time work (Oct 2018 - Feb 2019, also known as Nim Winter of Code ;P). Milestones, described below, will be inspected along the way. I'll be mentoring Viktor and reviewing all of the submitted work. The proposed budget for the grant is 20,000 EUR.
Technical details of the proposal and suggested milestones
1. Support the `--hotCodeReloading` option in the native targets of Nim

Currently, hot code reloading is supported only in the JavaScript target, with semantics described in the Nim compiler user guide. This proposal intends to change the semantics to the following:
The `hotCodeReloading` option enables a special compilation mode where changes in the code can be applied automatically to a running program. The code reloading happens at the granularity of an individual module. When a module is reloaded, newly added global variables will be initialized, but all the other top-level code appearing in the module won't be re-executed and the state of all existing global variables will be preserved. One can use the special event handlers `beforeCodeReload` and `afterCodeReload` to reset the state of a particular variable or to force the execution of certain statements.

The code reloading event handlers can appear in any module multiple times. By default, on each code reload, Nim will execute all handlers defined in the entire program. If you want to prevent this behavior, you can guard the code through the `hasModuleChanged` magic.

The hot code reloading is based on dynamic library hot swapping in the native targets and direct manipulation of the global namespace in the JavaScript target. The Nim compiler does not specify the mechanism for detecting the conditions when the code must be reloaded. Instead, the program code is expected to call `performCodeReload()` every time it wishes to reload its code. The `hotcodereloading` module provides easy-to-use helpers for implementing hot code reloading in GUI applications, servers based on async event loops and other types of programs. The `nim-livereload` NPM package provides a convenient solution for JavaScript projects based on LiveReload or BrowserSync.
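As an illustration of these semantics, here is a hedged sketch of how the handlers might be used. The exact signatures - in particular `hasModuleChanged` taking a module name, and the hypothetical `gamelogic` module - are assumptions based on the wording above, not the final API.

```nim
import hotcodereloading
import gamelogic          # hypothetical module that gets edited and reloaded

var score = 0             # preserved across reloads by default

beforeCodeReload:
  echo "about to reload, current score: ", score

afterCodeReload:
  if hasModuleChanged(gamelogic):
    score = 0             # reset only when this particular module changed
  echo "reload finished"
```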
Rationale for the new semantics and description of the intended usage:

The new semantics were chosen because they are deemed more flexible than the previous ones:
1) Just like before, you have precise control over which variables and statements will be executed on each reload.
2) You can now execute code in module `A` even when the change happens in module `B`. You get precise control over when this should happen.

How does this work in practice? You would keep your code open in your favourite text editor. If you are making a change affecting only the behavior of certain functions, you just save your code and as soon as it's recompiled you'll see the new behavior in the running program (e.g. you can modify a view function in Karax or a rendering algorithm in a game and immediately see the results on the screen. See the famous talk Inventing on Principle by Bret Victor as an inspiration). If you want to send a command to a running program, just like in a REPL environment, you can modify some of the `afterCodeReload` blocks or add a new one. The specified code will be executed immediately with the next reload. You can use this to make arbitrary manipulations to the state of your program, and you can also use it to inspect the state of any variable by sending it to a logger or a visualization routine. Environments similar to SLIME may hide this interaction behind a simpler interface that sends lines of code for execution or evaluates selected expressions interactively.

Implementation details:
The codegen will ensure that all calls and global var references are routed through patchable jump tables. The program will be compiled to a dynamic link library (DLL, SO, etc.), which will be hot swapped on each reload. It will inspect and patch the jump tables immediately after being loaded. The required indirections are expected to lead to a small but tolerable performance hit which will affect only the development builds of the program. The initial implementation will target portable C, but some future directions will be provided for exploiting platform-specific mechanisms such as the hot patching support offered by some compilers. The initial implementation will signal any modification of a data type between two reloads as an error, but future versions may be able to support modifications to traced garbage collected objects by allocating new copies of these objects and assigning the newly added fields to a default value. Please note that the user may also use the `beforeCodeReload` and `afterCodeReload` event handlers to serialize the state of an arbitrary program to memory and then re-create it immediately after the reload.

2. Implement a REPL-like console for Nim
It may not be obvious, but the capabilities described in the previous section can be used to implement a REPL for Nim.
1) When the REPL is started, it creates an empty program file behind the scenes.
2) Each time you enter a line in the REPL, this line gets inserted either as a new global variable or inside an `afterCodeReload` handler. Entered expressions are automatically wrapped in an `echo repr(e)` call (a minimal sketch of this file generation follows the list below).
3) The REPL is responsible for recompiling the program behind the scenes and for cleaning up all compilation artifacts after the session is over.
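A hedged sketch of how a REPL front-end might regenerate that hidden program file on each entered line. The name `buildSessionFile` is made up, and only the expression case is shown:

```nim
import std/strutils

proc buildSessionFile(previousLines: seq[string], newEntry: string): string =
  ## Old entries are kept verbatim; the newest expression is wrapped in an
  ## afterCodeReload handler so it runs (once) on the next reload.
  result = previousLines.join("\n")
  result.add "\n\nafterCodeReload:\n"
  result.add "  echo repr(" & newEntry & ")\n"

echo buildSessionFile(@["var x = 10", "var y = 20"], "x + y")
```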
3. Implement a Jupyter kernel for Nim
1) Similar to a REPL, a Jupyter kernel will just have to maintain a behind-the-scenes program file where each cell is being compiled to a function.
2) Executing cells on demand is equivalent to briefly adding an `afterCodeReload` handler placing a call.
3) A library with overloads for different types is responsible for turning the Nim expressions at the end of each cell into visualizations that can be displayed by Jupyter (a sketch of this idea follows below).
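A hedged sketch of the overload-based visualization idea. The `display` name and the MIME-bundle shape are assumptions, loosely following how Jupyter kernels report rich output:

```nim
import std/json

proc display(x: int | float | string): JsonNode =
  ## Plain values fall back to their text representation.
  %*{"text/plain": $x}

proc display[T](xs: seq[T]): JsonNode =
  ## Sequences additionally get a small HTML rendering.
  var html = "<ul>"
  for x in xs: html.add "<li>" & $x & "</li>"
  html.add "</ul>"
  %*{"text/plain": $xs, "text/html": html}

echo display(@[1, 2, 3])
```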
4. Develop an interactive demo
This may be a project based on OpenGL, where the user can interact and modify a simple 3D scene by live editing the code. Suggestions and ideas are welcome.
5. Blog post
Viktor will publish a blog post describing the features and the developed demo and try to spread the word in all the relevant communities.
6. Tech talk
A tech talk will be prepared and proposed to a conference focusing on game development (Viktor has a background in game development and we believe this is the audience that will benefit the most from the new features).