edubart / nelua-lang

Minimal, efficient, statically-typed and meta-programmable systems programming language heavily inspired by Lua, which compiles to C and native code.
https://nelua.io
MIT License

Question about GC improvements and memory management #69

Closed · stefanos82 closed this issue 3 years ago

stefanos82 commented 3 years ago

I remember reading these two Nim articles, and they really blew my mind: https://nim-lang.org/blog/2020/12/08/introducing-orc.html and https://nim-lang.org/docs/destructors.html.

Do you think such an implementation, such as the destructor mechanism, could simplify Nelua's memory management for records even further?

As a programming language designer, @edubart, I would like to hear your thoughts on this topic.

edubart commented 3 years ago

First, as a Nim user for a good amount of time and a C++ user for more than a decade, I am very used to RAII, constructors/destructors, reference counting and smart pointers. I was even addicted to abusing such mechanisms for a good part of my programming life, so it was among my original goals for Nelua to offer the following 3 memory management mechanisms (a small sketch of the first two follows the list):

  1. Garbage collection (like Lua or Go)
  2. Manual memory management (like naive C)
  3. Automatic reference counting (like Nim ARC/ORC, Swift ARC or modern C++)
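For concreteness, here is a minimal sketch of how mechanisms 1 and 2 look in Nelua today, assuming the standard allocators.default and allocators.general libraries (treat it as an illustration only, the allocator APIs may still change):

-- Mechanism 1: garbage collection (the default): allocate and forget,
-- the GC frees the memory eventually.
require 'allocators.default'
local a: *integer = default_allocator:new(@integer)
$a = 1 -- use it, no explicit free needed

-- Mechanism 2: manual memory management (like naive C): every new
-- must be paired with a delete, otherwise the memory leaks.
require 'allocators.general'
local b: *integer = general_allocator:new(@integer)
$b = 2
general_allocator:delete(b)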

While developing Nelua I first made 1 and 2, then I even worked on 3, and it even shipped in Nelua master for some time (although experimental and undocumented). But while developing it, at some point I decided to cut it out, because as you code and design such mechanisms you notice the amount of complexity they create, not just in the compiler and the language design, but also in the cognitive load they cause for users and in the syntax. Plus, the standard library would become complex and not that efficient in some places (try to read the C++ standard library, for example; do you find it readable?). In summary, the language would not be so simple anymore with such systems, thus not that pleasant to code in and also not always efficient, and all of that diverges from Nelua's simplicity and efficiency goals. Moreover, automatic reference counting is not magic, nor as efficient as some assume: it trashes the CPU cache with reference count updates, so depending on your application, GC or manual memory management can be faster than reference counting.

Which memory model to choose depends entirely on the application requirements; none of the 3 options is always the best. The best always depends on your requirements: for some things you could use GC, for others manual memory management would be best, and for some special cases reference counting makes sense, and that can still be done manually in Nelua. It's just not in the language's goals to provide means to do this automatically, because it would hurt some principles, as I found in my research.
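Since reference counting can be done manually in Nelua, here is a minimal sketch of what that looks like (the rc_* helpers are hypothetical names, not a standard library API):

require 'allocators.general'

local RcObject = @record{
  refcount: integer,
  x: integer,
}

local function rc_new(): *RcObject
  local o: *RcObject = general_allocator:new(@RcObject)
  o.refcount = 1
  return o
end

local function rc_retain(o: *RcObject): *RcObject
  -- every retain/release writes to the object's memory, which is the
  -- cache trashing mentioned above
  o.refcount = o.refcount + 1
  return o
end

local function rc_release(o: *RcObject)
  o.refcount = o.refcount - 1
  if o.refcount == 0 then
    general_allocator:delete(o)
  end
end

local a: *RcObject = rc_new()     -- refcount = 1
local b: *RcObject = rc_retain(a) -- refcount = 2, shared reference
rc_release(a)                     -- refcount = 1
rc_release(b)                     -- refcount = 0, freed here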

There is also a 4th way that I think people should use more in the future:

  4. Custom allocators and handles (like Zig and Odin)

This 4th way is what I currently aim for in the future of Nelua in terms of better memory management. It's the most efficient and logical option for me today, and it can be easier and faster than manual memory management. It's the most logical when you think about how your hardware works. Nelua already has some allocators for this in the standard library, but their design is not finished yet, so people should stick with GC or manual memory management at the moment.
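As a taste of what is already there, here is a minimal sketch using the arena allocator from the standard library (as said above, its design is not finished, so the exact API may change):

require 'allocators.arena'

local Point = @record{ x: number, y: number }

-- a 4KB arena backed by a fixed buffer: allocations are just pointer
-- bumps inside that buffer, there is no heap allocation at all
local arena: ArenaAllocator(4096)

local p: *Point = arena:new(@Point)
p.x, p.y = 1, 2
print(p.x, p.y)

arena:deallocall() -- free everything at once, no per-object bookkeeping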

The best memory management mechanism is to never allocate in the first place: if you design your application with well thought out data structures, custom allocators and everything preallocated, you never need to allocate or free. I plan to demo more of how to do this with Nelua in the future; I already have some in-house games in Nelua doing no allocation at all, just using custom allocators, handles and fixed buffers. In this design there is no GC or reference counting cost, and leaks are impossible. The custom allocator can have data locality, which is even better for the CPU cache and efficiency. The code complexity is way lower than an ownership or reference counting system would be, in my opinion, and simpler than doing manual memory management, because you can't have leaks if you never allocate, and you can't have dangling pointers (use after free) if you use generational handles. This is a nice way software can be designed, in my opinion, while remaining simple, bug-free and efficient.
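To make the generational handles idea concrete, here is a minimal sketch of a fixed, preallocated pool in Nelua (all names here are hypothetical and just for illustration):

local MAX_OBJECTS <comptime> = 64

local Object = @record{ x: integer }

-- a handle is an index plus a generation, not a pointer
local Handle = @record{ index: uint32, gen: uint32 }

-- everything lives in fixed buffers, so nothing is ever allocated or freed
local Pool = @record{
  objects: [MAX_OBJECTS]Object,
  gens: [MAX_OBJECTS]uint32,
  used: [MAX_OBJECTS]boolean,
}

function Pool:acquire(): Handle
  for i=0,<MAX_OBJECTS do
    if not self.used[i] then
      self.used[i] = true
      self.gens[i] = self.gens[i] + 1 -- bump the generation on reuse
      return Handle{index=(@uint32)(i), gen=self.gens[i]}
    end
  end
  error 'pool exhausted'
  return Handle{} -- unreachable, error() above never returns
end

-- a stale handle (use after release) resolves to nilptr instead of
-- dangling memory
function Pool:get(h: Handle): *Object
  if self.used[h.index] and self.gens[h.index] == h.gen then
    return &self.objects[h.index]
  end
  return nilptr
end

function Pool:release(h: Handle)
  if self:get(h) ~= nilptr then
    self.used[h.index] = false
  end
end

local pool: Pool
local h: Handle = pool:acquire()
pool:get(h).x = 1337
pool:release(h)
assert(pool:get(h) == nilptr) -- the stale handle is safely detected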

Finally, I will share here some articles on the topic that express similar thoughts, for further reading:

https://www.gingerbill.org/article/2019/02/01/memory-allocation-strategies-001/
https://www.gingerbill.org/article/2020/06/21/the-ownership-semantics-flaw/
https://floooh.github.io/2018/06/17/handles-vs-pointers.html

stefanos82 commented 3 years ago

> First, as a Nim user for a good amount of time and a C++ user for more than a decade, I am very used to RAII, constructors/destructors, reference counting and smart pointers. I was even addicted to abusing such mechanisms for a good part of my programming life, so it was among my original goals for Nelua to offer the following 3 memory management mechanisms:

> 1. Garbage collection (like Lua or Go)
>
> 2. Manual memory management (like naive C)
>
> 3. Automatic reference counting (like Nim ARC/ORC, Swift ARC or modern C++)

> While developing Nelua I first made 1 and 2, then I even worked on 3, and it even shipped in Nelua master for some time (although experimental and undocumented).

I thought you would follow this route and order of choice due to the nature of game development and your accumulated experience in that field.

> But while developing it, at some point I decided to cut it out, because as you code and design such mechanisms you notice the amount of complexity they create, not just in the compiler and the language design, but also in the cognitive load they cause for users and in the syntax. Plus, the standard library would become complex and not that efficient in some places (try to read the C++ standard library, for example; do you find it readable?).

About reading the STL and C++ libraries in general... yeah, I feel you!

I could be wrong, but I have the impression that elite engineers and developers compete with each other for the sake of showing off, without paying attention to usability, let alone readability and comprehension.

> In summary, the language would not be so simple anymore with such systems, thus not that pleasant to code in and also not always efficient, and all of that diverges from Nelua's simplicity and efficiency goals. Moreover, automatic reference counting is not magic, nor as efficient as some assume: it trashes the CPU cache with reference count updates, so depending on your application, GC or manual memory management can be faster than reference counting.

Basically what I have had in mind is how RAII works, especially with modern C++ (smart pointers, etc.). Things got a lot "safer" compared to the old times of legacy code and the tricky low-level techniques you were forced to use to handle memory. It's nice to know you don't have to worry about releasing memory, that the language does it for you via RAII; whatever goes out of scope gets released.
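For comparison, the closest manual idiom in Nelua today is the built-in defer statement, which runs a block at scope exit; a minimal sketch (the Object record is just illustrative):

require 'allocators.general'

local Object = @record{ x: integer }

do
  local o: *Object = general_allocator:new(@Object)
  defer general_allocator:delete(o) end -- runs when the scope exits
  o.x = 1
  -- ... use o here ...
end -- o is released here, similar in spirit to RAII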

> Which memory model to choose depends entirely on the application requirements; none of the 3 options is always the best. The best always depends on your requirements: for some things you could use GC, for others manual memory management would be best, and for some special cases reference counting makes sense, and that can still be done manually in Nelua. It's just not in the language's goals to provide means to do this automatically, because it would hurt some principles, as I found in my research.

...but it can be implemented as an extension, that is, a language extension via metaprogramming, right?

> There is also a 4th way that I think people should use more in the future:
>
> 4. Custom allocators and handles (like Zig and Odin)

> This 4th way is what I currently aim for in the future of Nelua in terms of better memory management. It's the most efficient and logical option for me today, and it can be easier and faster than manual memory management. It's the most logical when you think about how your hardware works. Nelua already has some allocators for this in the standard library, but their design is not finished yet, so people should stick with GC or manual memory management at the moment.

> The best memory management mechanism is to never allocate in the first place: if you design your application with well thought out data structures, custom allocators and everything preallocated, you never need to allocate or free. I plan to demo more of how to do this with Nelua in the future; I already have some in-house games in Nelua doing no allocation at all, just using custom allocators, handles and fixed buffers. In this design there is no GC or reference counting cost, and leaks are impossible. The custom allocator can have data locality, which is even better for the CPU cache and efficiency. The code complexity is way lower than an ownership or reference counting system would be, in my opinion, and simpler than doing manual memory management, because you can't have leaks if you never allocate, and you can't have dangling pointers (use after free) if you use generational handles. This is a nice way software can be designed, in my opinion, while remaining simple, bug-free and efficient.

I learned something new today about custom allocators; cheers for sharing this valuable info.

I would love to see a demo around this concept to get a taste of how it works.

> Finally, I will share here some articles on the topic that express similar thoughts, for further reading:

> https://www.gingerbill.org/article/2019/02/01/memory-allocation-strategies-001/
> https://www.gingerbill.org/article/2020/06/21/the-ownership-semantics-flaw/
> https://floooh.github.io/2018/06/17/handles-vs-pointers.html

I heard a podcast with Ginger Bill, Andrew Kelley, and another guy whose name I cannot remember; they shared incredible feedback, ideas, and bottlenecks they all faced while trying to solve specific problems in the domains they were working on.

Andre's blog is a valuable resource on various topics. I already read https://floooh.github.io/2019/09/27/modern-c-for-cpp-peeps.html and enjoyed it.

I appreciate your thorough feedback @edubart; you are helping this "old" geek to finally embrace language design and implementation.

edubart commented 3 years ago

> I heard a podcast with Ginger Bill, Andrew Kelley, and another guy whose name I cannot remember; they shared incredible feedback, ideas...

Oh yeah, I've heard that podcast, and I've also read the Nim articles you mentioned and other blog posts from Andre a long time ago. They are all good.

> ...but it can be implemented as an extension, that is, a language extension via metaprogramming, right?

Probably. As an example, let's say you want to implement Lua 5.4 style "destructors", the to-be-closed variables. That is way simpler than the full destructor semantics found in C++, because only variables marked with the <close> annotation are "destroyed", by calling the __close metamethod at scope end. Implementing this feature by modifying the Nelua compiler through the preprocessor can serve as a good showcase example, as Nelua does not have this feature officially yet; a naive implementation via metaprogramming looks like the following:

##[[
local typedefs = require 'nelua.typedefs'
local tabler = require 'nelua.utils.tabler'
local visitors = require 'nelua.analyzer'.visitors
typedefs.variable_annots.close = true -- define the `close` annotation
-- hook original VarDecl node visitor in the analyzer
local orig_VarDecl = visitors.VarDecl
function visitors.VarDecl(context, node)
  local idnodes = node[2] -- list of identifier declarations nodes
  for _,idnode in ipairs(idnodes) do -- iterate over identifier declarations nodes
    local symbol = idnode.attr -- get identifier symbol
    if symbol.close then -- identifier symbol has `close` annotation
      -- create a defer call to __close method
      local callnode = aster.Defer{aster.Block{
        aster.CallMethod{'__close', {}, aster.create_value(symbol)}
      }}
      -- inject defer call after variable declaration
      local blocknode = context:get_parent_node() -- get parent block node
      local statindex = tabler.ifind(blocknode, node) -- find this node index
      table.insert(blocknode, statindex+1, callnode) -- insert the new statement
    end
  end
  -- call original VarDecl
  return orig_VarDecl(context, node)
end
]]

require 'allocators.general'

local Object = @record{
  x: integer
}

function Object:__close()
  print 'object destroyed'
  general_allocator:delete(self)
end

do
  local o: *Object <close> = general_allocator:new(@Object)
  -- "defer o:__close() end" is injected here
  print 'object created'
  -- o:__close() will be called automatically here
end

If you run the above program, you should get this output:

object created
object destroyed

Note that the __close call has been injected via metaprogramming and was not called explicitly. That said, I am not encouraging doing things like this, because it requires a lot of compiler knowledge and internal APIs (which may change), and all of that will remain undocumented until the day I think the compiler's internal APIs are stable enough.

The same thing could be done for full-fledged destructors, but it would be quite complex to do.
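For completeness, the same effect can be achieved today without touching the compiler at all, by writing the defer statement by hand instead of injecting it:

require 'allocators.general'

local Object = @record{
  x: integer
}

function Object:__close()
  print 'object destroyed'
  general_allocator:delete(self)
end

do
  local o: *Object = general_allocator:new(@Object)
  defer o:__close() end -- written explicitly instead of injected
  print 'object created'
end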