sholsapp / gallocy

A distributed shared memory infrastructure.

Survey glibc and glibc++ to see which internally use memory. #15

sholsapp closed 8 years ago

sholsapp commented 8 years ago

As we add more dependencies on code that we didn't write (e.g., json, curl, sqlite), we run the risk of these libraries using glibc and glibc++ (i.e., libstdc++) functionality that internally allocates memory via direct calls to malloc or new. This is bad when we deploy our application with the libgallocy-wrapper.so library, which replaces all calls to malloc; that replacement is what lets us contain the application and manage its memory. If the libgallocy-core.so library itself uses malloc, its calls are intercepted as well, and we end up polluting the application's memory space. This is problematic for a number of reasons.

We've added a new module named glibc where we can place wrappers or reimplementations of the glibc and glibc++ functionality that the internal library depends on. One example of something the internal library depends on is gmtime_r, which internally used malloc. We need to survey glibc and glibc++ to determine (i) what we are using and (ii) whether it uses memory internally.
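Question (ii) can be probed empirically rather than by reading library source. The following is a minimal sketch, assuming a dynamically linked program on Linux with glibc: it defines its own malloc that counts calls and forwards to `__libc_malloc` (glibc's exported allocator entry point, an assumption about the platform), then reports how many allocations happen while a given function runs. The names `malloc_calls`, `mallocs_during`, `probe_strdup`, and `probe_gmtime_r` are hypothetical and not part of gallocy.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Assumption: glibc exports its real allocator under this name. */
extern void *__libc_malloc(size_t size);

/* Number of malloc calls seen so far (not thread-safe; sketch only). */
static volatile size_t malloc_calls;

/* Replacement malloc: count the call, then forward to glibc's allocator.
 * free, calloc, and realloc are left to glibc, which still owns the memory. */
void *malloc(size_t size) {
    malloc_calls++;
    return __libc_malloc(size);
}

/* Run fn and return how many malloc calls were made while it ran. */
size_t mallocs_during(void (*fn)(void)) {
    size_t before = malloc_calls;
    fn();
    return malloc_calls - before;
}

/* Volatile sink so the compiler cannot optimize the strdup/free pair away. */
static char *volatile sink;

/* strdup is documented to obtain its result from malloc. */
void probe_strdup(void) {
    sink = strdup("probe");
    free(sink);
}

/* gmtime_r is the function named in this issue; whether it allocates may
 * depend on the glibc version and on time zone state, so only measure it. */
void probe_gmtime_r(void) {
    time_t now = time(NULL);
    struct tm out;
    gmtime_r(&now, &out);
}
```

Calling `mallocs_during(probe_gmtime_r)` once per function found in the survey gives a rough yes/no answer for that function on the machine at hand. A zero count is not proof that the function never allocates, since some functions only allocate on first use or on particular inputs.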

We can discover what we're using by inspecting the built artifact.

$ nm install/lib/libgallocy-core.so | grep GLIBC
                 U abort@@GLIBC_2.2.5    
                 U accept@@GLIBC_2.2.5
                 U access@@GLIBC_2.2.5
                 U asctime_r@@GLIBC_2.2.5
                 U __assert_fail@@GLIBC_2.2.5
                 U bind@@GLIBC_2.2.5
                 U close@@GLIBC_2.2.5
                 U __cxa_atexit@@GLIBC_2.2.5
                 w __cxa_finalize@@GLIBC_2.2.5
                 U dlclose@@GLIBC_2.2.5
                 U dlerror@@GLIBC_2.2.5
                 U dlopen@@GLIBC_2.2.5
                 U dlsym@@GLIBC_2.2.5
                 U __errno_location@@GLIBC_2.2.5
                 U exit@@GLIBC_2.2.5
                 U fchmod@@GLIBC_2.2.5
                 U fchown@@GLIBC_2.2.5
                 ... lots more
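To turn a listing like this into a checklist for the survey, each line can be reduced to its symbol type and bare function name. The sketch below is one way to do that, assuming the default `nm` output format shown above (optional address column, one-letter type, then `name@@VERSION`); `parse_nm_line` is a hypothetical helper, not something that exists in gallocy.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Parse one line of nm output, for example
 *   "                 U abort@@GLIBC_2.2.5"
 * On success, store the one-letter symbol type in *type, copy the symbol name
 * without its "@@VERSION" suffix into name, and return 1. Return 0 if the line
 * does not look like a symbol entry or the name does not fit in name_size.
 */
int parse_nm_line(const char *line, char *type, char *name, size_t name_size) {
    const char *p = line;
    const char *tok;
    size_t len;

    /* Skip leading whitespace, then read the first token. */
    while (isspace((unsigned char)*p)) p++;
    tok = p;
    while (*p && !isspace((unsigned char)*p)) p++;

    /* A token longer than one character is the address column; the type
     * letter is then the next token. */
    if (p - tok > 1) {
        while (isspace((unsigned char)*p)) p++;
        tok = p;
        while (*p && !isspace((unsigned char)*p)) p++;
    }
    if (p - tok != 1) return 0;
    *type = *tok;

    /* The symbol name runs up to the version marker or trailing whitespace. */
    while (isspace((unsigned char)*p)) p++;
    len = strcspn(p, "@ \t\r\n");
    if (len == 0 || len >= name_size) return 0;
    memcpy(name, p, len);
    name[len] = '\0';
    return 1;
}
```

Lines with type `U` (undefined) or `w` (weak) are the ones resolved from the shared C library at load time, so those are the names to check for internal allocation.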
sholsapp commented 8 years ago

It's pretty clear that many parts of gallocy use both the C and C++ standard libraries. We'll need to invest time in the pattern we'll use to keep internal standard library usage separate from application standard library usage.

From some quick research, it seems the best way to achieve this is to statically link the standard libraries using the GCC flags -static-libgcc and -static-libstdc++ (see https://gcc.gnu.org/onlinedocs/gcc/Link-Options.html), along with malloc and free implementations compiled with visibility=hidden, so that gallocy's own use of the standard libraries resolves to those implementations. I'll update with experimental findings when I put together a minimal viable sample.
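As a rough illustration of the hidden-visibility half of this idea, here is a minimal sketch of a library-private allocator. It is an assumption-laden stand-in, not gallocy's implementation: the functions are named `internal_malloc` and `internal_free` (hypothetical names) rather than malloc and free so the sketch can be compiled and run on its own, the arena size is arbitrary, and the allocator is a simple bump allocator that never reuses memory. The `visibility("hidden")` attribute is the GCC/Clang mechanism that keeps a symbol out of a shared library's exported symbol table.

```c
#include <stddef.h>
#include <stdint.h>

#define INTERNAL_ARENA_SIZE (64 * 1024) /* hypothetical arena size */
#define INTERNAL_ALIGN 16               /* alignment of every returned block */

/* Private backing store: lives in the library's own data segment, so it is
 * separate from the heap that the application's allocator manages. */
static unsigned char internal_arena[INTERNAL_ARENA_SIZE]
    __attribute__((aligned(INTERNAL_ALIGN)));
static size_t internal_used;

/* Hidden visibility: callable from inside the shared object, but not
 * exported, so it cannot be interposed on or called by the application. */
__attribute__((visibility("hidden")))
void *internal_malloc(size_t size) {
    size_t rounded;
    void *block;

    if (size == 0) size = 1;
    if (size > INTERNAL_ARENA_SIZE) return NULL; /* also avoids overflow below */

    /* Round the request up to a multiple of the alignment. */
    rounded = (size + INTERNAL_ALIGN - 1) & ~(size_t)(INTERNAL_ALIGN - 1);
    if (rounded > INTERNAL_ARENA_SIZE - internal_used) return NULL;

    block = internal_arena + internal_used;
    internal_used += rounded;
    return block;
}

__attribute__((visibility("hidden")))
void internal_free(void *ptr) {
    (void)ptr; /* bump allocator: memory is never returned to the arena */
}
```

This only covers allocations that the library's own code requests explicitly. Whether calls made inside the standard library on the library's behalf also end up here depends on how that standard library is linked, which is the open question in this thread.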

sholsapp commented 8 years ago

I posted a question at http://stackoverflow.com/questions/35072500/malloc-function-interposition-in-the-standard-c-and-c-libraries that tries to capture the essence of this problem. It seems like what I want to do is possible, but strange, and might require that we compile a libc-equivalent like musl.

sholsapp commented 8 years ago

I'm closing this ticket since I've established that an application that uses any standard library function might internally use memory, and that this memory will always come from the memory allocator that is dynamically linked at runtime. There is no way around it. Developing a strategy for handling this is tracked in https://github.com/sholsapp/gallocy/issues/20, whether that strategy involves statically linking against a portion of libc or justifying why the internal library (the DSM implementation) is allowed to leak into application memory (for standard library usage).