dkubb / axiom

Simplifies querying of structured data using relational algebra
https://github.com/dkubb/axiom
MIT License

Dependency management #41

Closed zirni closed 10 years ago

zirni commented 10 years ago

Hello everyone,

I'm looking at https://github.com/dkubb/axiom/blob/master/lib/axiom.rb right now. The first thing I notice is that all external dependencies and all files in lib are required up front. Is there a good reason to avoid using autoload in this file? I think it would be good for users of the library if the gem deferred loading like this, especially for startup times.

Thanks :)

Best regards, zirni
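For readers unfamiliar with the deferral being asked about, here is a minimal sketch of how autoload postpones loading a file until its constant is first referenced. The file and the constant `AutoloadDemoRelation` are hypothetical stand-ins written to a temp directory; they are not part of axiom.

```ruby
require 'tmpdir'

# Hypothetical file standing in for one of a gem's lib files.
dir  = Dir.mktmpdir('autoload_demo')
path = File.join(dir, 'autoload_demo_relation.rb')
File.write(path, "class AutoloadDemoRelation; end\n")

# Register the constant. The file is not read at this point.
Object.autoload(:AutoloadDemoRelation, path)
pending_before = Object.autoload?(:AutoloadDemoRelation) # truthy while still pending

# The first reference to the constant triggers the actual load.
klass = AutoloadDemoRelation
pending_after = Object.autoload?(:AutoloadDemoRelation)  # nil once loaded
```

A plain `require path` in the same place would have loaded the file immediately, which is what lib/axiom.rb does for every file.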

dkubb commented 10 years ago

@zirni the main reason is that autoload is not thread safe across all ruby implementations. If multiple threads autoload a lib at the same time there's a race condition and you can get unexpected results.

Here's a link to a post on stack overflow with some more info:

http://stackoverflow.com/a/6555010/155270

This is a quote from the first ticket linked in that post (by Charles Nutter, the JRuby maintainer):

Currently autoload is not safe to use in a multi-threaded application. To put it more bluntly, it's broken.

This library was written with the intention that people would be using it in a threaded environment. Most of the objects are immutable and can be shared freely between different threads. In the future some of the in-memory operations will be parallelizable because they can be broken down into smaller tasks and distributed to multiple threads; we will also be exploiting this to handle joining results from multiple datastores.
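As a rough illustration of the idea (not axiom's actual implementation), the sketch below shares a frozen set of tuples between threads, splits an in-memory operation into smaller tasks, and combines the partial results. The data, the slice size, and the summing operation are all assumptions chosen for the example.

```ruby
# Hypothetical data: a deeply frozen list of [id, value] tuples.
tuples = (1..1_000).map { |id| [id, id * 2].freeze }.freeze

# Split the work into slices and sum the value column of each slice
# in its own thread. The tuples are immutable, so no locking is needed.
partial_sums = tuples.each_slice(250).map do |slice|
  Thread.new { slice.sum { |_id, value| value } }
end.map(&:value)

# Combine the partial results from all threads.
total = partial_sums.sum
```

Because nothing can mutate the tuples, each thread can read its slice without coordination; only the final combination step happens in one place.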

I do wish this wasn't the case though. It would be nice to have the option. As it stands I can't trade thread-safety for autoloading. FWIW I have found that developing with guard and zeus has been a wonderful experience in cases where I wanted some things to be persistent, but I wanted my own code to be reloaded as I make changes.

mbj commented 10 years ago

@dkubb Heh, I was just about to type a long answer like yours. Good thing GitHub has "auto update", saving me some minutes :D

zirni commented 10 years ago

@dkubb thanks for the quick answer :)

Maybe with Ruby 2.0 we'll be able to precompile gems into bytecode format, perhaps together with a convenient bundler command. Here's a link about the upcoming Ruby feature:

https://www.ruby-forum.com/topic/4412461

dkubb commented 10 years ago

That would be awesome. I've kind of always wished that ruby had an actual compilation step, so that instead of requiring everything at runtime, it could load some compressed, compiled code up-front.

I've found in most apps there's a clear line between requiring a bunch of stuff, and then normal runtime. Besides autoloading, it's rare to require things at runtime dynamically. It's normal that the require phase is done in a single thread, and multiple threads aren't created until runtime.

Another nice approach would be a way of starting ruby where you specify you want things eager loaded. Then if any autoload statements are encountered during the "load phase" they could just be required immediately. A development mode could still do these things lazily assuming it was single threaded.

mbj commented 10 years ago

@dkubb Yeah, I feel the same. Additionally, I'd love to have a pivot call where I instruct the VM to drop all metaprogramming besides instance_variable_{get,set}, send, and public_send, so it could optimize like any other "not so dynamic language". Most metaprogramming in my latest code is "finished" after loading the main lib with its dependencies.

zirni commented 10 years ago

@dkubb Agreed. So would you favor a more restrictive mechanism for autoload's functionality instead of just dropping it from the Ruby codebase, as matz announced earlier? We could let require behave like autoload when it's a single-threaded environment and a special option is set. I'm just thinking about circumstances where that could be a problem :) @mbj is that why you want to control the VM? :)

dkubb commented 10 years ago

@zirni for some autoload-like behaviour you'd probably still need to specify which constants map to which files, because the mapping isn't always 1:1: some files contain multiple constants, and sometimes the constant names aren't even close to the name of the file. I'd guess we'd basically need autoload's current API, only controlled by an extra switch that tells it either to eager load everything or to lazily load constants as they are needed.
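One way to picture such a switch is the sketch below. The `register` helper and its `eager:` flag are hypothetical and exist neither in Ruby nor in axiom; the helper keeps autoload's explicit constant-to-file mapping and only changes when the file is loaded. The two demo files are throwaway stand-ins.

```ruby
require 'tmpdir'

# Hypothetical helper: same constant-to-file mapping as autoload,
# with a switch that decides between eager and lazy loading.
register = lambda do |const_name, path, eager:|
  if eager
    require path                       # load immediately, during the load phase
  else
    Object.autoload(const_name, path)  # defer until the constant is first referenced
  end
end

# Two throwaway files standing in for library files; each records when it loads.
dir = Dir.mktmpdir('switch_demo')
$switch_demo_loaded = []
paths = {}
%w[SwitchDemoEager SwitchDemoLazy].each do |name|
  file = File.join(dir, "#{name.downcase}.rb")
  File.write(file, "$switch_demo_loaded << :#{name}\nclass #{name}; end\n")
  paths[name.to_sym] = file
end

register.call(:SwitchDemoEager, paths[:SwitchDemoEager], eager: true)
register.call(:SwitchDemoLazy,  paths[:SwitchDemoLazy],  eager: false)

loaded_after_registration = $switch_demo_loaded.dup # only the eager file so far
lazy_class = SwitchDemoLazy                         # first reference loads the lazy file
loaded_after_reference = $switch_demo_loaded.dup
```

In a threaded production process the flag would be set to eager for every constant, so that all files load before any threads start; a single-threaded development process could leave it lazy.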