luke-gru / riml

Riml is a subset of VimL with some nice added features. It compiles to plain VimL.
MIT License
224 stars 6 forks source link

Cache riml_includes between compile_files #16

Closed dsawardekar closed 10 years ago

dsawardekar commented 10 years ago

I just updated to 0.3.2, and it helped me figure out a bunch of dups I had been using without realizing. Thanks for that!

I also made an improvement to Speckle to use Riml.compile_files directly instead of bundle exec riml. This speeds up compilation quite a bit, and leads me to this feature request.

I have quite a few tests that do a lot of riml_include 'foo' style includes. For instance I have a couple of classes in files, log_helpers.riml and delegate.riml and dsl.riml. These get include in nearly every spec at the top like so,

riml_include 'dsl.riml'
riml_include 'log_helpers.riml'
riml_include 'delegate.riml'

For simpler files this is not a big issue but for E2E tests the whole plugin is included which makes compile times quite slow.

Would it be possible to cache the result of a riml_include for the duration of a riml compile session?

To clarify further, say I have E2E tests to run with includes like,

In controller_a_spec.riml

riml_include 'a.riml'
riml_include 'b.riml'
riml_include 'c.riml'

In controller_b_spec.riml

riml_include 'a.riml'
riml_include 'b.riml'
riml_include 'c.riml'

Now say when I run,

riml -c controller_a_spec.riml controller_b_spec.riml

The compiler should compile controller_a_spec.riml and it's includes a, b and c. But for controller_b_spec.riml, it should just return the previously compiled a, b, and c includes, from in-memory cache.

Thus compiling controller_b_spec.riml does not incur the cost of compiling a, b and c. It only has to assemble the rest of the contents of controller_b_spec.riml

This feature would speed up compilation by an order of magnitude for Speckle, by at least a factor of 5!

luke-gru commented 10 years ago

This is a great idea, thanks! Also, it's easy to implement :smile:. I am going to have to wrap a mutex around the cache though, with the way things work now.

How much faster is compilation right now with Riml.compile_files vs shelling out? I'm curious because it's compiling each file that's given to Riml.compile_files in its own thread, and if it's just 2 files then thread creation overhead may negate any speed benefits. Regardless, most of the speed benefits probably come from not creating new shells.

I'm also going to use yard to document all public API methods like this, to make it easier to use Riml outside of a command-line setting like you're doing here.

I'll let you know when this is ready :)

dsawardekar commented 10 years ago

How much faster is compilation right now with Riml.compile_files vs shelling out?

Quite a bit. It eliminates the boot time of ruby. And it's saving is proportional to the size of the suite. :smile:

I'm curious because it's compiling each file that's given to Riml.compile_files in its own thread, and if it's just 2 files then thread creation overhead may negate any speed benefits.

I suspect I'm not seeing this as I am passing in one file at a time to compile_files. I have a Rake FileList rather than an array, so looping over it with each.

dsawardekar commented 10 years ago

I noticed a boost in compile times, Looks like this is working in master?

luke-gru commented 10 years ago

Hey @dsawardekar, this is indeed working in master, I'm glad you noticed a speed increase :smile: I'll make a release tomorrow off, and this will be included.

Thanks again for the suggestion.

dsawardekar commented 10 years ago

Thanks. I will update Speckle, this is an awesome update! :)

dsawardekar commented 10 years ago

@luke-gru About this update, It shaved about 20 seconds of the build time of Portkey! So this was a really nice update.

Not trying to be greedy, But I think you can do even better! The time taken for a complete compile of Portkey is about 5 seconds. About 1 second for plugin.riml and 4 seconds for portkey.riml which is the autoloaded main application.

Yet the whole test suite compilation takes about 40-50 seconds. Some of the tests include the entire app with portkey_inc.riml which a file that includes all the individual classes. It still feels like it is compiling the individual includes.

In contrast running the test suite in Speckle zips by in under 1 second, compiling about 10+ files. Speckle does not have any nested includes.

Does the the riml_include cache take into account nested includes?

luke-gru commented 10 years ago

Hmm, this is interesting, as I suspected it did this already, but I didn't benchmark anything at all, so I kind of tested it the lazy way.

If you take a look at https://github.com/luke-gru/riml/blob/master/test/integration/riml_commands/compiler_test.rb#L287, the mocked cache object I'm using during testing is apparently called with the nested includes, but something must be wrong if it's taking so long to compile.

I'll take a look at this, and make sure to add some benchmarks to the testing process, as the compile times are a little worrying right now, 40-50 seconds is pretty bad. I'm sure we can do better than that :smile:

EDIT: I didn't know about Portkey! Looks awesome, I'm going to check it out.

dsawardekar commented 10 years ago

I'll take a look at this, and make sure to add some benchmarks to the testing process, as the compile times are a little worrying right now, 40-50 seconds is pretty bad. I'm sure we can do better than that :smile:

Would assembly after compilation take too much times? If you run rake in the Portkey project you'll see it will start off pretty fast with the initial files, then when it reaches the controllers it's slows down. These are the tests that include portkey_inc.riml.

You can even see this isolated with,

$ speckle spec/controllers/commands/alternate_file_command_spec.riml spec/controllers/commands/related_file_command_spec.riml

These are E2E tests. Both these specs are very similar in size. Compiling the alternate command should compile the entire app. The related command should then be just a quick cache lookup + the tests in the file itself which is only a handful of methods.

The related command still takes a 2-3 seconds to compile. Adding a number of these commands is what is adding up to 40+ seconds in the whole suite.

dsawardekar commented 10 years ago

@luke-gru Compile times now down to 12 seconds in Portkey in master. Awesome work! Are you certain there were no sleep(10) calls in there? :wink:

luke-gru commented 10 years ago

haha, thanks. No sleep calls :smile:, it turns out it was still doing lots of unnecessary ATS rewrites though to pick up classes during riml_include. I made the caching better, and then to top it off there was some low-hanging fruit optimizations that I made regarding Riml.source_path and Riml.include_path caching as well. Before, whenever the compiler hit a riml_source or riml_include line, it would go over each directory in the path and check the filesystem to see if the files existed. All the filenames in both paths are now cached when they're set, so this saves some time too.

dsawardekar commented 10 years ago

Neat stuff. I'm glad I kept asking for more. At 12 seconds it's around the same time it takes the full suite to run. Can't ask for more now! :)