Closed tomerk closed 8 years ago
At a high level this looks pretty good. I'll provide some more detailed comments on the code shortly.
A couple things I'm missing: 1) How do you think a user would actually use the autocaching rule? Can you add it as an Optimizer option or something? 2) In that same vein - it might be nice to actually test this on (say) the VOC pipeline and ensure that a) profiling/caching is reasonably fast and b) that the outcomes are correct. The bulk of stuff needed for that is in my dag-estimation branch.
Given that auto-caching adds profiling overheads that we don't understand fully, can we make it so that it is not a default optimization? It would be great to then add a flag somewhere that allows it to be turned on easily, so that users can find out if it helps their pipeline.
Alright - I have given this a pretty thorough read through, and I think it is basically correct, modulo the few minor issues I pointed out. Will give it a second read for style, but if you could address the two things I pointed out (both of them may be fine, I just wanted to double check) then I think we are good to merge this one.
@etrain So do you want me to make a System flag -Dkeystoneml.autocache
, or just remove auto-caching from the default optimizer and provide a separate optional optimizer that does auto-caching?
I think resisting system flags as long as possible is a good idea. The latter proposal is good!
On Sat, Mar 12, 2016 at 11:33 AM, Tomer Kaftan notifications@github.com wrote:
@etrain https://github.com/etrain So do you want me to make a System flag -Dkeystoneml.autocache, or just remove auto-caching from the default optimizer and provide a separate optional optimizer that does auto-caching?
— Reply to this email directly or view it on GitHub https://github.com/amplab/keystone/pull/224#issuecomment-195794996.
Okay @etrain, done
Awesome work. Thanks @tomerk!
Added auto-caching rules w/ aggressive naive, and greedy profiling strategies @etrain @shivaram