IntelLabs / HPAT.jl

High Performance Analytics Toolkit (HPAT) is a Julia-based framework for big data analytics on clusters.
BSD 2-Clause "Simplified" License

Precompilation #12

Open timholy opened 8 years ago

timholy commented 8 years ago

Nice talk at JuliaCon 2016! Quite amazing work you folks are doing.

If you're not aware of it, there's at least some chance that SnoopCompile.jl might help: it will give you a list of `precompile` statements for different packages, and perhaps the combination of them might reduce your time-to-first-run. If it doesn't, AFAIK the key problem is how Julia currently caches compiled code in packages: if I have a module `Amod` that defines a type `A`, then precompilation won't preserve code for a function `foo` defined in some other module. This even applies to Base, so for example the code for `push!(::Vector{A}, ::A)` won't get saved, because `A` is defined in `Amod` but `push!` is defined in Base.
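A minimal sketch of that scenario, reusing the illustrative names `Amod` and `A` from above (SnoopCompile's output is essentially a list of `precompile` calls like the one shown; the comment restates the caching limitation described here, it is not something the snippet demonstrates by itself):

```julia
module Amod

struct A
    x::Int
end

# SnoopCompile emits statements of this form. It asks Julia to compile
# push!(::Vector{A}, ::A) during precompilation, but because push! is owned
# by Base rather than Amod, the compiled code for this specialization is not
# saved in Amod's precompile cache -- the limitation described above.
precompile(push!, (Vector{A}, A))

end # module
```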

In such situations, the `userimg.jl` trick or building a statically compiled application seem to be your only alternatives. That might be viable in a "deployment" environment.
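For context, the `userimg.jl` trick amounts to baking a package into the Julia system image. A rough sketch of the idea, assuming a Julia source checkout (the exact form depends on the Julia version):

```julia
# base/userimg.jl -- this file is included while the system image is built
# (e.g. by running `make` in the Julia source tree), so anything loaded and
# compiled here is baked into the sysimage and needs no recompilation later.
using HPAT
```

The corresponding downside, noted in the reply below, is that the system image has to be rebuilt whenever the package changes.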

ehsantn commented 8 years ago

Thank you for the feedback, Tim!

I have tried SnoopCompile before. Unfortunately, the performance problem is in type inference, as I mentioned in person. Precompilation doesn't work because of exactly the problem you are describing, and `userimg.jl` is only partially useful since the package is evolving rapidly.

BTW, since the type inference algorithm is less recursive, maybe SnoopCompile could be expanded for type inference as well. What do you think?

timholy commented 8 years ago

Glad you already knew about it, and sorry it doesn't help.

> BTW, since the type inference algorithm is less recursive, maybe SnoopCompile could be expanded for type inference as well. What do you think?

I'm not sure I fully understand your meaning; do you mean, as a way to benchmark the time spent on inference? That does seem helpful. I believe @carnival has a julia branch that adds hooks in several places for monitoring the time of the various stages of building; maybe he can post it somewhere :smile:.

ehsantn commented 8 years ago

Yes, benchmarking time spent on inference of different functions is definitely useful.
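As a crude, hedged sketch of per-signature inference timing (`my_kernel` below is a made-up stand-in, not anything from HPAT):

```julia
# Stand-in for an HPAT-generated function (hypothetical).
my_kernel(X::Matrix{Float64}) = sum(X, dims=1)

# Base.return_types runs type inference for the given argument types, so
# timing a *cold* call gives a rough upper bound on inference cost for that
# signature. Inference results are cached, so only the first measurement
# per signature is meaningful.
t = @elapsed Base.return_types(my_kernel, (Matrix{Float64},))
println("approx. inference time: ", t, " s")
```

A proper tool would hook into inference itself, which is presumably what the branch with timing hooks mentioned above would provide.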

timholy commented 8 years ago

IIRC, on average about 40% is LLVM, 10% is inference, 30% is parsing & lowering (which you don't have if precompiled), and the rest is everything else. But there might be informative differences for different functions.
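One rough way to see those per-function differences (an illustrative sketch; the function and data are placeholders) is to compare the first call, which pays inference and LLVM codegen for that signature, against a later call, which does not:

```julia
# Placeholder function and input (illustrative only).
f(x) = sum(abs2, x)
x = rand(10^6)

t_first  = @elapsed f(x)   # includes inference + LLVM codegen for this signature
t_second = @elapsed f(x)   # pure run time
println("approx. compile overhead: ", t_first - t_second, " s")
```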

ehsantn commented 8 years ago

For ParallelAccelerator and HPAT, inference's share is much larger (around 70%, I think) because they are not numerical codes.