JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.68k stars 5.48k forks source link

Precompilation should not be recursive #12545

Closed sbromberger closed 9 years ago

sbromberger commented 9 years ago

Hi,

I ran into an issue with my package when I included VERSION >= v"0.4.0-dev+6521" && __precompile__() per https://groups.google.com/d/msg/julia-users/RLlYPlsT-dU/mhdukrp-AgAJ

The issue is that this precompilation directive will also cause all dependent modules to precompile. In my case, one of them is not (and will likely never be) precompile-safe. This means that my package can't be precompiled at all unless I get the dependent package to explicitly disable precompilation via VERSION >= v"0.4.0-dev+6521" && __precompile__(false).

I'd really like to test/use precompilation for my own package, but I don't want to have to file PRs for every dependent non-precompile-safe package to explicitly disable precompilation.

Could the current behavior be changed to disable precompilation for dependent modules unless there's an explicit __precompile__() statement for it?

See also https://groups.google.com/d/msg/julia-users/RLlYPlsT-dU/Hxya41QWJwAJ & https://github.com/JuliaLang/LightXML.jl/pull/31#issuecomment-129607038 et seq.

johnmyleswhite commented 9 years ago

This is reasonable, but it seems like we should not allow any packages going forward not to list their precompilation compliance.

vtjnash commented 9 years ago

no, you only get precompilation if all of your dependent packages are precompilable. why don't you want to fix LightXML to handle precompilation? there are many packages that have been updated to handle this, such as https://github.com/stevengj/PyCall.jl.

ref https://github.com/JuliaLang/julia/issues/12508#issuecomment-128893171

sbromberger commented 9 years ago

@vtjnash -

why don't you want to fix LightXML to handle precompilation?

because it's not mine to fix (see https://github.com/JuliaLang/LightXML.jl/pull/31#issuecomment-129606752), and it's stopping LightGraphs.jl (unrelated) from being able to use precompilation?

If we want wider adoption of precompilation we should make it so that it's not dependent on third-party modules' compliance. As it is now, I can't use it and have to back it out where it would really be of benefit, because one of my dependencies isn't precompile-safe. A pity.

This problem as applied to LightXML alone prevents 11 Julia packages from being able to precompile.

sbromberger commented 9 years ago

(Is there a reason you closed this?)

quinnj commented 9 years ago

I'm no precompilation expert (yet), but I think the whole idea is that you only get precompilation if you can precompile all your dependencies. Like, imagine if Gadfly was "precompilable" while all it's dependencies weren't, you'd still be waiting ~30s to load the package. I think it's an all or nothing kind of feature.

sbromberger commented 9 years ago

@quinnj I understand, but in my case I think everything other than LightXML is precompilable. This reduces load time from ~5 seconds to near-instantaneous, even when LightXML isn't precompiled. There are advantages to having only a subset of dependencies able to precompile. Being able to provide testing feedback for the feature is another benefit.

If it's all or nothing, that means that widespread precompilation testing can only occur when some critical mass of packages are precompile-safe. Maybe we're there now, but again - there are at least 11 packages that can't precompile because of the LightXML dependency.

sbromberger commented 9 years ago

@vtjnash

no, you only get precompilation if all of your dependent packages are precompilable

After more thought, I don't understand what that means. Consider the following:

module Foo uses module Bar, which in turn uses module Baz. Baz is not precompile-safe. Neither Bar nor Baz have any precompile directives.

If Foo attempts precompilation, .ji files are created for all three modules. But because Baz is not precompile-safe, seemingly random errors now start occurring in Foo that didn't occur prior to the introduction of the seemingly-innocuous precompile directive.

More worryingly, if the errors aren't induced by Foo's tests, there will be no indication until runtime that anything's wrong, and even then it will not be obvious that the problem is due to the precompile-unsafeness of Baz.

This is a serious issue that probably deserves more discussion.

johnmyleswhite commented 9 years ago

I believe Jameson's point is that even a single package that includes __precompile(false)__ will prevent a chain of downstream dependencies from precompiling.

That behavior is correct, if potentially inconvenient. It merely requires that the authors of unsafe packages flag those packages as unsafe. That still hasn't happened, but it's not an issue with Base.

In a future iteration of Julia, perhaps something nicer is possible. But we're about to have a feature freeze, so we can't expect to make this feature more complex before 0.4.

sbromberger commented 9 years ago

I believe Jameson's point is that even a single package that includes precompile(false) will prevent a chain of downstream dependencies from precompiling.

This is correct behavior, but it is not complete behavior. The issue is, as you point out, unsafe packages aren't flagged - and this precludes anyone who's using these packages from precompiling their own modules. My suggestion was to precompile only modules with an explicit precompile() directive, regardless of dependency.

Making this change would allow those of us who want to help test this great new feature on our own code to do so without having to drag other code maintainers along for the ride.

tkelman commented 9 years ago

Precompilation is still a bit risky for this reason. It's not so innocuous - package authors better test one way or the other before enabling it for their packages. If you're testing this before your dependencies have done so, then it's especially risky for you.

We should expand on the docs for this feature if they aren't detailed enough yet. I could get behind being more conservative as you suggest and only auto-precompiling dependency modules that are explicitly marked as safe. But that could be a controversial change.

sbromberger commented 9 years ago

@tkelman what this means is that, as a package maintainer, I cannot even begin to test my precompilation safety until all of my dependencies have tested (and resolved) theirs. A cursory look at the dependency graphs created by MetadataTools should convince you why this is a suboptimal strategy for rolling out a really important feature.

mlubin commented 9 years ago

@sbromberger, sure you can test it, just don't release it out to users before you've confirmed that it's working and safe. Marking packages as precompilation safe/unsafe is going to soon be a responsibility of all package authors.

sbromberger commented 9 years ago

@mlubin - How do I test it when it won't pass tests due to the failure of one of the dependencies?

mlubin commented 9 years ago

Disable precompilation in the dependencies locally?

sbromberger commented 9 years ago

Sorry - not gonna happen. I'm not going to start creating local copies of these packages that diverge from whatever Pkg.add() installed.

What this means is that LightGraphs will be marked as unsafe for precompilation until such time as all its dependencies have been evaluated and I have had a chance to make sure my code is precompile-safe. I guess I'll be ready to go sometime in 2016.

sbromberger commented 9 years ago

(and I really don't understand the objection to limiting precompilation right now to just those modules that have a precompilation directive. What am I missing here?)

quinnj commented 9 years ago

@sbromberger, that's a pretty pessimistic view. There's a couple of different options here that require less feature-demanding of open-source volunteers:

In the mean time, we can keep discussing, I'm just trying to point out other options available to you here.

sbromberger commented 9 years ago

@quinnj you're right - sorry if I let my frustration come through in my penultimate post. I'm certainly open to working with the other package maintainers (and have even submitted a PR related to this already).

My frustration with this stems from the fact that I had such great performance gains with precompilation - and it all failed when I changed my dependencies. It's now a potentially recursive problem to solve, and it doesn't have to be.

Perhaps I'll take a look at the precompile logic and suggest a PR there that skips dependencies. Thanks for your patience and support during my mini-rant.

johnmyleswhite commented 9 years ago

(and I really don't understand the objection to limiting precompilation right now to just those modules that have a precompilation directive. What am I missing here?)

I don't think this is a goal for the system, but a technical constraint.

quinnj commented 9 years ago

No worries @sbromberger :) The way I see it, this is a potential "nice to have" kind of feature, but not critical, so those can sometimes be tough to get done unless someone really feels the need for it and just does it themselves.

sbromberger commented 9 years ago

Edit: need to suppress the errors as well, at a minimum.

stevengj commented 9 years ago

Because of the possibility cross-module inlining, I think it may be technically quite tricky to make it so that a precompiled module can (safely) import a non-precompiled module.

timholy commented 9 years ago

Yes, this is the reason it's a problem.

@sbromberger, I'm guessing you use LightXML for I/O? One option would be to consider FileIO, which has been designed to explicitly allow you to decouple your package from I/O dependencies. It's not yet registered, but once I achieve success with a 2nd package (locally I have it working for JLD, am working on ImageMagick) then I'll register and tag.

stevengj commented 9 years ago

For this particular case, LightXML will be safe to precompile shortly (JuliaLang/LightXML.jl#32).

sbromberger commented 9 years ago

@timholy We use LightXML to parse GraphML files - I think this is a little more than I/O but am happy to explore alternatives.

ETA: but it appears that due to @stevengj's work, we may not have to :) (thank you!)

sbromberger commented 9 years ago

PS, just in case anyone is still reading: @stevengj's answer here:

Because of the possibility cross-module inlining, I think it may be technically quite tricky to make it so that a precompiled module can (safely) import a non-precompiled module.

was what I was looking for earlier on and what made this all click for me that this wasn't merely an issue of convention.

timholy commented 9 years ago

Macros, too (a specific form of inlining). If you had B depends on A, and you allowed B to precompile if A couldn't, then you could get into this situation:

and then B would not be properly updated.

mbauman commented 9 years ago

Even if A isn't precompiled, couldn't it still be listed as a dependency in the ji file for B with a stored timestamp (assuming https://github.com/JuliaLang/julia/pull/12559)? Then modifications would still invalidate the cache, but it'd allow for non-precompiled dependencies.

timholy commented 9 years ago

Hmm, good point. Other cases (evaling into another module) might be trickier, but @vtjnash would have to comment here. With "version control" this might be more doable (but not for 0.4).

stevengj commented 9 years ago

@mbauman, dependencies are not the problem here. (You're correct that the .ji file can include arbitrary dependencies, including files from other non-compiled modules if needed.)

The problem is that if module Foo includes Bar, but Bar is not safe to precompile, then code inlined from Bar into Foo might be unsafe as well (i.e. the precompiled Foo might produce incorrect results or crash). As a result, you can only safely compile Foo if Bar is precompile-safe, and if this is the case then you might as well precompile Bar as well.

mbauman commented 9 years ago

Of course, that makes perfect sense. Thanks for the clarification.

sbromberger commented 9 years ago

I'm still leaning very much toward making precompilation default to false right now so that we don't get errors in packages that are precompile-safe and have assumed incorrectly that their dependencies are precompile-safe. That is, as someone who obviously doesn't understand the nuances of precompilation, I'd want to make sure my dependencies have explicitly tested and marked their modules safe before my module gets precompiled.

Right now the absence of a directive in a dependency will result in its precompilation, which may or may not introduce huge errors upstream.

vtjnash commented 9 years ago

Steven is correct. Precompilation is an inductive process, and you can't delete one step of the inductive reasoning and still keep the conclusion. So if the previous stage can't be precompiled, than neither can anything that depends on it. Whereas if the previous stage can be precompiled, then you might as well create the file since it has already done the work and you can get the precompiled version of the dependency "for free".

If you remove the check that ensures all of the dependencies are precompiled, you are more likely to see corruption and segfaults (or a slower version of the same). Whereas if you try to load them all from the cache, you should be more likely to see warnings and julia errors (and a faster load).

(inlining is one of the concerns, but it's not the only one. although you might be able consider the others to be special cases of inlining, such as how a macro can be used to inline a function call).

axsk commented 7 years ago

I feel the urge to reopen this issue. We have precompilation for a while now, but so far I couldn't precompile a single package of mine because of some precompile-unsafe dependencies, whose code isn't even called. For my current (smaller) project this results in startup times of 30 seconds (not what I'd expect to be in 2x C range)...

I really hope we won't just settle with a "it's not doable any better".

StefanKarpinski commented 7 years ago

If you don't call code from the dependencies, why are they dependencies?

axsk commented 7 years ago

They are dependencies (b) of dependencies (a), of which (a) I only use a small parts, which themselves don't rely on (b).

On 28 Jul 2017 3:20 p.m., "Stefan Karpinski" notifications@github.com wrote:

If you don't call code from the dependencies, why are they dependencies?

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/JuliaLang/julia/issues/12545#issuecomment-318649879, or mute the thread https://github.com/notifications/unsubscribe-auth/AFh7hisSdvUCnWS8mf56MRAv0Nl1JDGTks5sSeAOgaJpZM4FpDl- .

StefanKarpinski commented 7 years ago

That's annoying, but it seems like a package factoring problem, not a language problem.

stevengj commented 7 years ago

@axsk, out of curiosity, what common dependency these days is not precompile-safe?

musm commented 7 years ago

the biggest one that I am aware of is LibExpat

vtjnash commented 7 years ago

Requires.jl exists to also help with that problem

stevengj commented 7 years ago

(Note that LibExpat is now precompile-safe as of JuliaIO/LibExpat.jl#83. Because making packages precompile-safe is generally straightforward, I don't think "patch your dependencies" is too onerous a solution to this issue.)