comonicon / Comonicon.jl

Your best CLI generator in JuliaLang
https://comonicon.org
MIT License
282 stars 25 forks source link

[BUG] Infinite build time on Julia 1.10 #249

Closed MilesCranmer closed 1 year ago

MilesCranmer commented 1 year ago

Describe the bug

I'm trying to build AirspeedVelocity.jl on Julia 1.10-beta1 (macOS aarch64) but am encountering infinite build time. Earlier Julia versions do not have this issue.

Version of the package: Comonicon.jl 1.0.5

These are the dependencies of AirspeedVelocity 0.4.2:

BenchmarkTools = "1"
Chain = "0.4, 0.5"
Comonicon = "1"
FilePathsBase = "0.9"
JSON3 = "1"
OrderedCollections = "1"
PlotlyLight = "0.6"
PlotlyKaleido = "1"
julia = "1.7"

To Reproduce

julia +1.10 -e 'using Pkg; Pkg.add("AirspeedVelocity")'

Screenshots

Screenshot 2023-08-08 at 10 21 43

Here is the log:

[pid 61578] waiting for IO to finish:
 TYPE[FD/PID]       @UV_HANDLE_T->DATA
 pipe[16]           @0x112e2e790->0x104604bf0
 pipe[10]           @0x112e92600->0x104604c90
 process[61589]     @0x112e93dd0->0x11423eaf0

which repeats over time.

Roger-luo commented 1 year ago

Perhaps should also report in julia? Since it was working in previous versions. I'm not sure if this is something on Comonicon side but if it is I guess it might be the eval call in toplevel after generating all the commands

timholy commented 1 year ago

If you open an I/O resource during module definition, you now (Julia 1.10+) have to close it before exiting the module. Example of a fix:

I can't remember if the 2nd PR was needed to fix precompilation, but I don't think so. But digging turned up more evidence that we were doing a bad job of freeing resources, so responding to the changes in 1.10 made the package better.

MilesCranmer commented 1 year ago

Thanks, that is very helpful.

@Roger-luo I looked through AirspeedVelocity.jl and there aren't any I/O resources opened external to individual methods, so I think it might be in Comonicon.jl? Are there any that you know of?

MilesCranmer commented 1 year ago

Actually I guess since .comonicon_install() automatically runs precompile on the @main methods, it could just be on the methods that is precompiled and doesn't close a resource?

MilesCranmer commented 1 year ago

Okay, I found the issue by doing a binary search of commenting out code.

It turns out it's not Comonicon.jl at all, sorry @Roger-luo!

It turns out to be the fault of PlotlyKaleido.jl, which is a dependency. Apparently this package has some I/O resource hanging around. Somehow that package managed to precompile on its own, but using it as a dependency made precompilation hang...

The following code gives the exact same bug I was seeing:

module AirspeedVelocity

using PlotlyKaleido

end
MilesCranmer commented 1 year ago

For posterity:

I fixed it in the dependency with this: https://github.com/JuliaPlots/PlotlyKaleido.jl/pull/6

However there was really no easy way of finding this besides randomly turning on/off parts of code. The difficulty is compounded by the fact that only calling using PlotlyKaleido inside a package triggered the precompilation error, but neither (a) running that in the REPL, nor (b) precompiling PlotlyKaleido itself, was enough to trigger the error. It was only when it was used as a dependency.

In case others stumble upon this thread, what's the best way of tracking something like this down? Or maybe such a thing could be added to v1.10?

timholy commented 1 year ago

Yeah I do wish we had better tools than binary-search-commenting. Any ideas, @vtjnash?

vtjnash commented 1 year ago

I was expecting people would see the output "waiting for IO" and think "Huh, why is it waiting for process 61589 IO, \<checks htop>, which is the foobar process, \<checks grep>, and seems to be run from the FooBars package. Why was that running during precompile and not being cleaned up properly by FooBars? \<files issue with FooBars package to manage resources better>"

timholy commented 1 year ago

It's pretty easy to think of that as some internal token and not realize you might be able to get more information with OS tools like htop. I wonder if we should write up a brief devdocs on how you identify and isolate these? For example, I'd have no clue how one goes about doing similar diagnostics on Windows. Once the devdocs is in place, maybe we could link to it in the scary message that gets produced (the first time)?

FWIW, I didn't know about htop until just now. I suppose ps could be used similarly?

timholy commented 1 year ago

I'd be happy to help craft such devdocs, but I've not used those techniques myself. Do you happen to know of another package that needs this kind of fix? I could use it as a testcase.

vtjnash commented 1 year ago

People were recently complaining about PyPlot at JuliaCon: https://github.com/JuliaPy/PyPlot.jl/issues/572. Though finding the source of the bogus sleep call in that package might be a bit more of a challenge.

MilesCranmer commented 1 year ago

For that PyPlot.jl issue it looks like it's the same issue to PlotlyKaleido.jl of turning on a plotting backend at the __init__ step:

https://github.com/JuliaPy/PyPlot.jl/blob/343430ac4f822a84a5e19bf5dabf7a846b171c9b/src/init.jl#L171-L209

Seelengrab commented 1 year ago

I was expecting people would see the output "waiting for IO" and think "Huh, why is it waiting for process 61589 IO, , which is the foobar process, , and seems to be run from the FooBars package.

That's quite a lot of assumed knowledge about how this works - not very user friendly/actionable :frowning_face: