JuliaPy / PythonCall.jl

Python and Julia in harmony.
https://juliapy.github.io/PythonCall.jl/stable/
MIT License

Precompile statements (and error) #13

Closed PallHaraldsson closed 3 years ago

PallHaraldsson commented 3 years ago

Hi, I CAN install your package on a clean install (not otherwise, see below), but as you know startup is rather slow.

A. I added precompile statements and got startup down to the following (still slower than I would like):

$ ~/julia-1.6.0-rc2/bin/julia -O0 --compile=min --startup-file=no -q
julia> @time using PythonCall
  2.384556 seconds (2.29 M allocations: 143.823 MiB, 1.21% gc time)

Still rather bad:
$ ~/julia-1.6.0-rc2/bin/julia -O0  --startup-file=no
julia> @time using PythonCall
 17.399345 seconds (44.16 M allocations: 2.541 GiB, 6.10% gc time, 0.02% compilation time)

and even worse:
$ ~/julia-1.6.0-rc2/bin/julia --startup-file=no -q
julia> @time using PythonCall
 26.929030 seconds (44.16 M allocations: 2.541 GiB, 3.96% gc time, 0.02% compilation time)

compared to your package as is:

julia> @time using PythonCall
 31.716683 seconds (55.21 M allocations: 3.177 GiB, 5.03% gc time, 0.02% compilation time)

$ ~/julia-1.6.0-rc2/bin/julia -O0 --compile=min --startup-file=no -q
julia> @time using PythonCall
  6.888381 seconds (19.04 M allocations: 1.109 GiB, 9.42% gc time)

I used a naive approach rather than SnoopCompile.jl, which might do better. I'm also wondering whether you need to do this much at init. Could some of the work be postponed and done lazily later?
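To make the "do it lazily" suggestion concrete, here is a minimal sketch of deferring expensive setup from `__init__` to first use. Everything in it (`LazyInitSketch`, `CONVERSION_TABLE`, `build_table`, `conversion_table`) is a hypothetical name for illustration and is not taken from PythonCall; the real init work is not shown in this thread.

```julia
# Hypothetical module illustrating deferred initialisation; none of these
# names come from PythonCall.
module LazyInitSketch

# Cache for an expensive-to-build table; stays `nothing` until first use.
const CONVERSION_TABLE = Ref{Union{Nothing,Dict{Symbol,Int}}}(nothing)

# Counts how often the expensive setup ran (for demonstration only).
const BUILD_COUNT = Ref(0)

# Stand-in for costly setup work that could otherwise sit in `__init__`.
function build_table()
    BUILD_COUNT[] += 1
    return Dict(:int => 1, :float => 2, :str => 3)
end

# Build the table on first access, then reuse the cached value.
function conversion_table()
    table = CONVERSION_TABLE[]
    if table === nothing
        table = build_table()
        CONVERSION_TABLE[] = table
    end
    return table
end

# `__init__` stays cheap: it only resets the cache and does no real work.
function __init__()
    CONVERSION_TABLE[] = nothing
end

end # module

LazyInitSketch.conversion_table()[:float]   # first call builds the table
LazyInitSketch.conversion_table()[:str]     # second call reuses the cache
```

The cost moves from `using` to the first call that needs the table, so it only pays off if not every session needs it. This sketch is also not thread-safe; a lock would be needed if several tasks could hit the first call at once.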

B. The above worked with the new Julia 1.6.0-rc2, but with my original .julia folder I get (on that version and later, including 1.5.3):

(@v1.6) pkg> add PythonCall.jl
    Updating registry at `~/.julia/registries/General`
ERROR: refusing to add package `PythonCall [6099a3de]`: package `PythonCall = "6099a3de-0909-46bc-b1f4-468b9a2dfc0d"` already exists as a direct dependency

Do you have any idea what it means (this is before my work with precompile statements)?

I thought you would want to know, since your users might run into this, and I have no idea where to begin debugging it. It may not strictly be about your package (or is it?), only an interaction with my (possibly broken) setup:

(@v1.6) pkg> st
      Status `~/.julia/environments/v1.6/Project.toml`
  [7d9fca2a] Arpack v0.5.1
  [1dc0ca97] ArrayTools v0.1.0 `https://github.com/emmt/ArrayTools.jl#master`
  [77e5a97a] AsyPlots v0.2.1
  [c52e3926] Atom v0.12.30
  [b6338580] BPFnative v0.1.0 `https://github.com/jpsamaroo/BPFnative.jl.git#master`
  [6e4b80f9] BenchmarkTools v0.5.0
  [7cffe744] BetterExp v0.1.0 `https://github.com/oscardssmith/BetterExp.jl#master`
  [7f725544] BinaryBuilderBase v0.5.0 `https://github.com/JuliaPackaging/BinaryBuilderBase.jl.git#master`
  [ad839575] Blink v0.12.4
  [336ed68f] CSV v0.8.3
  [159f3aea] Cairo v1.0.5
  [49dc2e85] Calculus v0.5.1
  [710b7bb7] ClearStacktrace v0.2.2
  [aaaa29a8] Clustering v0.14.2
  [da1fd8a2] CodeTracking v1.0.5
  [944b1d66] CodecZlib v0.7.0
  [35d6a980] ColorSchemes v3.10.2
  [5ae59095] Colors v0.12.6
  [8f4d0f93] Conda v1.5.0
  [1b08a953] Dash v0.1.3
  [03207cf0] DashBase v0.1.0 `https://github.com/plotly/DashBase.jl.git#master`
  [1b08a953] DashCoreComponents v1.12.0
  [1b08a953] DashHtmlComponents v1.1.1
  [1b08a953] DashTable v4.10.1
  [9024f26f] Dashboards v0.2.8
  [a93c6f00] DataFrames v0.22.5 `https://github.com/JuliaData/DataFrames.jl.git#main`
  [1313f7d8] DataFramesMeta v0.6.0
  [864edb3b] DataStructures v0.17.20
  [55939f99] DecFP v1.1.0
  [85a47980] Dictionaries v0.3.7
  [a1bb12fb] Electron v3.1.1
  [d872a56f] ElectronDisplay v1.0.1-DEV `https://github.com/queryverse/ElectronDisplay.jl.git#master`
  [c04bee98] ExcelReaders v0.11.0
  [53c48c17] FixedPointNumbers v0.8.4
  [186bb1d3] Fontconfig v0.4.0
  [5752ebe1] GMT v0.29.0
  [28b8d3ca] GR v0.55.0
  [c91e804a] Gadfly v1.3.1
  [4b11ee91] Gaston v1.0.2 `https://github.com/mbaz/Gaston.jl.git#master`
  [c43c736e] Genie v1.15.1
  [bc5e4493] GitHub v5.4.0
  [dc211083] Gnuplot v1.3.0
  [4c0ca9eb] Gtk v1.1.6
  [cd3eb016] HTTP v0.8.19
  [60101457] ICOADSDict v0.1.0
  [7073ff75] IJulia v1.23.2
  [e8efc688] ImPlot v0.1.1
  [5903a43b] Infiltrator v0.3.0
  [1c8ee90f] IterableTables v1.0.0
  [82899510] IteratorInterfaceExtensions v1.0.0
  [4138dd39] JLD v0.12.1
  [682c06a0] JSON v0.21.1
  [2535ab7d] JSON2 v0.3.2
  [0f8b85d8] JSON3 v1.7.0
  [aa1ae85d] JuliaInterpreter v0.8.9
  [75827d1f] JuliaTutor v0.1.0 `https://github.com/caseykneale/JuliaTutor.jl#master`
  [e5e0dc1b] Juno v0.8.4
  [fc18253b] LazyJSON v0.2.2
  [194296ae] LibPQ v1.6.2
  [23992714] MAT v0.10.1
  [10e44e05] MATLAB v0.8.0
  [1914dd2f] MacroTools v0.5.6
  [739be429] MbedTLS v1.0.3
  [c03570c3] Memoize v0.4.4
  [13650a8a] NicePipes v0.1.3
  [47be7bcc] ORCA v0.5.0
  [5fb14364] OhMyREPL v0.5.8 `~/.julia/dev/OhMyREPL`
  [3ad7be9e] OpenCV v0.1.1 `https://github.com/TakekazuKATO/OpenCV.jl#master`
  [bac558e1] OrderedCollections v1.4.0
  [ace2c81b] PETSc v0.1.0 `https://github.com/JuliaParallel/PETSc.jl#master`
  [06a5ddd6] PGPlot v0.1.0 `https://github.com/emmt/PGPlot.jl#master`
  [9b87118b] PackageCompiler v1.2.5
  [eadc2687] Pandas v1.4.0 `https://github.com/PallHaraldsson/Pandas.jl#master`
  [f5117550] PandasLite v0.1.7
  [69de0a69] Parsers v1.0.15
  [32113eaa] PkgBenchmark v0.2.10
  [f3e62ec7] PkgCleanup v0.1.0 `https://github.com/giordano/PkgCleanup.jl#main`
  [58dd65bb] Plotly v0.3.0
  [a03496cd] PlotlyBase v0.3.1
  [f0f68f2c] PlotlyJS v0.13.1
  [91a5bcdd] Plots v1.10.6
  [8e5c59b7] PlusPlus v0.1.0
  [c3e4b0f8] Pluto v0.12.21
  [c46f51b8] ProfileView v0.6.9
  [438e738f] PyCall v1.92.2 `https://github.com/JuliaPy/PyCall.jl.git#master`
  [d330b81b] PyPlot v2.9.0 `~/.julia/dev/PyPlot`
  [6099a3de] Python v0.1.0 `https://github.com/cjdoris/Python.jl#master`
  [ce6b1742] RDatasets v0.7.4
  [295af30f] Revise v3.1.12
  [22415677] RobotOS v0.7.2
  [5e6a0ad9] SafeREPL v0.0.1 `https://github.com/rfourquet/SafeREPL.jl#master`
  [88634af6] SaferIntegers v2.5.1
  [1277b4bf] ShiftedArrays v1.0.0
  [01919df6] SimpleTypePrint v0.1.2
  [aa65fe97] SnoopCompile v2.5.2
  [90137ffa] StaticArrays v0.12.5
  [4acbeb90] Stipple v0.6.0
  [30ddb3f0] StippleCharts v0.4.1
  [ec984513] StipplePlotly v0.1.0 `https://github.com/GenieFramework/StipplePlotly.jl#main`
  [a3c5d34a] StippleUI v0.2.4 `https://github.com/GenieFramework/StippleUI.jl.git#master`
  [88034a9c] StringDistances v0.8.0
  [69024149] StringEncodings v0.3.3
  [b9c42661] SwapLiterals v0.1.0 `https://github.com/rfourquet/SafeREPL.jl:SwapLiterals#master`
  [382cd787] TableTraitsUtils v1.0.1
  [05f542c5] TestGtkMegaJLL v0.1.0 `~/.julia/dev/TestGtkMegaJLL`
  [db00978d] TightBinding v0.1.3
  [37f6aa50] TikzPictures v3.3.1 `https://github.com/JuliaTeX/TikzPictures.jl.git#master`
  [30578b45] URIParser v0.4.1
  [112f6efa] VegaLite v2.4.1
  [81def892] VersionParsing v1.2.0
  [5e69872b] Xcb v0.2.0 `https://gitlab.com/poposca/Xcb.jl.git#master`
  [77ec8976] GTK3_jll v3.24.11+5
  [2e76f6c2] HarfBuzz_jll v2.6.1+10
  [9c32591e] Poppler_jll v0.87.0+2
  [ab825dc5] SDL2_jll v2.0.12+2
  [502467ad] UBPF_jll v0.0.1+0
  [3f19e933] p7zip_jll

cjdoris commented 3 years ago

You still have Python.jl installed from before the name change. It has the same UUID, so I imagine that is the problem. I think the error message has a bug though, in that Julia should report the old name of the package.
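One way to check for this kind of clash is to look up which name an environment's Project.toml already uses for a given UUID. The sketch below assumes the `TOML` standard library (available from Julia 1.6) and a cut-down, partly made-up Project.toml. `names_for_uuid` is a hypothetical helper, and this is not a description of how Pkg performs its own check.

```julia
using TOML  # standard library from Julia 1.6 onwards

# Cut-down stand-in for ~/.julia/environments/v1.6/Project.toml.
# The `Python` entry uses the UUID quoted in the error message above;
# `SomeOtherPkg` and its UUID are made up for illustration.
project_toml = """
[deps]
Python = "6099a3de-0909-46bc-b1f4-468b9a2dfc0d"
SomeOtherPkg = "00000000-0000-0000-0000-000000000001"
"""

# Return the names of all direct dependencies registered under `uuid`.
function names_for_uuid(project::AbstractDict, uuid::AbstractString)
    deps = get(project, "deps", Dict{String,Any}())
    return sort([name for (name, id) in deps if id == uuid])
end

project = TOML.parse(project_toml)
clashing = names_for_uuid(project, "6099a3de-0909-46bc-b1f4-468b9a2dfc0d")
println(clashing)
```

If the lookup returns the old name, that entry is presumably what `add PythonCall.jl` is colliding with.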

cjdoris commented 3 years ago

Thanks for looking at startup time. It's definitely something I want to improve. I believe the main contributors are (a) time spent compiling methods for the juliacall.*Value types, and (b) time spent compiling conversion rules. Both involve compiling C functions, which is quite slow in Julia. Both are compiled when first used, but the init function uses some of them. The ones in (a) could be lazier: currently we compile all methods for a type the first time the type is used, and it might be possible to compile each method on first use instead. In (b) I use C functions as an optimisation, which speeds up conversion but has a JIT cost. This could be reverted, or perhaps JITed in the background. It would be good to take a proper look with SnoopCompile, but I don't have time right now.
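As a rough illustration of the "compile each method on first use" idea in (a), the sketch below builds each method only when it is first requested instead of building all of them when the type is first touched. `METHOD_BUILDERS`, `BUILT`, and `get_method` are hypothetical names, and the builders are cheap stand-ins for the expensive C-function compilation described above.

```julia
# Hypothetical sketch; these names are illustrative, not part of PythonCall.

# Each entry builds one "method" of a wrapper type. In the real package this
# step would be the expensive creation of a C-callable function.
const METHOD_BUILDERS = Dict{Symbol,Function}(
    :len  => (() -> (x -> length(x))),
    :repr => (() -> (x -> repr(x))),
    :hash => (() -> (x -> hash(x))),
)

# Methods that have actually been built so far.
const BUILT = Dict{Symbol,Function}()

# Build a method the first time it is requested, then reuse it.
function get_method(name::Symbol)
    return get!(BUILT, name) do
        METHOD_BUILDERS[name]()
    end
end

n = get_method(:len)([1, 2, 3])   # builds only :len
built_so_far = length(BUILT)      # :repr and :hash are still unbuilt
```

The trade-off is that the first call to each method pays its own build cost, rather than the whole type paying once up front.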

cjdoris commented 3 years ago

As well as precompile, nospecialize and optlevel may be useful.
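For readers unfamiliar with these three tools, here is a small sketch of what each looks like in code. It assumes "nospecialize" and "optlevel" refer to the `@nospecialize` and `Base.Experimental.@optlevel` macros; the module and function names (`StartupSketch`, `describe`, `add_one`) are hypothetical, and how much any of this would help PythonCall is not established here.

```julia
module StartupSketch

# Lower the optimisation level for code in this module, trading some runtime
# speed for less compile time.
Base.Experimental.@optlevel 1

# `@nospecialize` asks the compiler not to generate a separate specialisation
# for every concrete type passed as `x`.
function describe(@nospecialize(x))
    return string("value of type ", typeof(x))
end

# Hypothetical entry point with a concrete signature.
add_one(x::Int) = x + 1

# Request compilation for specific signatures ahead of the first call. In a
# package, such statements run during precompilation so the results can be
# cached.
precompile(add_one, (Int,))
precompile(describe, (Any,))

end # module

StartupSketch.describe(1.5)
StartupSketch.add_one(41)
```

`@nospecialize` and a lower optimisation level reduce how much code gets compiled or how hard the compiler works on it, while `precompile` shifts some of the remaining work to package precompilation time.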

cjdoris commented 3 years ago

Just committed some time-savers:

cjdoris commented 3 years ago

Just saved another 4-5 seconds by precompiling the init function! Now ~7.5 seconds.

SnoopCompile suggests we can get another 2 seconds by precompiling some things to do with the juliacall.*Value types.

cjdoris commented 3 years ago

Closing this as it's about a previous incarnation of the package.