henry2004y / Vlasiator.jl

Data processor for Vlasiator
https://henry2004y.github.io/Vlasiator.jl/stable/
MIT License
Competitor for PyCall and PyPlot #69

Closed henry2004y closed 1 year ago

henry2004y commented 2 years ago

There are some new packages built by Christopher Rowley: PythonCall.jl, CondaPkg.jl, and JuliaCall. If things go smoothly, we may consider shifting to the new APIs.

henry2004y commented 2 years ago

See also #71

henry2004y commented 2 years ago

I now tend to think that calling Matplotlib through PythonCall may not be a good option for us. By default, PythonCall installs a new Conda environment, which is not what we want in most cases, and working out the proper installation settings is a bit tedious.

However, a Python interface to Vlasiator.jl would definitely be a nice complement to the existing Julia functionality. As I mentioned in #71, it may be better to create a new Python library for that, which would be easier to maintain. A starting point for that attempt can be found here.
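
For reference, a minimal sketch of the installation settings in question, assuming Python and Matplotlib are already installed on the system. The two environment variables are the ones documented by CondaPkg.jl and PythonCall.jl; the interpreter path is a placeholder, not something taken from this repository.

```julia
# Both variables must be set before PythonCall is loaded in the session.
# "Null" tells CondaPkg not to install or manage any Conda environment.
ENV["JULIA_CONDAPKG_BACKEND"] = "Null"
# Placeholder path (assumption): point this at the Python you already use.
ENV["JULIA_PYTHONCALL_EXE"] = "/usr/bin/python3"

using PythonCall
# Imports the Matplotlib installed for that Python; nothing new is downloaded.
plt = pyimport("matplotlib.pyplot")
```

With these set, responsibility for having Matplotlib installed moves to the user, which is the trade-off discussed above.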

henry2004y commented 2 years ago

Steve has recently been working on a drop-in replacement for PyPlot.jl: PythonPlot.jl. It shares almost the same API as its predecessor, but performs fewer type conversions and copies in the background. Once this new package is registered, we can switch.

Actually, with PythonPlot.jl I have already built a way to use Matplotlib directly. There is one issue with this approach: interactive plotting requires a properly set up event loop, as can be seen in pygui.jl. Long story short, the following code won't work

if Base.isinteractive()
   plt.ion()
else
   plt.ioff()
end

unless you also create an event loop for your target backend, like PyPlot/PythonPlot does.


If we use CondaPkg.jl, a .CondaPkg folder is created on first use in a new Julia environment, for example when we develop the package locally using activate. I guess this step is skipped if the user already has CondaPkg installed?

henry2004y commented 2 years ago

I had some issues passing Julia objects to Python before, and I ended up materializing full intermediate vectors and arrays to make the APIs happy. Maybe with the aid of PythonCall.jl I can save some time and memory as well!
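
A small sketch of what that could look like, assuming PythonCall.jl and NumPy are both available. As I understand the PythonCall documentation, a Julia array passed to Python is wrapped rather than copied, so `np.asarray` returns a view of the same memory, and `PyArray` gives a Julia view of a NumPy array. The variable names are for illustration only.

```julia
using PythonCall

np = pyimport("numpy")   # assumes NumPy is installed for the Python in use

x = zeros(Float64, 5)    # plain Julia vector
a = np.asarray(x)        # NumPy array that views x's memory instead of copying it
a[0] = 42.0              # write through the Python side (0-based index)
@assert x[1] == 42.0     # the change is visible in the Julia vector

y = PyArray(a)           # Julia AbstractArray view of the NumPy array
y[2] = 7.0               # write through the Julia view (1-based index)
@assert x[2] == 7.0      # still the same underlying memory
```

If this holds for the arrays Vlasiator.jl passes to Matplotlib, the intermediate copies mentioned above should no longer be necessary.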


This page describes the common issues of using Matplotlib via PythonCall.


The time-to-first-plot issue is more severe with PythonPlot and, somewhat surprisingly, it also seems to use more memory.

# This is required; otherwise PythonPlot will by default use CondaPkg to install a standalone Matplotlib.
ENV["JULIA_PYTHONCALL_EXE"] = "@PyCall"
using PythonPlot
#using PyPlot

x = range(0; stop=2*pi, length=1000); y = sin.(3 * x + 4 * cos.(2 * x));
plot(x, y, color="red", linewidth=2.0, linestyle="--")
title("A sinusoidally modulated sinusoid")
## PythonPlot
# 1st time
11.781756 seconds (17.62 M allocations: 970.371 MiB, 3.82% gc time, 83.03% compilation time: 47% of which was recompilation)
# 3rd time
0.052265 seconds (919 allocations: 315.742 KiB)
julia> @time plot(x, y, color="red", linewidth=2.0, linestyle="--")
0.048247 seconds (425 allocations: 254.438 KiB)

## PyPlot
# 1st time
3.968411 seconds (6.85 M allocations: 407.435 MiB, 6.17% gc time, 68.61% compilation time)
# 3rd time
0.049111 seconds (1.55 k allocations: 79.117 KiB)
julia> @time plot(x, y, color="red", linewidth=2.0, linestyle="--")
0.042623 seconds (1.09 k allocations: 19.508 KiB)

For the pythonplot branch,

# pythonplot
julia> @time using PythonPlot
 10.042735 seconds (14.86 M allocations: 797.975 MiB, 4.17% gc time, 83.14% compilation time: 52% of which was recompilation)

julia> @time pcolormesh(meta, "proton/vg_rho")
  2.309976 seconds (5.47 M allocations: 278.132 MiB, 4.24% gc time, 88.39% compilation time: 13% of which was recompilation)
Python QuadMesh: <matplotlib.collections.QuadMesh object at 0x7f75699ee390>

julia> @time pcolormesh(meta, "proton/vg_rho")
  0.082580 seconds (710 allocations: 301.258 KiB)

# pyplot
julia> @time using PyPlot
  5.146423 seconds (7.25 M allocations: 391.121 MiB, 6.19% gc time, 76.74% compilation time: 60% of which was recompilation)

julia> @time pcolormesh(meta, "proton/vg_rho")
  3.162250 seconds (6.92 M allocations: 354.662 MiB, 3.35% gc time, 91.74% compilation time: 2% of which was recompilation)
PyObject <matplotlib.collections.QuadMesh object at 0x7f42226d3908>

julia> @time pcolormesh(meta, "proton/vg_rho")
  0.079585 seconds (901 allocations: 159.055 KiB)
PyObject <matplotlib.collections.QuadMesh object at 0x7f4222145198>

So currently (PythonCall v0.9), PythonPlot takes about 5 s longer than PyPlot to load, but its first pcolormesh call is about 1 s faster.
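
The arithmetic behind that summary, as a short sketch that uses only the timings reported above (the variable names are mine):

```julia
# Timings in seconds, copied from the @time output above.
pythonplot = (load = 10.042735, first_plot = 2.309976)
pyplot     = (load = 5.146423,  first_plot = 3.162250)

# Extra package load time for PythonPlot.
load_diff = pythonplot.load - pyplot.load
# Time saved by PythonPlot on the first pcolormesh call.
plot_diff = pyplot.first_plot - pythonplot.first_plot
# Net extra time from a fresh session to the first plot.
total_diff = load_diff - plot_diff

println("load:       +", round(load_diff; digits=2), " s")
println("first plot: -", round(plot_diff; digits=2), " s")
println("net:        +", round(total_diff; digits=2), " s")
```

On these numbers the load penalty is about 4.9 s and the first-plot gain about 0.85 s, so PythonPlot is still roughly 4 s slower from a fresh session to the first plot.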