Closed by MilesCranmer 2 years ago
Another thing to try would be PyJulia. Much of the Julia backend would then stay cached, as long as Python stays open between commands.
The main reason I haven't used PyJulia so far is the installation issues I've personally experienced (which many users who have never used Julia would likely run into as well). Another reason is that I'm not sure how it would handle distributed computing, where it seems better to launch Julia from the command line normally (which is how PySR works).
Update: this seems to be working. I have the current draft version in the pyjulia branch. This should give a faster startup time on the second call, since the Julia backend no longer needs to be recompiled on every launch; it will just reuse the cached SymbolicRegression.jl from the previous pysr call.
Edit: confirmed that there is a much faster startup time.
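To illustrate why keeping one process alive helps, here is a minimal sketch that does not use Julia or PyJulia at all. The function `load_backend` is a hypothetical stand-in for starting Julia and compiling SymbolicRegression.jl, and the 0.2 second sleep is an arbitrary simulated cost, not a measured one. It only shows the general pattern: the expensive setup runs once per process and later calls reuse it.

```python
import time
from functools import lru_cache


@lru_cache(maxsize=None)
def load_backend():
    """Hypothetical stand-in for the one-time Julia startup and compilation."""
    time.sleep(0.2)  # simulated compilation cost (arbitrary value)
    return {"name": "backend", "ready": True}


def timed_call():
    """Fetch the backend and report how long the fetch took."""
    start = time.perf_counter()
    backend = load_backend()
    return backend, time.perf_counter() - start


# First call pays the setup cost; the second reuses the cached object.
first_backend, first_time = timed_call()
second_backend, second_time = timed_call()
print(f"first call: {first_time:.3f}s, second call: {second_time:.3f}s")
```

The cache lives only as long as the Python process does, which matches the caveat above: the speedup applies to repeat calls within one open session, not to a fresh launch.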
PyJulia is working extremely well, even with distributed computing(!). Although PyJulia doesn't officially support this, it seems to work because the backend handles all distributed processes internally.
I will likely switch the entirety of PySR to PyJulia in a future version. In addition to the reduced startup time for repeated searches, another major advantage is that I can finally have state-saving abilities and store the equations directly in a Python object rather than in a CSV file.
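As a rough idea of what keeping equations in a Python object could look like, here is a minimal sketch. The names `Equation` and `SearchState` and their fields are hypothetical, chosen for illustration; they are not PySR's actual data structures. The sketch assumes each equation has an expression string, a complexity, and a loss, and that a later search should be merged into the earlier results rather than overwrite them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Equation:
    """Hypothetical record for one discovered equation."""
    expression: str
    complexity: int
    loss: float


@dataclass
class SearchState:
    """Hypothetical in-memory state carried between searches."""
    equations: List[Equation] = field(default_factory=list)

    def update(self, new_equations):
        """Merge new results, keeping the lowest-loss equation per complexity."""
        best = {eq.complexity: eq for eq in self.equations}
        for eq in new_equations:
            current = best.get(eq.complexity)
            if current is None or eq.loss < current.loss:
                best[eq.complexity] = eq
        self.equations = sorted(best.values(), key=lambda eq: eq.complexity)

    def best(self):
        """Return the equation with the lowest loss overall."""
        return min(self.equations, key=lambda eq: eq.loss)


state = SearchState()
# Results of a first search.
state.update([Equation("x0", 1, 2.5), Equation("x0 + x1", 3, 0.9)])
# A second search continues from the saved state instead of starting over.
state.update([Equation("x0 * x1", 3, 0.4), Equation("cos(x0) + x1", 4, 0.1)])
print(state.best().expression)
```

Because the state is an ordinary object, a second search can start from it directly, with no file to re-read or parse in between.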
This is fixed in v0.7+.
It would be great to precompile some parts of SymbolicRegression.jl. I think this could reduce the startup time of PySR quite significantly.
Tutorial: https://julialang.org/blog/2021/01/precompile_tutorial/