robotpy / mostrobotpy

Official Repository of python implementation of WPILib components
https://robotpy.github.io
Other
9 stars 10 forks source link

RobotPy startup on a RoboRIO is slow #33

Open virtuald opened 7 years ago

virtuald commented 7 years ago

On my RoboRIO with a 2016 image, I get the following (mostly consistent timing between runs, no DS connected):

roboRIO-2423-FRC:~$ time python3 -c 'import hal'

real    0m4.954s
user    0m1.280s
sys 0m0.170s
roboRIO-2423-FRC:~$ time python3 -c 'import wpilib'

real    0m6.984s
user    0m3.270s
sys 0m0.260s

HAL library loading may be a big chunk of this:

roboRIO-2423-FRC:~$ time python3 -c 'import ctypes; ctypes.CDLL("/usr/local/lib/python3.5/site-packages/hal_impl/libHALAthena.so")'

real    0m3.093s
user    0m0.470s
sys 0m0.070s

As a baseline:

roboRIO-2423-FRC:~$ time python3 -c 'pass'

real    0m0.210s
user    0m0.180s
sys 0m0.020s
roboRIO-2423-FRC:~$ time python3 -c 'import pip'

real    0m3.002s
user    0m2.760s
sys 0m0.220s

It would be good to reduce the startup time somehow.

PeterJohnson commented 7 years ago

Given how much time is spent importing the hal, I suspect a lot of time is spent in ctypes overhead. Creating a Python C module shim .so might speed this up considerably.

virtuald commented 7 years ago

If one were to go that route, I'd want to use cython to generate that I think.

auscompgeek commented 7 years ago

Is it possible that cffi would be be faster than ctypes for hal?

virtuald commented 7 years ago

Wouldn't cffi require compilation of some kind?

virtuald commented 7 years ago

The 2017 HAL + Python 3.6 is significantly faster:

admin@roboRIO-2423-FRC:~# time python3 -c 'import hal'

real    0m2.089s
user    0m0.940s
sys 0m0.170s
admin@roboRIO-2423-FRC:~# time python3 -c 'import wpilib'

real    0m4.119s
user    0m3.570s
sys 0m0.220s

But pip is bigger:

admin@roboRIO-2423-FRC:~# time python3 -c 'pass'

real    0m0.196s
user    0m0.190s
sys 0m0.000s
admin@roboRIO-2423-FRC:~# time python3 -c 'import pip'

real    0m4.159s
user    0m3.900s
sys 0m0.240s
virtuald commented 6 years ago

Well. HAL seems to be faster this year. But wpilib is about the same... I imagine it has to be due to all of the imports we're doing. We could probably do better without importing everything... maybe some kind of lazy importing?

admin@roboRIO-2423-FRC:~# time python3 -c 'import hal'

real    0m1.202s
user    0m0.860s
sys 0m0.310s
admin@roboRIO-2423-FRC:~# time python3 -c 'import wpilib'

real    0m4.028s
user    0m3.600s
sys 0m0.390s

Also, I pushed a compiled version of cffi to the -dev repo, if someone wants to experiment with that.

auscompgeek commented 6 years ago

I'd be interested to see exactly how much of the HAL is actually just ctypes overhead and how much is initialising the HAL.

I might try my hand on lazy-loading wpilib.

auscompgeek commented 5 years ago

Update: I gave up on lazy-loading wpilib in robotpy/robotpy-wpilib#468. If anyone else has good ideas for that…

I've had some luck using cffi for HAL bindings after running HAL.h through the C preprocessor. Someone should actually do it and benchmark it.

auscompgeek commented 5 years ago

Faster this year still:

lvuser@roboRIO-4774-FRC:~$ time python3 -c 'import hal'

real    0m0.953s
user    0m0.800s
sys     0m0.140s
lvuser@roboRIO-4774-FRC:~$ time python3 -c 'import wpilib'

real    0m3.016s
user    0m2.760s
sys     0m0.230s
lvuser@roboRIO-4774-FRC:~$ time python3 -c 'pass'

real    0m0.154s
user    0m0.120s
sys     0m0.030s
lvuser@roboRIO-4774-FRC:~$ time python3 -m pip > /dev/null

real    0m3.334s
user    0m3.090s
sys     0m0.220s

But no pyc files are compiled with -O, so we get this:

lvuser@roboRIO-4774-FRC:~$ time python3 -O -c 'import wpilib'

real    0m4.261s
user    0m3.920s
sys     0m0.310s
virtuald commented 4 years ago

Here's HAL for this year, looks like it's a bit slower:

time python3 -c 'import hal'

real    0m0.978s
user    0m0.637s
sys     0m0.259s
virtuald commented 4 years ago

Full results:

lvuser@roboRIO-2423-FRC:~$ time python3 -c 'import hal'

real    0m0.981s
user    0m0.637s
sys     0m0.262s
lvuser@roboRIO-2423-FRC:~$ time python3 -c 'import wpilib'

real    0m2.606s
user    0m2.213s
sys     0m0.504s
lvuser@roboRIO-2423-FRC:~$ time python3 -c 'pass'

real    0m0.154s
user    0m0.121s
sys     0m0.032s
lvuser@roboRIO-2423-FRC:~$ time python3 -m pip > /dev/null

real    0m3.927s
user    0m3.625s
sys     0m0.281s
auscompgeek commented 4 years ago

It has occurred to me that a large chunk of the import time is probably pybind11 creating docstrings. Would probably be good if that could get deferred until they're actually accessed.

(On that note, do we actually need to have pybind11 bindings to the HAL functions?)

virtuald commented 3 years ago

Looks like we're in the same ballpark this year:

lvuser@roboRIO-2423-FRC:~$ time python3 -c 'import hal'

real    0m1.041s
user    0m0.708s
sys 0m0.227s
lvuser@roboRIO-2423-FRC:~$ time python3 -c 'import wpilib'

real    0m2.874s
user    0m2.482s
sys 0m0.531s
lvuser@roboRIO-2423-FRC:~$ time python3 -c 'pass'

real    0m0.160s
user    0m0.148s
sys 0m0.011s
lvuser@roboRIO-2423-FRC:~$ time python3 -m pip > /dev/null

real    0m2.115s
user    0m1.953s
sys 0m0.150s
auscompgeek commented 6 months ago

It has occurred to me that a large chunk of the import time is probably pybind11 creating docstrings. Would probably be good if that could get deferred until they're actually accessed.

It turns out that nanobind defers docstring rendering, where pybind11 doesn't: https://nanobind.readthedocs.io/en/latest/why.html#major-additions

Unfortunately I think we're stuck with pybind11 whilst we still need multiple inheritance: https://nanobind.readthedocs.io/en/latest/porting.html#removed-features