marcoskirsch opened this issue 4 years ago
The most likely reason I see for why niswitch and nifgen are so slow compared to the rest is that they open simulated sessions to devices whose driver runtime uses the DAQmx framework under the hood.
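For reference, session-based simulation in these system tests looks roughly like the sketch below. The resource name, topology string, and channel names are illustrative assumptions rather than copied from the repo; the point is that even a simulated session makes the driver runtime create a device in the DAQmx configuration database, which is the expensive part.

```python
import niswitch

# Open a simulated session: no hardware is present, but the driver runtime
# still registers a simulated device with the underlying DAQmx framework.
# Topology and channel names below are illustrative only.
with niswitch.Session(resource_name='', topology='2737/2-Wire 4x64 Matrix',
                      simulate=True, reset_device=False) as session:
    session.connect(channel1='c0', channel2='r0')
    session.wait_for_debounce()
    session.disconnect_all()
```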
This is a problem: system tests for each job run on a separate agent, but we are running tests for 5 versions of Python in parallel. That would be a lot of multi-second tests getting serialized!
The mechanism by which tests that use session-based simulation of DAQmx-framework devices are serialized is a global lock we call daqmx_sim_db_lock. Just grep for that keyword. It is only used in src/nifgen/system_tests/test_system_nifgen.py and src/niswitch/system_tests/test_system_niswitch.py.
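As a rough sketch of what that serialization looks like (assuming an interprocess file lock via the fasteners package; the exact implementation in the repo may differ):

```python
import os
import tempfile

import fasteners

# One lock file shared by every test process on the machine, so only one
# process at a time touches the simulated DAQmx device database.
daqmx_sim_db_lock_file = os.path.join(tempfile.gettempdir(), 'daqmx_sim_db.lock')
daqmx_sim_db_lock = fasteners.InterProcessLock(daqmx_sim_db_lock_file)


def test_simulated_device_scenario():
    with daqmx_sim_db_lock:
        # Today the entire test body, not just session open/close, runs
        # while holding the lock.
        ...
```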
So a potential way to make these tests faster is:
I looked at the nimi-bot logs from PR #1268. For example, in job win32/job/niswitch/274/console there are some obvious offenders:
========================== slowest 5 test durations ===========================
223.97s call src/niswitch/system_tests/test_system_niswitch.py::test_continuous_software_scanning
153.20s call src/niswitch/system_tests/test_system_niswitch.py::test_enum_attribute
0.12s setup src/niswitch/system_tests/test_system_niswitch.py::test_relayclose
0.08s call src/niswitch/system_tests/test_system_niswitch.py::test_write_only_attribute
0.07s call src/niswitch/system_tests/test_system_niswitch.py::test_error_message
As expected, those are tests that use session-based simulation of a device that uses the DAQmx driver framework. But they really should not take this long. One mistake is that the global lock we introduced is only needed during Session creation and destruction / close, yet we hold on to it for the duration of the test.
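A minimal sketch of that fix, assuming the interprocess lock from above and a hypothetical fixture name (the topology string is again illustrative): hold the lock only around Session creation and close, and let the test body run unserialized.

```python
import niswitch
import pytest


@pytest.fixture
def simulated_niswitch_session():
    # Serialize only the parts that touch the simulated DAQmx database.
    with daqmx_sim_db_lock:
        session = niswitch.Session('', '2737/2-Wire 4x64 Matrix', True, False)
    try:
        yield session  # the test body runs without holding the lock
    finally:
        with daqmx_sim_db_lock:
            session.close()
```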
We should have a goal that no test takes over 1 second.
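One way to enforce a budget like that (an idea, not something the repo does today) would be the pytest-timeout plugin, with a small default timeout in the pytest configuration and explicit opt-outs for the few tests that legitimately need more time:

```python
import pytest

# Assumes the pytest-timeout plugin is installed and a default budget is
# configured, e.g. "timeout = 1" in pytest.ini. Any test exceeding the
# budget then fails instead of silently dragging out the run.
@pytest.mark.timeout(10)  # explicit exception for a known-slow scenario
def test_known_slow_scenario():
    ...
```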
The intermittent failure due to simulated device creation race conditions is being tracked internally as NI Bug #AB1235678. Fixing that should also improve the times of the tests that use session-based simulation of DAQmx-framework devices.
The following data is based on the latest runs using the new nimi-bot system (AWS-hosted VMs + GitHub Actions). The worst offenders are:
These are the ones we should focus on optimizing, because they are the long poles at the moment.
nimi-bot system tests now run in parallel, with a job for each bitness (win32 vs win64) and for each module (nidmm, niscope, nidcpower, nise, etc.)
I noticed some jobs consistently take a very long time compared to the rest. Take the system tests run as part of #1258:
By far the worst offender is niswitch, which takes over twice as long as the rest of them. We really need to look into what's causing it and fix it.