pytest-dev / pytest-xdist

pytest plugin for distributed testing and loop-on-failures testing modes.
https://pytest-xdist.readthedocs.io
MIT License

change_sys_path breaking virtualenv #1049

Open yurikhan opened 5 months ago

yurikhan commented 5 months ago

Hello there.

I’m migrating a project that has been neglected for a while from pytest-xdist 1.30.0 to something less ancient, and I am seeing a regression.

The project has a suite of tests that take quite a while to run. So we spawn a number of LXD containers, set up SSH on them, and use pytest-xdist to run tests in parallel on the lot of them:

lxc exec myproj.lxd -- py.test […] \
    --tx popen//python=/usr/lib/myproj/venv/bin/python \
    --tx ssh=myproj-0.lxd//python=/usr/lib/myproj/venv/bin/python \
    --tx ssh=myproj-1.lxd//python=/usr/lib/myproj/venv/bin/python \
    --tx ssh=myproj-2.lxd//python=/usr/lib/myproj/venv/bin/python \
    --tx ssh=myproj-3.lxd//python=/usr/lib/myproj/venv/bin/python \
    --tx ssh=myproj-4.lxd//python=/usr/lib/myproj/venv/bin/python \
    […]

The code under test is installed in a virtualenv and we specify the full path to the virtualenv’s python executable. Issue #300 says this is expected to work.

The observed behavior is that, during the test collection stage, the gw0 node fails to import modules installed in the venv, so its collection results differ from those of every other node. A debug statement, import sys; print(sys.path), at the top of the file containing the tests shows the list of directories that a non-virtualenv python would have.
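A slightly fuller version of that debug check might look like the sketch below. The helper name describe_interpreter is made up for illustration. It relies on sys.prefix differing from sys.base_prefix inside an environment created by the venv module (or a recent virtualenv), which is an assumption about how the venv here was built.

```python
import sys


def describe_interpreter():
    """Collect the facts needed to tell a venv interpreter from a system one.

    Hypothetical helper, not part of pytest or pytest-xdist.
    """
    return {
        "executable": sys.executable,
        # True when running inside a venv-style environment.
        "in_venv": sys.prefix != sys.base_prefix,
        # Copy the list so later changes to sys.path do not alter the snapshot.
        "sys_path": list(sys.path),
    }


if __name__ == "__main__":
    info = describe_interpreter()
    print(info["executable"], "(venv)" if info["in_venv"] else "(system)")
    for entry in info["sys_path"]:
        print("  ", repr(entry))
```

Placing a call to this at the top of a test module on each node would show both which interpreter a worker was started with and which search path it ended up with.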

Further debugging has led me to #667. As I understand it, this saves the value of sys.path at the start of the main process, then restores that into each popen worker but not ssh ones. I have tried augmenting the sys.path = change_sys_path line with debug output and, sure enough, I see this:

-/root
-/usr/local/lib/python3.10/dist-packages
-/usr/local/lib/python3.10/dist-packages
-/usr/local/lib/python3.10/dist-packages
-
+/usr/local/bin
 /usr/lib/python310.zip
 /usr/lib/python3.10
 /usr/lib/python3.10/lib-dynload
-/usr/lib/uss-lib/venv/lib/python3.10/site-packages
-/usr/lib/uss-lib/venv/lib/python3.10/dist-packages
 /usr/local/lib/python3.10/dist-packages
 /usr/lib/python3/dist-packages
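The effect described above can be modelled with a small simulation. This is only a sketch of the behavior as understood in this report, not the actual pytest-xdist code: start_worker and the path lists are hypothetical, and the model assumes the main process's path replaces the worker's path wholesale for popen workers while ssh workers keep their own.

```python
import difflib


def start_worker(own_sys_path, change_sys_path=None):
    """Return the sys.path a simulated worker ends up with.

    Hypothetical model: if the main process passes its own path along,
    the worker's path is replaced wholesale rather than merged.
    """
    if change_sys_path is not None:
        return list(change_sys_path)
    return list(own_sys_path)


# Illustrative paths only, loosely modelled on the diff above.
main_path = [
    "/usr/local/bin",
    "/usr/lib/python3.10",
    "/usr/lib/python3/dist-packages",
]
venv_path = [
    "",
    "/usr/lib/python3.10",
    "/usr/lib/myproj/venv/lib/python3.10/site-packages",
    "/usr/lib/python3/dist-packages",
]

# popen worker: started with the venv python, then handed the main path.
popen_worker = start_worker(venv_path, change_sys_path=main_path)
# ssh worker: started with the venv python, path left alone.
ssh_worker = start_worker(venv_path)

# Entries the popen worker can no longer import from.
lost = [p for p in venv_path if p not in popen_worker]

diff = list(
    difflib.unified_diff(
        venv_path, popen_worker, fromfile="before", tofile="after",
        lineterm="", n=10,
    )
)
print("\n".join(diff))
```

In this model the venv's site-packages directory appears in lost for the popen worker only, which matches the collection mismatch between gw0 and the ssh nodes.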

As a workaround, I can change popen// to ssh=myproj.lxd// so that the master py.test talks to all workers over ssh, even the local one:

lxc exec myproj.lxd -- py.test […] \
    --tx ssh=myproj.lxd//python=/usr/lib/myproj/venv/bin/python \
    --tx ssh=myproj-0.lxd//python=/usr/lib/myproj/venv/bin/python \
    --tx ssh=myproj-1.lxd//python=/usr/lib/myproj/venv/bin/python \
    --tx ssh=myproj-2.lxd//python=/usr/lib/myproj/venv/bin/python \
    --tx ssh=myproj-3.lxd//python=/usr/lib/myproj/venv/bin/python \
    --tx ssh=myproj-4.lxd//python=/usr/lib/myproj/venv/bin/python \
    […]

As another workaround I could probably run pytest using the master node’s virtualenv’s python:

lxc exec myproj.lxd -- /usr/lib/myproj/venv/bin/python -m pytest […] \
    --tx popen//python=/usr/lib/myproj/venv/bin/python \
    --tx ssh=myproj-0.lxd//python=/usr/lib/myproj/venv/bin/python \
    --tx ssh=myproj-1.lxd//python=/usr/lib/myproj/venv/bin/python \
    --tx ssh=myproj-2.lxd//python=/usr/lib/myproj/venv/bin/python \
    --tx ssh=myproj-3.lxd//python=/usr/lib/myproj/venv/bin/python \
    --tx ssh=myproj-4.lxd//python=/usr/lib/myproj/venv/bin/python \
    […]

The trashing of the virtualenv’s search path is quite surprising though.

RonnyPfannschmidt commented 5 months ago

I believe this one might be a bug in remote execnet startup.

More investigation is needed.

yurikhan commented 5 months ago

I checked and the worker’s sys.path was correct (i.e. that of the virtualenv) right up to the clobbering by this line. After that line ran, it was the generic system-wide python path.

RonnyPfannschmidt commented 5 months ago

Thanks for the investigation.

I'll have to familiarize myself with that part.