micropython / micropython-lib

Core Python libraries ported to MicroPython
Other
2.3k stars 980 forks source link

unittest/discover: Reverse order of parent/top in sys.path. #872

Closed andrewleech closed 1 minute ago

andrewleech commented 3 weeks ago

When running tests from subfolders, import by "full dotted path" rather than just module name, removing the need to add the test parent folder to sys.path.


I found a problem with unittest-discover in a project which uses microdot. A recent update in microdot includes a test_client.py in the src folder, which caused it to be pulled in when freezing the module via manifest, and this test file was also then run with unittest discover.

The issue is related to import path ordering. The unit test in question starts with from microdot.microdot import Request, Response, AsyncBytesIO which seems reasonable. The microdot package has this microdot/microdot.py file and that import line works fine normally.

However currently, when unittest discover finds a test file in a subdirectory https://github.com/micropython/micropython-lib/blob/50ed36fbeb919753bcc26ce13a8cffd7691d06ef/python-stdlib/unittest-discover/unittest/__main__.py#L48

it inserts the test file parent folder path at the start of sys.path, followed by top, resulting in an import path like ['./microdot', '.', '', '.frozen', '/home/.micropython/lib', '/usr/lib/micropython'] This means from this point, an import microdot is finding the microdot.py file, not the parent microdot package. Hence, import microdot.microdot fails.

I believe I wrote the existing code with reversed on the path insertions by mistake and it's only now I've found a situation that shows the bug. As such I think it makes the most sense to have top as the first element on path, espectially as this is a value that can be directly overridden on the command line when running unittest discover, so putting it highest priority gives the user the best flexibility.

stinos commented 3 weeks ago

Any idea what CPython does here?

andrewleech commented 3 weeks ago

Any idea what CPython does here?

Yeah I wanted to check that myself, it should be trivial for me to make an example test case to demonstrate this

andrewleech commented 3 weeks ago
[corona@Telie unittest_paths]$ pwd
/home/corona/unittest_paths
[corona@Telie unittest_paths]$ tree
.
└── package
    ├── __init__.py
    ├── package.py
    └── test_pkg.py
2 directories, 3 files
[corona@Telie unittest_paths]$ head -n 2 package/test_pkg.py
import sys
print(sys.path)
[corona@Telie unittest_paths]$ micropython -m unittest
['./package', '.', '', '.frozen', '/home/corona/.micropython/lib', '/usr/lib/micropython']
Traceback (most recent call last):
  File "unittest/__main__.py", line 146, in <module>
  File "unittest/__main__.py", line 21, in _run_test_module
  File "./package/test_pkg.py", line 5, in <module>
ImportError: no module named 'package.package'
[corona@Telie unittest_paths]$
[corona@Telie unittest_paths]$ python -m unittest
['/home/corona/unittest_paths', '/usr/lib/python311.zip', '/usr/lib/python3.11', '/usr/lib/python3.11/lib-dynload', '/home/corona/.local/lib/python3.11/site-packages', '/usr/lib/python3.11/site-packages']

.
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK

TLDR: it seems cpython places abspath (top) at the start of sys.path but doesn't include the test file parent dir at all.

stinos commented 3 weeks ago

That seems logical enough. Imo micropython should follow that behavior?

andrewleech commented 3 weeks ago

Yep I was just about to remove the parent folder from the path, then remembered why it's there. https://github.com/micropython/micropython-lib/blob/50ed36fbeb919753bcc26ce13a8cffd7691d06ef/python-stdlib/unittest-discover/unittest/__main__.py#L21

The test file is imported here by name and will only be found if its parent dir is on the path. I don't know of there is a different import mechanism / args in micropython that would allow this to work without the folder on path.

dpgeorge commented 2 weeks ago

As discussed, see if it's possible to add only abspath(top) to sys.path, and then prepend dot-separated dirs to the module_name to import it relative to top, eg __import__(".".join(parent_dirs) + "." + module_name).

andrewleech commented 2 weeks ago

prepend dot-separated dirs to the module_name to import it relative to top, eg __import__(".".join(parent_dirs) + "." + module_name).

I've tested this and yes it does work, in a roundabout fashion! While trying this out, I saw / remembered why I didn't do it in the past.... so for

[corona@Telie unittest_paths]$ tree
.
├── package
│   ├── __init__.py
│   ├── package.py
│   ├── sub1
│   │   ├── __init__.py
│   │   └── sub2
│   │       ├── __init__.py
│   │       └── test_subpkg.py
│   └── test_pkg.py
└── test_pkg.py
[corona@Telie unittest_paths]$ micropython
MicroPython v1.23.0-preview.254.gacbdbcd95e.dirty on 2024-06-06; linux [GCC 13.2.1] version
Use Ctrl-D to exit, Ctrl-E for paste mode
>>> __import__("package.sub1.sub2.test_subpkg")
<module 'package' from 'package/__init__.py'>
>>>

Dunder-importing the "full dotted path" of a module returns you the base module object, not the final / sub package module you actually want! The thought of then having to iterate through each sub package getattr'ing each one seemed painfully tedious. Reviewing the cpython docs on __import__ reminded me of the other args agailable here:

__import__(name, globals=None, locals=None, fromlist=(), level=0) ... The fromlist gives the names of objects or submodules that should be imported from the module given by name ... When the name variable is of the form package.module, normally, the top-level package (the name up till the first dot) is returned, not the module named by name. However, when a non-empty fromlist argument is given, the module named by name is returned.

I still didn't find this particularly clear, however this gives what we need:

[corona@Telie unittest_paths]$ micropython
MicroPython v1.23.0-preview.254.gacbdbcd95e.dirty on 2024-06-06; linux [GCC 13.2.1] version
Use Ctrl-D to exit, Ctrl-E for paste mode
>>> __import__("package.sub1.sub2.test_subpkg", None, None, "package.sub1.sub2.test_subpkg")
<module 'package.sub1.sub2.test_subpkg' from 'package/sub1/sub2/test_subpkg.py'>
>>>

This now works with import path matching cpython behavior, with a unit test added for this subfolder package import behavior.

dpgeorge commented 1 minute ago

Nice work!