Closed dottspina closed 2 years ago
I'm not opposed to this change, but I'm a bit confused on what the DLLInfo
class is solving? Is it the name being prefixed by lib
? In my testing, I think I've found that ctypes
knows how to handle the leading lib
, but I may be mistaken. Outside of that, it seems like we could address the Linux-specific dl
handling through the .load()
function on the Library
class, which has a special case for handling Windows already.
That would also solve the unit tests failing because of expectations changed, though you may have other failures as well post-changes for Linux. A unit test for the specific case would also be appreciated.
@hkpeprah : Thanks for your reply.
I'm not opposed to this change, but I'm a bit confused on what the
DLLInfo
class is solving? Is it the name being prefixed bylib
?
There actually are two distinct literals involved:
ctypes.util.find_library()
function: according to the documentation, this is "the library name without any prefix like lib, suffix like .so, .dylib or version number"pylink.Library.find_library_{windows,linux,darwin}()
functions when scanning the directories where the JLink Software pack is expected to be installed (/opt/SEGGER/JLink
et cetera)On Windows, the return value of Library.get_appropriate_windows_sdk_name()
, 'JLinkARM'
or 'JLink_x64
, is used for both tasks, which works as expected: these values are both valid parameters for the search by library name in ctypes.util.find_library()
, and appropriate for the search by file name in Library.find_library_windows()
.
However, on Linux and MacOS:
jlinkarm
is the literal appropriate for the search by library namelibjlinkarm
is the literal appropriate for the search by file nameAdditionally, on Linux, ctypes.util.find_library()
will not return the library full file path, but only the file name, for e.g. libjlinkarm.so.7
: the class JLinkDllInfo
mostly implements how to retrieve the full file path in this later case (please see bellow).
Note that my initial patch missed that JLINK_SDK_NAME
was also used by the pylink.Library.find_library_{linux,darwin}()
functions (the search by file on Linux/MacOS): I've updated the PR to fix this.
In my testing, I think I've found that
ctypes
knows how to handle the leadinglib
, but I may be mistaken.
AFAICT, on Linux ctypes.util.find_library('libjlinkarm')
consistently fails (returns 'None
), where ctypes.util.find_library('jlinkarm')
will successfully resolve a J-Link library installed in LD_LIBRARY_PATH
.
I can't test for MacOS, and it's a non issue on Windows.
Outside of that, it seems like we could address the Linux-specific
dl
handling through the.load()
function on theLibrary
class,
It's indeed done in the Library.load_default()
function, though somewhat indirectly, incidentally breaking unit tests expectations.
which has a special case for handling Windows already.
The Library
class even implements three specials cases, find_library_{windows,linux,darwin}()
, but these are for the search by file name, i.e. when scanning the c:\\Program Files\SEGGER
, /opt/SEGGER
, or /Applications/SEGGER
directories.
There are no distinct code paths to handle the different semantics of the ctypes.util.find_library()
function's return value when searching by library name: the full file path on Window and MacOS, or only the file name (or so name) on Linux.
The JLinkDllInfo
class is mostly here to retrieve the full file path in this later case (again, please see bellow).
That would also solve the unit tests failing because of expectations changed, though you may have other failures as well post-changes for Linux.
If I understand correctly, some tests fail because now the function find_library()
is not called from the Library
class, but from JLinkDllInfo
, which is not mocked, and thus does not account for the mock_find_library.assert_called_once_with()
thing.
I did create a separated class thinking it was a better design:
JLinkDllInfo.path
property) Library
code with the Linux specific (and a bit ugly) dlinfo()
dance required to retrieve the full library file path based on its so nameBut I could try a refactoring to move the whole search by library name implementation to the Library
class, and see if that solves at least some of the Expected 'find_library' to be called once. Called 0 times.
.
That would also imply to replace jlinkarm
(JLINK_SDK_NAME
) by libjlinkarm
(JLINK_SDK_STARTS_WITH
, see commit ff640fa) where appropriate in test_library.py
.
[Edit: No, this is obviously wrong, find_library()
is consistently and rightly called with JLINK_SDK_NAME
]
There are also tests that fail with No such file or directory: 'C:\\'
: can I ignore (or skip) these on Linux, or should some magic mock_directories()
make them run just fine ?
A unit test for the specific case would also be appreciated.
I fully understand, and will try to come with something.
I may ask for some help, though: for e.g. I do not (yet) understand if the testing platform should actually host
the various versions of the J-Link library the unit tests seem to look for, or if the whole thing is mocked.
If the later is true: how do you mock, say on Windows, what the Linux or MacOS dynamic linker would have done via ctypes.util.find_library()
to resolve a library name ?
As I've said, though I'm willing to help, I haven't written a line of Python for years, Python2 was still widely use,
and have to read a bit more about this unittest.mock
libray (luckily I'm familiar with mocks): it'd be great if you could either give me a few hints, or direct me to a good pointer, something that would teach me what I need without too much overhead.
[Edit: Politeness may deserve I add two lines of background.
Other software used to fail loading the J-Link library from LD_LIBRARY_PATH
, for e.g. Nordic nrfjprog
command line tool did, but they've generally been fixed by now, and pyOCD, which relies on pylink, is the last tool I daily need that still fails: that's why I may appear a bit sensitive about this issue, please forgive me if I sounded rude by any mean.
(*) Once more today pyocd
was failing, until I understand I was using the upstream pylink
library pulled with the Zephyr SDK ;-)
]
[Edit2: the JLinkDllInfo
class was consistently misspelled JLinkarmDllInfo
This class has since been renamed to JLinkarmDlInfo
, see 45c8398.]
Thanks.
-- chris
The latest patch seems to help: AFAICT this PR would pass all unit tests (python3 setup.py test
), as if I even had a C:\\
directory on my Linux box now, and a dozen versions of the J-Link library.
The functional tests (python3 setup.py bddtest
) still fail short though: if I understand correctly I need some additional setup to run these.
Regarding the "unit test for the specific case", I can at least describe it.
The initial state to mock would be:
/opt/SEGGER
, let's say it's installed into $JLINK_INSTALL_DIR
/opt/SEGGER
may not even exist, I don't think it's relevantLD_LIBRARY_PATH
includes $JLINK_INSTALL_DIR
, that is the relevant partThen, we would expect that invoking the Library
constructor (and in turn Library.load_default()
) will successfully resolve (find and load) the JLink library, and as usual that ctypes.util.find_library()
has been called once from the Library
class.
We could also expect that ctypes.util.find_library()
has been called once from the JLinkDllInfo
class (to resolve the dl
library), which would prove that it's actually how we find the library in $LD_LIBRARY_PATH
, but that may sound a bit pedantic.
I'll try too implement something. [Edit: follow-up.
I'm rather stuck:
mock.patch.dict()
the os.environ
environment to include an ad-hoc value for LD_LIBRARY_PATH
, find_library()
(see ctypes.util._findLib_ld()
) will still fail if the library file does not actually exist (rightly relying on the native linker ld
, this will ignore the mocked file system)dlinfo()
dance's implementation since that needs to first load the library (using the so name returned by find_library()
) to then get its absolute file path (the so file)I'm afraid a test case relying on mock_directories()
to provide the J-Link library file would eventually prove itself useless, since it won't address:
ctypes.util.find_library()
API we rely on properly honor LD_LIBRARY_PATH
? I know it does today, but how do we test for regressions (see for e.g. ctypes find_library should search LD_LIBRARY_PATH on Linux #54207) ?ctypes.util.find_library()
with a valid parameter (the libjlinkarm
vs jlinkarm
thing), and get the expected so name ?dlinfo()
dance's implementation itself (JLinkarmDlInfo
)Thus, to summarize the status as of now:
LD_LIBRARY_PATH
; do I miss something obvious ?test_library.py
that covers the search by library name logic ? IMHO, they all rely on the search by file name logic: do I again miss something obvious ? Expected 'find_library' to be called once. Called 0 times.
?]
Thanks.
-- chris
Is there anything else I can do that would help fixing this ?
This thread, and the initial issue (#131), are quite old, and worth a brief summary:
/opt/SEGGER
, though appropriately included in LD_LIBRARY_PATH
ctypes.util.find_library()
('libjlinkarm'
where it should be 'jlinkarm'
, API reference) which will always fail (return None
)/Applications/SEGGER
, regardless of DYLD_LIBRARY_PATH
Fixing this is not as immediate as setting JLINK_SDK_NAME
to jlinkarm
:
ctypes.util.find_library('jlinkarm')
will not return the library's absolute file path, but its so name, for e.g. libjlinkarm.so.6
: retrieving the actual absolute path requires a little dlinfo()
dance, which is implemented in the new class JLinkarmDlInfo
JLINK_SDK_NAME
's value is now a valid parameter for ctypes.util.find_library()
, but will break the search by file name implementation in pylink.library.Library.find_library_{linux,darwin}()
, which expects a startswith('libjlinkarm')
pattern: the new constant Library.JLINK_SDK_STARTS_WITH
is introduced to hold this semanticThe original PR had a couple of issues:
'libjlinkarm'
literal (the JLINK_SDK_STARTS_WITH
semantic)ctypes.util.find_library()
from the Library
class to the new JLinkarmDlInfo
classThese issues were addressed by the commits bellow:
JLINK_SDK_NAME
in find_library_{linux,darwin}()
Regarding a new unit test that would cover the case addressed by this PR:
LD_LIBRARY_PATH
to install the JLink library without the (root
) permissions required to write under the /opt
directory ctypes.util.find_library()
implementation itself (ctypes find_library should search LD_LIBRARY_PATH on Linux), and above all our dlinfo()
dance integrationctypes
and the involved underlying native tool (dyld
, ldconfig
, ld
, objdump
, gcc
, etc, depending on the operating system and what's available) would behave naturallyAnd I can't figure out how to implement such a test case without introducing not trivial requirements (simply mocking a file system is not enough here):
libjlinkarm.so
file, that we'll be able to include in LD_LIBRARY_PATH
, and that the system's ldconfig
, ld
, objdump
, etc will be happy with (for e.g., on Linux, ctypes.util.find_library()
will first try something like ld -t -L $LD_LIBRARY_PATH -o /dev/null -ljlinkarm
) @hkpeprah, are you still missing something, beside time, to:
ctypes.util.find_library()
) Thanks.
-- chris
Hello @dottspina, I will take a look today. From a high level, I think we need unit tests that validate the new behaviour you're adding. You don't need to simulate the OS, but using the mock
library to simulate expected return values / failure scenarios would be a baseline IMO. So this would be:
dl
(should never happen, but we should test it)dl
and failing to load the DLLdl
and succeeding to load the DLLIn the failure cases, we should expect the default behaviour to take over.
To add the aforementioned tests, the current Linux find library tests are passing because this condition happens to be false since we mock
ctypes.cdll.LoadLibrary
:
if dlinfo(tmp_cdll_jlink._handle, JLinkarmDlInfo.RTLD_DI_LINKMAP, ctypes.byref(linkmap)) == 0:
So the result is probably some mock
object that leads us to the else
case. We should be explicitly testing the true
and false
cases, so the Linux tests that currently use the find_library()
need to be updated to explicitly return a non-zero entry for this, and the new test would return zero and handle testing DLINFO.
@hkpeprah, thank you for taking the time to work on this and your constructive comments.
To add the aforementioned tests, the current Linux find library tests are passing because this condition happens to be false since we
mock
ctypes.cdll.LoadLibrary
:if dlinfo(tmp_cdll_jlink._handle, JLinkarmDlInfo.RTLD_DI_LINKMAP, ctypes.byref(linkmap)) == 0:
So the result is probably some
mock
object that leads us to theelse
case. We should be explicitly testing thetrue
andfalse
cases, so the Linux tests that currently use thefind_library()
need to be updated to explicitly return a non-zero entry for this, and the new test would return zero and handle testing DLINFO.
I think the Linux tests (in tests/unit/test_library.py
) currently don't even reach this condition, precisely because find_library()
is mocked to return None
.
While I fully agree testing this code path (the JLinkarmDlInfo
class) will require the find_library()
mock to at least return something, I'm not sure to understand why we would update the existing Linux tests: to me they have always been designed to cover the search by file name code path, i.e. the Library.find_library_linux()
function, which happens when ctypes.util.find_library()
has failed. IMHO their semantic (use cases, coverage, expected behaviors) is orthogonal to this PR, and we should not touch their implementation (in the context of this PR).
OTOH, I've started adding a couple of unit tests, mostly to cover error conditions or confirm a few code path, as you've previously suggested. I hope I'll soon be able to push something we can build upon.
Thanks.
-- chris
* Failing to find `dl` (should never happen, but we should test it)
Agreed, if a system presents itself as POSIX (Linux), it should provide the dl
library, but nonetheless we have to know how the code would behave. I've added a unit test checking this behavior.
Note that while dlopen()
and friends are POSIX, dlinfo()
is a GNU extension, defined only on systems using the GNU libc. I've updated the PR to not assume all Linux hosts use glibc
, and added a unit test to confirm the code behavior.
* Finding `dl` and failing to load the DLL
I assume here DLL refers to the dl
library (not the "JLink DLL").
I can add a test where LoadLibrary()
's mock raises an OSError
when called to load dl
: according to our previous discussion, the test will have to confirm the exception actually propagates from the ctypes API to the call site.
* Finding `dl` and succeeding to load the DLL
I again assume DLL refers to the dl
library.
IIUC, LoadLibrary()
's mock would answer a mock dl
library, tmp_cdll_dl
in the code bellow:
dl_soname = ctypes_util.find_library('dl')
if dl_soname is not None:
tmp_cdll_dl = ctypes.cdll.LoadLibrary(dl_soname)
dlinfo = tmp_cdll_dl.dlinfo
dlinfo.argtypes = ctypes.c_void_p, ctypes.c_int, ctypes.c_void_p
dlinfo.restype = ctypes.c_int
linkmap = ctypes.c_void_p()
if dlinfo(tmp_cdll_jlink._handle, JLinkarmDlInfo.RTLD_DI_LINKMAP, ctypes.byref(linkmap)) == 0:
linkmap = ctypes.cast(linkmap, ctypes.POINTER(JLinkarmDlInfo.LinkMap))
self._dll_path = linkmap.contents.l_name.decode(sys.getdefaultencoding())
And we could indeed write a test to confirm we actually execute the last two lines only when dlinfo()
returns 0
, but I confess that sounds a bit far-fetched to me: the if
condition happens at the line just above, that contains an atomic (wrt Python) function call, there's so much locality that IMHO there remains no code path to test.
In the failure cases, we should expect the default behaviour to take over.
I assume "default behaviour" refers to falling back to the search by file name code path (Library.find_library_linux()
on Linux) after the search by library name has failed (ctypes find_library()
returns None
).
Whether we actually execute this fallback depends on how the search by library name fails:
find_library('jlinkarm')
call fails (returns None
), we directly proceed with the fallback (as we do for all platforms)find_library('jlinkarm')
succeeds (returns a soname), but the host system is not Linux with GNU libc, we also directly proceed with the fallback find_library('jlinkarm')
succeeds, the host system presents itself as Linux with GNU libc, but find_library('dl')
returns None
, we skip the dlinfo()
dance and proceed with the fallback (though I'm dubious about our chances to eventually dlopen()
anything without dl
)find_library('jlinkarm')
succeeds, we successfully load dl
, but the call to dlinfo()
fails (does not return 0
), we drop the found soname and continue to the fallbackLoadLibrary()
raises an OSError
when loading the jlinkarm
or dl
shared object, we let the exception propagate and the fallback won't executeAccording to our previous discussion, I think it's fine to not execute the fallback in this later case.
Adding to our thoughts about updating or not the pre-existing Linux unit tests in test_library.py
: note that they all assume the initial call to find_library()
fails, while the tests I add all assume find_library()
does succeed, hence I feel they actually complement each others (cover distinct code paths).
Thanks.
-- chris
Adding to our thoughts about updating or not the pre-existing Linux unit tests in test_library.py: note that they all assume the initial call to find_library() fails, while the tests I add all assume find_library() does succeed, hence I feel they actually complement each others (cover distinct code paths).
I looked again, and you are correct.
@hkpeprah , thanks again for taking the time to review this PR.
Thanks for going through with the tests. Sorry for all the back and forth on this.
No hassle. Understanding the existing unit tests, why they pass or fail, and adding a small handful of new ones, is actually what gives me some confidence that this PR should not introduce obvious regressions. Taking care of this naturally also falls on the shoulders of the submitter.
At the end of the day, that back and forth makes for a cleaner patch.
This looks good to me to merge and un-block Linux platforms.
Nice, this should also fix pylink's downstream software (pyOCD, Zephyr-RTOS, Nordic pynrfjprog, etc) on Linux. Additionally, this may improve the situation on macOS: would you mind sharing such feedback, if any ?
Before I can commit the change, you will need to sign the Square CLA though. It is available here: https://spreadsheets.google.com/spreadsheet/viewform?formkey=dDViT2xzUHAwRkI3X3k5Z0lQM091OGc6MQ&ndplr=1
Done.
I can then merge this change and make a new release later today.
Great.
To thank you for your work on a somewhat niche issue, I'd like to let you know about a confusing use case this release will fix.
On Linux, when pyOCD fails to load the JLink library, it does so silently, which is right since it does not require this library (neither does it expect a JLink hardware), but pyocd --list
will then simply report "No available debug probes are connected": you, as a user, will then dig into hardware or firmware issues, missing or incorrect udev rules, etc, whereas the software has just not honored LD_LIBRARY_PATH
.
Thanks.
-- chris
Changes should be available in v0.14.0
. Thanks for the patch.
I was plainly wrong, we should fix this here, not downstream (see #138 and #139).
On Linux, the Library.load_default() function fails to properly resolve the JLink shared library if not installed in /opt/SEGGER/JLink, regardless of the value of LD_LIBRARY_PATH.
This is possibly caused by an improper use of the ctypes find_library() API, which:
Thus:
Changes introduced by this patch:
If I'm correct, this should not break anything on Windows, and improve things on both Linux and MacOS.