Open lkollar opened 2 years ago
Another side effect of generating the hash before grafting is that it is impossible to validate the hash, if one wanted to make sure no modifications have been done to the grafted library.
This is actually causes an issue in the code snippet I pasted above: the code tries to avoid patching an already patched library, but it does so based on the generated hash. The else
on line 143 will never be reached because the hash is computed before we make the modification. The next time auditwheel runs, the hash will be different, since we've modified the binary.
As we now use the hash as the SONAME, which is embedded in the file, there is no way to perform the hashing after the modification. If we want to avoid modifying a library we've already patched before, we could add some information in the NOTE section maybe. But given that it doesn't seem that this feature works properly, maybe it's not worth fixing it and we could just switch to different way to make the SONAME unique.
The
numpy==1.18.4
andscipy==1.4.1
libraries both contain thelibopenblasp-r0-34a18dc3.3.7.so
file, as they most likely used the same version when invokingauditwheel
. The hashes match, but the files differ in contents:Most likely this is happening because we compute the hash before performing the patching on the files and using a different version of
patchelf
can produce slightly differenet .so files:https://github.com/pypa/auditwheel/blob/ef280f27759ceaf9acd84e606ef4bfc73d5b0a64/src/auditwheel/repair.py#L136-L150
Given that the resulting files are practically identical, they should not cause any issues, even if for some reason the loader decides to reuse the cached version, but I think we should generate the hash after modifying the file.
Another side effect of generating the hash before grafting is that it is impossible to validate the hash, if one wanted to make sure no modifications have been done to the grafted library.