matthew-brett / delocate

Find and copy needed dynamic libraries into python wheels
BSD 2-Clause "Simplified" License

`delocate-wheel` fails on UnicodeDecodeError #181

Closed brenthuisman closed 1 year ago

brenthuisman commented 1 year ago

Description

See title. The command (`delocate-wheel`) errors out with `UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd1 in position 396537: invalid continuation byte`; see the build log below.

There have been no recent changes to this package, so it might be an environmental issue, but the GitHub virtual environment changelog doesn't show an obvious change. Any ideas?

Build log

https://github.com/arbor-sim/arbor/actions/runs/4904892311/jobs/8758317425#step:7:10823

CI config

https://github.com/arbor-sim/arbor/actions/runs/4904892311/workflow

HexDecimal commented 1 year ago

I'd guess that macOS's `unzip` is behaving badly: for some reason its output isn't in the encoding Python expects from the locale. I'm going to assume that `unzip` still writes the filenames correctly despite this. Note that in the CI the same repair seems to work fine on Ubuntu, and I recall having issues with macOS zip tools before, as if they're outdated on macOS compared to other OSes.

We could add `-q` to the `unzip` command, which suppresses the listing of filenames, or tell Python to replace or escape malformed characters when decoding. The output here is only saved for debugging a failing command.
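
A minimal sketch of the second option, assuming the command's output is captured through `subprocess.run` (this is not delocate's actual code, and `run_and_capture` is a hypothetical helper name). With `errors="backslashreplace"`, a byte that is not valid UTF-8 is escaped rather than raising `UnicodeDecodeError`:

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd and return its combined stdout/stderr as text.

    Hypothetical helper for illustration only. Bytes that are not valid
    UTF-8 are escaped (e.g. as \\xd1) instead of raising UnicodeDecodeError.
    """
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        encoding="utf-8",
        errors="backslashreplace",
        check=True,  # still fail loudly if the command itself fails
    )
    return result.stdout


if __name__ == "__main__":
    # Simulate a tool that prints a filename containing a non-UTF-8 byte.
    emit_bad_byte = "import sys; sys.stdout.buffer.write(b'caf\\xd1.txt\\n')"
    print(run_and_capture([sys.executable, "-c", emit_bad_byte]))
```

Since the captured text is only used for debugging, escaping the bad bytes loses nothing important and keeps them visible in the log; `errors="replace"` would also work but hides which byte was wrong.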

HexDecimal commented 1 year ago

@brenthuisman would you mind testing PR #182 in case Unicode decode issues show up anywhere else?

brenthuisman commented 1 year ago

A colleague sounded the all clear! Thanks!

jvolkman commented 1 year ago

Maybe consider borrowing auditwheel's implementation, which uses `zipfile` but also preserves file permissions. This would remove the dependency on the external `unzip`.

https://github.com/pypa/auditwheel/blob/321292d620f14ddc7ad3247d71746679d6e51558/src/auditwheel/tools.py#L31-L52
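
For reference, a rough sketch of that approach using only the standard library. It is an illustration of the idea, not a copy of auditwheel's code, and `unzip_preserving_modes` is a hypothetical name. It assumes the archive was written on a Unix-like system, so the permission bits are stored in the upper 16 bits of each entry's `external_attr`:

```python
import os
import stat
import zipfile


def unzip_preserving_modes(zip_path, out_dir):
    """Extract zip_path into out_dir, restoring Unix permission bits.

    Hypothetical helper for illustration. zipfile.extract() ignores the
    stored mode, so it is re-applied afterwards with os.chmod().
    """
    with zipfile.ZipFile(zip_path) as zf:
        for member in zf.infolist():
            extracted = zf.extract(member, out_dir)
            # Upper 16 bits hold the Unix st_mode; keep only permission bits.
            mode = stat.S_IMODE(member.external_attr >> 16)
            if mode == 0:
                continue  # no Unix mode recorded for this entry
            if member.is_dir():
                # Keep directories usable so later members can be written.
                mode |= stat.S_IRWXU
            os.chmod(extracted, mode)
```

Usage would look like `unzip_preserving_modes("some.whl", "unpacked")`. Because no subprocess output is decoded at all, the `UnicodeDecodeError` from this issue cannot occur on this path.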