linkedin / shiv

shiv is a command line utility for building fully self contained Python zipapps as outlined in PEP 441, but with all their dependencies included.
BSD 2-Clause "Simplified" License
1.75k stars 99 forks source link

different zipapps sharing the same .lock file during extraction due to dots in the filename #203

Open edo248 opened 2 years ago

edo248 commented 2 years ago

Hello

I've run into an issue where different shiv zipapp binaries are using the same .lock file during extraction.

We don't use any extensions for binaries compiled with shiv (these are Linux-only). Binary filenames would be app-1.2.3 or app-1.2.5. Now, because extraction code uses pathlib.Path().stem in a few places, for two different versions of the binary app-1.2.3 and app-1.2.5 their lock file would be the same app-1.lock.

The reason for this is that when calculating target_path first .stem of the file is taken. For "app-1.2.5" this would mean target path is "~/.shiv/app-1.2_". This itself is not a problem, since build_id should be unique enough. But I think it's already unexpected (stripping out an extension would be fine of course).

https://github.com/linkedin/shiv/blob/1.0.0/src/shiv/bootstrap/__init__.py#L106-L107

def cache_path(...):
    ...
    name = Path(archive.filename).resolve().stem
    return root / f"{name}_{build_id}"

Later when extraction starts there's another stripping happening, temp dir and lock file are generated using .stem as well (see below). So lock file for app-1.2.5 becomes app-1.lock (which is the same lock file for app-1.2.3).

https://github.com/linkedin/shiv/blob/1.0.0/src/shiv/bootstrap/__init__.py#L120-L121

def extract_site_packages(...):
    ...
    target_path_tmp = Path(parent, target_path.stem + ".tmp")
    lock = Path(parent, f".{target_path.stem}_lock")

For us, this issue was coupled with using an old version of shiv in a few places. The older version of shiv had the another bug of deleting the lock file (fixed in https://github.com/linkedin/shiv/commit/c776ea413e260dcbe150cc600d1d3f468446fdd0). So the binary with older shiv version was deleting locks of other apps because of shared name.

I am assuming this difference between lock file and destination directory is not expected. This is a side effect of having a version with dots in the filename. Which I wouldn't think is unusual. Even if a file would have an extension (e.g. app-1.2.3.exe), the lock file would still be app-1.2.lock, clashing with other binaries with different patch versions.

I think in both cases it should be enough to replace .stem with .name, since all supported OSes should easily allow multiple dots in directory/filenames. Here's how a simple change from .stem to .name would look https://github.com/edo248/shiv/commit/adeb4b583c99648cba5e449c11ad860a586f17e4. Alternatively, code can escape dots with underscores, for example.

Wanted to hear your opinion on the preferred solution.

Thanks Eduard

lorencarvalho commented 2 years ago

Hi @edo248, great catch and thanks so much for raising this issue! This is a nasty bug, I'm sorry if it caused you some headaches.

Since you already fixed it, I've gone ahead and cherrypicked your commit into #207 and will release it shortly. I'm planning to leave this issue open though since I think it'd be good to have a regression test for this case.

edo248 commented 2 years ago

Many thanks @lorencarvalho !