paylogic / pip-accel

pip-accel: Accelerator for pip, the Python package manager
https://pypi.python.org/pypi/pip-accel
MIT License

Windows / Appveyor - same version of cached file - invalidated and built again #62

Closed matysek closed 8 years ago

matysek commented 8 years ago

I am using pip-accel with AppVeyor in my project, and pip-accel invalidates a cached package even when its version has not changed:

pip_accel.bdist    INFO Invalidating old numpy (1.10.1) binary (source is newer) ..
pip_accel.bdist    INFO Building numpy (1.10.1) binary distribution ..

URL for the AppVeyor build: https://ci.appveyor.com/project/matysek/pyinstaller-ewwmo/build/181/job/6aj9u6f8lsfbpej9

The build instructs AppVeyor to cache these paths:

C:\Users\appveyor\AppData\Local\pip
C:\Users\appveyor\AppData\Roaming\.pip-accel

matysek commented 8 years ago

Shouldn't the cached file be invalidated only when the version of the package is newer?

xolox commented 8 years ago

Shouldn't the cached file be invalidated only when the version of the package is newer?

Before I added that invalidation I actually expected the same thing, so your assumption is very reasonable! However, a lot of people run CI builds on every change to a repository, even when the version of the Python package under source control has not been bumped, so unfortunately invalidating only on a newer version is not a good solution. (Personally I avoid these problems by always bumping versions, if need be based on a "local revision number" from a version control repository.)

Ideally I can get pip-accel to "do the right thing" in all situations, but I'm not yet sure how that should work. Several ideas come to mind though:

  1. Get the "last modified time" from the contents of the source and binary distributions. If possible this should be fairly simple to implement.
  2. Keep a map of SHA1 digests of all source and binary distributions and base the cache invalidation on a digest that changes. This is more work but the result would be platform independent.
  3. Add a configuration option that basically tells pip-accel to ignore last modified times and trust the user that version numbers will be bumped.

Without further input I'll probably investigate option one and fall back to option two if one doesn't work out (I don't like opt-in configuration options and would rather have the defaults work for the majority of users).
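
To make the difference between options one and two concrete, here is a minimal Python sketch of the two checks. It is not pip-accel's actual implementation: the function names and the `digest_map` dictionary are hypothetical and exist only for illustration.

```python
import hashlib
import os


def sha1_digest(path, chunk_size=64 * 1024):
    """Return the hex SHA1 digest of the file at `path`."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stale_by_mtime(source_path, binary_path):
    """Modification time scheme: the binary is stale when the source is newer."""
    return os.path.getmtime(source_path) > os.path.getmtime(binary_path)


def stale_by_digest(source_path, digest_map):
    """Digest scheme: the binary is stale when the source content has changed.

    `digest_map` is a hypothetical dict that maps a source path to the SHA1
    digest recorded when its binary distribution was built.
    """
    return digest_map.get(source_path) != sha1_digest(source_path)
```

If restoring a CI cache rewrites file timestamps so that the source distribution looks newer than the cached binary, `stale_by_mtime` reports the binary as stale even though nothing changed, while `stale_by_digest` does not.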

matysek commented 8 years ago

@xolox I think this scenario would work for the majority of users:

  1. use the SHA1 digest if one was created and is available for the cached item
  2. otherwise fall back to the "last modified time"
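
A rough Python sketch of that order of checks, under stated assumptions: `needs_rebuild` and `recorded_digest` are hypothetical names, not part of pip-accel, and the digest is assumed to have been stored when the cached binary was built.

```python
import hashlib
import os


def file_sha1(path):
    """Return the hex SHA1 digest of a file's contents."""
    with open(path, "rb") as handle:
        return hashlib.sha1(handle.read()).hexdigest()


def needs_rebuild(source_path, binary_path, recorded_digest=None):
    """Decide whether a cached binary distribution should be rebuilt.

    `recorded_digest` is a hypothetical value: the SHA1 of the source
    distribution stored when the cached binary was built, or None when
    no digest was recorded for this cache entry.
    """
    if not os.path.exists(binary_path):
        return True  # nothing has been cached yet
    if recorded_digest is not None:
        # 1. Prefer the digest when one is available.
        return file_sha1(source_path) != recorded_digest
    # 2. Otherwise fall back to comparing "last modified" times.
    return os.path.getmtime(source_path) > os.path.getmtime(binary_path)
```

With a recorded digest the timestamps are ignored entirely, so a cache restore that touches modification times would not trigger a rebuild.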

xolox commented 8 years ago

Hi Martin,

Because pip-accel's default cache invalidation scheme has always worked fine, it seemed wasteful to switch unconditionally to the alternate scheme of calculating SHA1 hashes all the time (for what is essentially a quirk of AppVeyor CI, not Windows). Instead I've decided to make the cache invalidation scheme configurable.

In the newest version of pip-accel (0.36.2) the alternate cache invalidation scheme can be enabled by setting trust_mod_times to False (this is the default on AppVeyor CI).

I hope this helps to avoid useless rebuilds. If you encounter problems feel free to reopen this issue or open a new one.