Open bukzor opened 9 years ago
Is there a good package that helps find those paths and clear unused caches?
After all, if the cache folder is outside the working directory, its lifetime is unrelated to the working directory.
Imagine, for example, an unaware CI system that makes one folder per build: it could easily DoS the CI server by filling up ~/.cache of the CI user.
With the current mode of operation, cache and working directory are simply co-located and share a lifetime, so data management is easy: the cache is gone when one deletes the working directory.
I'm not opposed to following XDG per se, but it makes the lifetimes of file-system objects trickier in our case, and it does not really seem to be available on Windows either (my personal impression is that XDG only works well on POSIX boxes with a typical Linux/BSD setup).
Those are good points...
I don't have a good answer. The applications I've seen don't worry about cleaning up like that. It would work for me if we use the current behavior when XDG_CACHE_HOME doesn't exist.
@asottile
This one at least helps find a good platform-specific cache dir: https://pypi.python.org/pypi/appdirs/1.4.0
This one helps manage expiry for a file-based cache. Making one of the keys be the working directory would make it work.
I briefly investigated appdirs and pyfscache; unfortunately both are unsuitable.
appdirs seems bug-ridden, and pyfscache is for something entirely different that doesn't fit the use case.
How about introducing an ini value/env variable to select the cache backend, which defaults to worktree, but one can choose xdg/disabled?
@hpk42 opinions?
Personally I'd rather default to xdg instead of needing to add .cache to the gitignore in every one of my projects
While I agree that it is a nice default, we can't make a change like that before 3.0.
Also, before we can do it properly we need to come up with a solid way to clean things up.
Maybe introducing the cache plugin defaulted to on should have been in 3.0 too :/
Since it's a normal feature addition, it doesn't warrant a release that large.
I hate the ".cache" folder for aesthetic reasons. My project folders are already populated by IDE, git, Python and other "support folders". Sometimes I have more folders than actual Python program files! Please make the ".cache" location configurable!
how about introducing an ini value/env variable to select the cache backend
Is anyone against this? While it may not be the best option, it is simple to implement and backward compatible.
It should support variable expansion and default to .cache for backward compatibility. This would allow users to change it like this:
[pytest]
cache_dir=$XDG_CACHE_HOME
Or use an environment variable:
PYTEST_CACHE_DIR=/some/path
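A minimal sketch of how that lookup order could behave. The helper name `resolve_cache_dir` is hypothetical, and `PYTEST_CACHE_DIR` is only the variable proposed in this comment, not something pytest is claimed to read: the environment variable wins, then the ini value with variables expanded, then the `.cache` default relative to the rootdir.

```python
import os
from pathlib import Path

DEFAULT_CACHE_DIR = ".cache"  # backward-compatible default from the proposal


def resolve_cache_dir(rootdir, ini_value=None):
    """Hypothetical helper: pick the cache directory for one test run."""
    # The proposed PYTEST_CACHE_DIR variable overrides the ini value,
    # which in turn overrides the default.
    raw = os.environ.get("PYTEST_CACHE_DIR") or ini_value or DEFAULT_CACHE_DIR
    # Expand $VAR / ${VAR} and a leading ~. Note that expandvars leaves
    # references to undefined variables in place rather than failing.
    expanded = os.path.expanduser(os.path.expandvars(raw))
    # Relative results stay under rootdir; absolute results replace it.
    return Path(rootdir) / expanded
```

Because undefined variables are left unexpanded, a real implementation would probably want to detect that case and fall back to the default instead of creating a directory literally named after the variable.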
I'm against the most easy variant, I want a simple solution
@RonnyPfannschmidt I've found that the function you'd need exists in pip.utils:
https://github.com/pypa/pip/blob/develop/pip/utils/appdirs.py#L13
How about we add a configuration option that decides between the two systems? cache_location, which can be local (<rootdir>/.cache, the default) or system (using the heuristics suggested by @bukzor)? This could be added in 2.9.
If we make a poll and people want to change system to be the default, we can announce that the default will change in 2.10, giving plenty of time for people to change to their desired setting while in 2.9 and not get caught off guard.
BTW, I'm just throwing some ideas so we can reach some consensus, I don't mind having the .cache directory in pytest's rootdir: I configured my git-global-ignore file to always ignore it.
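A rough sketch of what the two `cache_location` modes could resolve to. Everything here is an assumption for illustration: the function names are hypothetical, and the per-platform rules are only loosely modelled on the conventions appdirs-style libraries follow (`%LOCALAPPDATA%` on Windows, `~/Library/Caches` on macOS, `$XDG_CACHE_HOME` or `~/.cache` elsewhere), not on the exact heuristics referred to above.

```python
import os
import sys
from pathlib import Path


def system_cache_dir(appname, platform=sys.platform, environ=None):
    """Hypothetical per-user cache location for `appname`."""
    environ = os.environ if environ is None else environ
    home = Path(os.path.expanduser("~"))
    if platform == "win32":
        base = Path(environ.get("LOCALAPPDATA") or home / "AppData" / "Local")
        return base / appname / "Cache"
    if platform == "darwin":
        return home / "Library" / "Caches" / appname
    # Everything else is treated as XDG-style.
    base = environ.get("XDG_CACHE_HOME") or str(home / ".cache")
    return Path(base) / appname


def cache_dir(cache_location, rootdir, **kwargs):
    """Map the proposed cache_location value to a directory."""
    if cache_location == "local":
        return Path(rootdir) / ".cache"
    if cache_location == "system":
        return system_cache_dir("pytest", **kwargs)
    raise ValueError(f"unknown cache_location: {cache_location!r}")
```

Keeping `local` as the default and treating any other value as an error makes a later switch of the default a one-line change.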
:+1:
For the implementation of system, I'd probably just import and use pip.utils.
Would you all be opposed to a dependency on pip from pytest?
I imagine ~100% of python installations have pip installed already.
Actually, the pip function seems to be closely related to the appdirs package implementation. I'm not sure which is derived from which though.
https://github.com/ActiveState/appdirs/blob/master/appdirs.py#L257
IMHO I would just copy the function over, licensing permitting... depending on the entire pip library just for the sake of a small function like that is not worth it, I think, as I'm not sure how stable that API is.
I'd go for publishing a minimalistic lib that implements it
Hi, Is the recommended solution to add .cache to every project's .gitignore? Or is there a consensus on a solution for this issue? Thanks, A.
@antoche for now yes
There's a cache_dir ini option which is now available in 3.2.
Should we change the default of that option to $XDG_CONFIG_DIR (if defined) in future pytest versions?
The config dir is always the wrong folder for the cache, under all circumstances.
We should use the cache home, according to the spec,
and we should take a look at using the appdirs lib to facilitate those details.
@nicoddemus: No, the default should be $XDG_CACHE_HOME, which in turn has a default of $HOME/.cache.
$XDG_CACHE_HOME defines the base directory relative to which user specific non-essential data files should be stored. If $XDG_CACHE_HOME is either not set or empty, a default equal to $HOME/.cache should be used.
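The rule quoted from the spec can be sketched in a few lines. The function name `xdg_cache_home` is hypothetical. Note that a plain `os.getenv("XDG_CACHE_HOME", default)` does not cover the "set but empty" case, and the same spec also asks implementations to ignore relative paths in these variables.

```python
import os
from pathlib import Path


def xdg_cache_home(environ=None):
    """Resolve the XDG cache base directory for the current user."""
    environ = os.environ if environ is None else environ
    value = environ.get("XDG_CACHE_HOME", "")
    # Unset and empty are treated alike; relative paths are considered
    # invalid by the spec and ignored as well.
    if value and os.path.isabs(value):
        return Path(value)
    return Path(os.path.expanduser("~")) / ".cache"
```

This only covers XDG-style systems; macOS and Windows conventions would still need a library or extra branches, as the next comment points out.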
If you want to support macos and windows in a natural way, you'll want a library for this. I think this is the right solution: https://github.com/ActiveState/appdirs#the-problem
@bukzor thanks for the clarification and correcting my mistake
Thanks @bukzor.
Hmm, just realized that with this change the cache will no longer be per-repository, but will be global per-user. This might be a problem because plugins assume a per-repository cache, for example pytest's --lf option.
@nicoddemus borgbackup for example uses the repository path as an extra path component in the per-user caches in order to keep things apart
I've used that strategy myself with success, but I only care about Unixen. You'll want a proof of concept on Windows, but I don't see why it wouldn't work.
What should it default to then? $XDG_CACHE_HOME/pytest/$pytest_rootdir?
A hash of the path element is required for sanity
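A minimal sketch of the hashed layout, assuming a hypothetical `project_cache_dir` helper and a `<cache home>/pytest/<hash>` structure (the thread does not fix either): hashing the absolute rootdir gives every project its own directory name without embedding path separators or very long paths in the cache tree.

```python
import hashlib
import os
from pathlib import Path


def project_cache_dir(cache_home, rootdir):
    """Hypothetical layout: <cache_home>/pytest/<hash of absolute rootdir>."""
    # Normalise first so "/a/b" and "/a/b/../b" map to the same cache.
    absolute = os.path.abspath(rootdir)
    digest = hashlib.sha256(absolute.encode("utf-8")).hexdigest()[:16]
    return Path(cache_home) / "pytest" / digest
```

The trade-off is that the directory name no longer tells you which project it belongs to, so pruning needs some extra record of the original path.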
Ok.
Not really a fan of this myself though - I'd rather keep it with the repository, similar to .tox dirs.
Only came here via #4270.
Using a hash would not allow pruning only certain/known dirs, for example.
borg appears to use a hash (but you usually have fewer borg caches than pytest ones).
However, the CACHEDIR.TAG file (also used by borg) would be good to have.. (separate issue; haven't searched).
@blueyed we can write accompanying JSON files for metadata and add a prune command.
The CACHEDIR.TAG should totally be added.
In fact we ought to consider adding it even to the .pytest_cache folder, like the readme/gitignore.
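For reference, a sketch of writing such a tag. The helper name `write_cachedir_tag` is hypothetical; the signature line is the fixed one from the Cache Directory Tagging convention, which backup tools look for at the start of the file.

```python
from pathlib import Path

# Fixed first line required by the CACHEDIR.TAG convention.
CACHEDIR_TAG_SIGNATURE = "Signature: 8a477f597d28d172789f06886806bc55"


def write_cachedir_tag(cache_dir):
    """Create CACHEDIR.TAG in cache_dir unless it is already there."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    tag = cache_dir / "CACHEDIR.TAG"
    if not tag.exists():
        tag.write_text(
            CACHEDIR_TAG_SIGNATURE
            + "\n# This directory holds cache data and can be excluded from backups.\n",
            encoding="utf-8",
        )
    return tag
```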
This is now possible because the cache_dir ini option accepts environment variables:
[pytest]
cache_dir=$XDG_CONFIG_DIR
So I believe this can be closed.
In xdg those have implied defaults, and the example you gave would nuke the entire config dir of the user if it was explicitly set while running cache clean
Oh you mean it should have been cache_dir=$XDG_CONFIG_DIR/pytest-cache? I see, bad assumption.
But other than that, what else would a user need to use the $XDG_CONFIG_DIR?
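One way to guard against the failure mode raised above, where a cache-clearing step wipes a directory that holds more than the cache, is to refuse to delete anything that is not marked as a cache. A sketch, assuming a hypothetical `clear_cache` helper and using a CACHEDIR.TAG file as the marker; this is not a description of how pytest clears its cache.

```python
import shutil
from pathlib import Path

CACHEDIR_TAG_SIGNATURE = "Signature: 8a477f597d28d172789f06886806bc55"


def clear_cache(cache_dir):
    """Delete cache_dir only if it is tagged as a cache directory."""
    cache_dir = Path(cache_dir)
    tag = cache_dir / "CACHEDIR.TAG"
    tagged = tag.is_file() and tag.read_text(encoding="utf-8").startswith(
        CACHEDIR_TAG_SIGNATURE
    )
    if not tagged:
        # A misconfigured cache_dir (for example a shared config or cache
        # root) ends up here instead of being removed.
        raise RuntimeError(f"refusing to remove untagged directory: {cache_dir}")
    shutil.rmtree(cache_dir)
```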
Again incorrect, there are implicit defaults in the standard, and the config is NEVER to be used for caches
Additionally there would be confusion about ownership if entirely different projects used the same setup
the config is NEVER to be used for caches
I'm confused... is this issue invalid then?
Additionally there would be confusion about ownership if entirely different projects used the same setup
Oh definitely I was naive on my quick late night reply.
The initial comments had the correct var name; the title was wrong, and I fixed that.
Can we please fix this? It should not be very hard to do the right thing, as most other Python tools do that already: cpython, mypy, pip, pylint, pre-commit are just some examples of tools going to ~/.cache.
TBH, I am surprised that after so many years pytest still refuses to use the official cache directories and pollutes our directory trees with .pytest_cache folders.
os.getenv("XDG_CACHE_HOME", os.path.expanduser("~/.cache"))
AFAIK it is usually a one-liner like https://github.com/ansible-community/ansible-lint/blob/08fae7af24de64c1ed4d583765728c21b38643f4/src/ansiblelint/__main__.py#L112
The example above is using a hash key to isolate the cache directories between projects, based on their folder. It is simple but effective.
Ahh right, good idea.
I don’t think that CPython uses XDG directories for anything, do you have other info @ssbarnea ?
@ssbarnea feel free to invest the time to create a proper storage /eviction scheme, which even the named tools only partially implement
It almost does, PYTHONPYCACHEPREFIX=~/.cache/cpython/ is fixing it.
@ssbarnea this is about pytest cache, which is different from byte code cache
You need at least one cache per test invocation root, but potentially more than one for virtualenvs/python versions
And you need to be able to cleanup potentially independently of the existing folders
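A sketch of how cleanup could work independently of the project folders, along the lines of the metadata files and prune command suggested earlier in the thread. The file name `pytest-cache-meta.json` and both function names are hypothetical: each cache records the rootdir it belongs to, and pruning removes caches whose rootdir no longer exists.

```python
import json
import shutil
from pathlib import Path

METADATA_NAME = "pytest-cache-meta.json"  # hypothetical file name


def record_owner(cache_dir, rootdir):
    """Store which project a cache directory belongs to."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    meta = {"rootdir": str(Path(rootdir).resolve())}
    (cache_dir / METADATA_NAME).write_text(json.dumps(meta), encoding="utf-8")


def prune(cache_base):
    """Remove caches whose recorded rootdir is gone; return what was removed."""
    removed = []
    for entry in sorted(Path(cache_base).iterdir()):
        meta = entry / METADATA_NAME
        if not meta.is_file():
            continue  # not one of ours, leave it alone
        rootdir = json.loads(meta.read_text(encoding="utf-8"))["rootdir"]
        if not Path(rootdir).exists():
            shutil.rmtree(entry)
            removed.append(entry)
    return removed
```

The same metadata file could also carry the Python version or virtualenv path if more than one cache per rootdir is needed; an age-based eviction rule would be a separate addition.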
I see two possible approaches here:
.cache model similar to https://dot-config.github.io/ - basically changing the default to .cache/pytest_cache should be enough for that. At least we avoid being another piece of software that clutters our project root directory with temp stuff.
pytest is the only tool in my python stack which still pollutes my source directories. python, poetry, pipenv, etc. all use ~/.cache/... by default or allow me to configure conveniently via env var.
Forked from #1029
a) A very convenient, and standard, way to configure tools' cache location is to set $XDG_CACHE_HOME. Many of my projects' fixtures do this for other reasons already. Similarly, the most convenient, and standard, way to set "basetmp" would be $TMPDIR.
b) "something a CI system can easily pick up" is exactly these environment variables. Many systems will already support this, and those that don't will support injecting environment variables.
c) "non-containered ci" will either support ~/.cache correctly, or set $XDG_CACHE_HOME, because this is necessary for other tools that use the standard. Again, this is already handled in many of my projects because of other tools.