pytest-dev / pytest

The pytest framework makes it easy to write small tests, yet scales to support complex functional testing
https://pytest.org
MIT License

configure .cache via $XDG_CACHE_DIR #1089

Open bukzor opened 9 years ago

bukzor commented 9 years ago

Forked from #1029


a) A very convenient, and standard, way to configure tools' cache location is to set $XDG_CACHE_DIR. Many of my project's fixtures do this for other reasons already. Similarly, the most convenient, and standard, way to set "basetmp" would be $TMPDIR.

b) "something a CI system can easily pick up" is exactly these environment variables. Many systems will already support this, and those that don't will support injecting environment variables.

c) "non-containered ci" will either support ~/.cache correctly, or set $XDG_CACHE_DIR, because this is necessary for other tools that use the standard. Again, this is already handled in many of my projects because of other tools.

RonnyPfannschmidt commented 9 years ago

is there a good package that helps find those paths and helps clear unused caches?

after all, if the cache folder is outside of the working directory, its lifetime is unrelated to the working directory

imagine, for example, an unaware CI system that makes one folder per build: it could easily DoS the CI server by filling up ~/.cache of the CI user

with the current mode of operation, cache and working directory are simply co-located and share a lifetime, so data management is easy, the cache is gone when one deletes the working directory

i'm not per se opposed to following XDG, but it makes the lifetimes of file-system objects trickier in our case, and it does not seem to be really available on Windows either (my personal impression is that XDG only works well on POSIX boxes with a typical Linux/BSD setup)

bukzor commented 9 years ago

Those are good points...

I don't have a good answer. The applications I've seen don't worry about cleaning up like that. It would work for me if we use the current behavior when XDG_CACHE_DIR doesn't exist.
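
A minimal sketch of that fallback, not pytest's implementation. It reads `XDG_CACHE_HOME`, which is the name the XDG Base Directory specification uses for this variable; the function name `resolve_cache_dir` and the `pytest` subdirectory are assumptions made for illustration.

```python
import os
from pathlib import Path


def resolve_cache_dir(rootdir, environ=os.environ):
    """Use the XDG location only when the variable is set; otherwise keep today's layout."""
    xdg = environ.get("XDG_CACHE_HOME")
    if xdg:
        # Hypothetical layout: a "pytest" subdirectory inside the user's cache home.
        return Path(xdg) / "pytest"
    # Current behavior: the cache sits in the working tree and shares its lifetime.
    return Path(rootdir) / ".cache"
```

With this policy nothing changes for users who never set the variable, so the lifetime concerns above only apply to people who opt in.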

@asottile

bukzor commented 9 years ago

This one at least helps find a good platform-specific cache dir: https://pypi.python.org/pypi/appdirs/1.4.0

bukzor commented 9 years ago

This one helps manage expiry for a file-based cache. Making one of the keys be the working directory would make it work.

http://pythonhosted.org/pyfscache/

RonnyPfannschmidt commented 9 years ago

i briefly investigated appdirs and pyfscache, unfortunately both are unsuitable

appdirs seems bug-ridden, and pyfscache is for something entirely different that does not fit the use case

RonnyPfannschmidt commented 9 years ago

how about introducing a ini value/env variable to select the cache backend, which defaults to worktree, but one can choose xdg/disabled?

@hpk42 opinions?

asottile commented 9 years ago

Personally I'd rather default to xdg instead of needing to add .cache to the gitignore in every one of my projects

RonnyPfannschmidt commented 9 years ago

while i agree that it is a nice default, we can't make a change like that before 3.0

also, before we can do it properly we need to come up with a solid way to clean things up

asottile commented 9 years ago

Maybe introducing the cache plugin defaulted to on should have been in 3.0 too :/

RonnyPfannschmidt commented 9 years ago

since it's a normal feature addition, it doesn't warrant a release that large

s-trooper commented 8 years ago

i hate the ".cache" folder for aesthetic reasons. my project folders are already populated by IDE, git, python and other "support folders". sometimes i have more folders than actual python program files! please make the ".cache" location configurable!

nicoddemus commented 8 years ago

how about introducing a ini value/env variable to select the cache backend

Is anyone against this? While it may not be the best option, it is simple to implement and backward compatible.

It should support variable expansion and default to .cache for backward compatibility. This would allow users to change it like this:

[pytest]
cache_dir=$XDG_CACHE_DIR
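
A minimal sketch of the expansion step for such an ini value, assuming it is done with the standard library's `os.path.expandvars`. The helper name `expand_cache_dir` is hypothetical and this is not pytest's code.

```python
import os


def expand_cache_dir(raw_value):
    """Expand $VAR / ${VAR} references and a leading ~ in a cache_dir setting."""
    # os.path.expandvars leaves unknown variables untouched, so an unset
    # variable yields a literal "$NAME" path component rather than an error.
    return os.path.expanduser(os.path.expandvars(raw_value))
```

Because unset variables survive as literal text, a real implementation would probably want to detect a leftover `$` and fall back to `.cache`.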

Or use an environment variable:

PYTEST_CACHE_DIR=/some/path

RonnyPfannschmidt commented 8 years ago

I'm against the most easy variant, I want a simple solution

bukzor commented 8 years ago

@RonnyPfannschmidt I've found that the function you'd need exists in pip.utils:

https://github.com/pypa/pip/blob/develop/pip/utils/appdirs.py#L13

nicoddemus commented 8 years ago

How about we add a configuration option that decides between the two systems? cache_location, which can be local (<rootdir>/.cache, the default) or system (using the heuristics suggested by @bukzor)? This could be added in 2.9.

If we run a poll and people want to change the default to system, we can announce that the default will change in 2.10, giving people plenty of time to switch to their desired setting while on 2.9 and not get caught off guard.

BTW, I'm just throwing some ideas so we can reach some consensus, I don't mind having the .cache directory in pytest's rootdir: I configured my git-global-ignore file to always ignore it.
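
To make the proposal concrete, here is a rough sketch of how a `cache_location` value could be mapped to a directory. Everything except the two values `local` and `system` is an assumption: the function name, the `system_cache_home` parameter (standing in for whatever the platform heuristics return), and the `pytest` subdirectory.

```python
from pathlib import Path

VALID_LOCATIONS = ("local", "system")


def select_cache_dir(cache_location, rootdir, system_cache_home):
    """Map the proposed cache_location setting to a concrete directory."""
    if cache_location == "local":
        # Backward-compatible default: <rootdir>/.cache
        return Path(rootdir) / ".cache"
    if cache_location == "system":
        # Hypothetical layout under the per-user cache directory.
        return Path(system_cache_home) / "pytest"
    raise ValueError(
        f"cache_location must be one of {VALID_LOCATIONS}, got {cache_location!r}"
    )
```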

RonnyPfannschmidt commented 8 years ago

:+1:

bukzor commented 8 years ago

For the implementation of system, I'd probably just import and use pip.utils. Would you all be opposed to a dependency on pip from pytest? I imagine ~100% of python installations have pip installed already.


bukzor commented 8 years ago

Actually, the pip function seems to be closely related to the appdirs package implementation. I'm not sure which is derived from which though.

https://github.com/ActiveState/appdirs/blob/master/appdirs.py#L257

nicoddemus commented 8 years ago

IMHO I would just copy the function over, licensing permitting... depending on the entire pip library just for the sake of a small function like that is not worth it, I think, as I'm not sure how stable that API is.

RonnyPfannschmidt commented 8 years ago

I'd go for publishing a minimalistic lib that implements it

antoche commented 7 years ago

Hi, Is the recommended solution to add .cache to every project's .gitignore? Or is there a consensus on a solution for this issue? Thanks, A.

RonnyPfannschmidt commented 7 years ago

@antoche for now yes

nicoddemus commented 7 years ago

There's a cache_dir ini option which is now available in 3.2.

Should we change the default of that option to $XDG_CONFIG_DIR (if defined) in future pytest versions?

RonnyPfannschmidt commented 7 years ago

the config dir is under all circumstances always the wrong folder for the cache

we should use the cache home as according to the spec,

and we should take a look at using the appdirs lib to facilitate those details

bukzor commented 7 years ago

@nicoddemus: No, the default should be $XDG_CACHE_DIR, which in turn has a default of $HOME/.cache.

$XDG_CACHE_HOME defines the base directory relative to which user specific non-essential data files should be stored. If $XDG_CACHE_HOME is either not set or empty, a default equal to $HOME/.cache should be used.

If you want to support macos and windows in a natural way, you'll want a library for this. I think this is the right solution: https://github.com/ActiveState/appdirs#the-problem
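
For POSIX systems only, the rule quoted above can be sketched with the standard library. The function name `xdg_cache_home` is hypothetical. Ignoring relative values is an extra step taken here because the specification requires these paths to be absolute. macOS and Windows conventions are not covered, which is where a library helps.

```python
import os
from pathlib import Path


def xdg_cache_home(environ=os.environ):
    """Return $XDG_CACHE_HOME, or ~/.cache when it is unset, empty, or relative."""
    value = environ.get("XDG_CACHE_HOME", "")
    if value and os.path.isabs(value):
        return Path(value)
    # Default from the spec: $HOME/.cache
    return Path(os.path.expanduser("~")) / ".cache"
```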

RonnyPfannschmidt commented 7 years ago

@bukzor thanks for the clarification and correcting my mistake

nicoddemus commented 7 years ago

Thanks @bukzor.

Hmm just realized that with this change the cache will no longer be per-repository, but will be global per-user. This might be a problem because plugins assume a per-repository cache, for example pytest's --lf option.

RonnyPfannschmidt commented 7 years ago

@nicoddemus borgbackup, for example, uses the repository path as an extra path component in the per-user caches in order to keep things apart

bukzor commented 7 years ago

I've used that strategy myself with success, but I only care about Unixen. You'll want a proof of concept on Windows, but I don't see why it wouldn't work.


blueyed commented 6 years ago

What should it default to then? $XDG_CACHE_DIR/pytest/$pytest_rootdir ?

RonnyPfannschmidt commented 6 years ago

A hash of the path element is required for sanity
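
A minimal sketch of such a hashed layout. The function name, the `pytest` subdirectory, the choice of SHA-256, and the 16-character truncation are all assumptions for illustration, not a decided design.

```python
import hashlib
from pathlib import Path


def project_cache_dir(cache_home, rootdir):
    """Derive a per-project cache directory from a hash of the rootdir path."""
    # Resolve first so different spellings of the same directory share one key.
    resolved = str(Path(rootdir).resolve())
    key = hashlib.sha256(resolved.encode("utf-8")).hexdigest()[:16]
    return Path(cache_home) / "pytest" / key
```

The hash keeps arbitrary path characters and lengths out of the directory name, at the cost of names that no longer tell you which project they belong to.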

blueyed commented 6 years ago

Ok.

Not really a fan of this myself though - I'd rather keep it with the repository, similar to .tox dirs. Only came here via #4270.

Using a hash would not allow pruning only certain/known dirs, for example. borg appears to use a hash (but you usually have fewer borg caches than pytest ones). However, the CACHEDIR.TAG file (also used by borg) would be good to have (separate issue; haven't searched).

RonnyPfannschmidt commented 6 years ago

@blueyed we can write accompanying json files that hold the metadata, and add a prune command
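
As a rough illustration of that idea: each hashed cache directory records the rootdir it belongs to, and a prune step removes caches whose rootdir has disappeared. The file name `pytest-cache-meta.json`, the function names, and the "rootdir no longer exists" rule are all assumptions.

```python
import hashlib
import json
import shutil
from pathlib import Path

META_NAME = "pytest-cache-meta.json"  # hypothetical metadata file name


def create_cache(cache_base, rootdir):
    """Create the hashed cache directory and record which rootdir owns it."""
    rootdir = Path(rootdir).resolve()
    key = hashlib.sha256(str(rootdir).encode("utf-8")).hexdigest()[:16]
    cache_dir = Path(cache_base) / key
    cache_dir.mkdir(parents=True, exist_ok=True)
    meta = {"rootdir": str(rootdir)}
    (cache_dir / META_NAME).write_text(json.dumps(meta), encoding="utf-8")
    return cache_dir


def prune(cache_base):
    """Remove caches whose recorded rootdir is gone; return the removed directories."""
    removed = []
    # Materialize the listing before deleting anything underneath it.
    for meta_file in sorted(Path(cache_base).glob(f"*/{META_NAME}")):
        rootdir = json.loads(meta_file.read_text(encoding="utf-8"))["rootdir"]
        if not Path(rootdir).exists():
            shutil.rmtree(meta_file.parent)
            removed.append(meta_file.parent)
    return removed
```

The metadata also addresses the objection that hashed names cannot be pruned selectively, since the original path can be read back from the JSON file.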

the CACHEDIR.TAG should totally be added

in fact we ought to consider adding it even to the .pytest_cache folder like the readme/gitignore
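
Writing the tag is cheap. A sketch, assuming only the Cache Directory Tagging convention that the file is named `CACHEDIR.TAG` and begins with a fixed signature line; the helper name `ensure_cachedir_tag` is hypothetical.

```python
from pathlib import Path

# The signature line is fixed by the convention; the comment line is free-form.
CACHEDIR_TAG_CONTENT = (
    "Signature: 8a477f597d28d172789f06886806bc55\n"
    "# This file is a cache directory tag.\n"
)


def ensure_cachedir_tag(cache_dir):
    """Create CACHEDIR.TAG in cache_dir if it is missing and return its path."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    tag = cache_dir / "CACHEDIR.TAG"
    if not tag.exists():
        tag.write_text(CACHEDIR_TAG_CONTENT, encoding="utf-8")
    return tag
```

Backup and archiving tools that honor the convention can then skip the directory without any per-tool configuration.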

nicoddemus commented 4 years ago

This is now possible because cache_dir ini option accepts environment variables:

[pytest]
cache_dir=$XDG_CONFIG_DIR

So I believe this can be closed.

RonnyPfannschmidt commented 4 years ago

In xdg those have implied defaults, and the example you gave would nuke the entire config dir of the user if it was explicitly set while running cache clean
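
One possible guard against that failure mode, sketched under assumptions: the clear operation refuses to touch a directory that does not carry a marker proving pytest created it. The marker name `.pytest-cache-owner` and the function name `clear_cache` are hypothetical, and this is not how pytest's cache clearing is implemented.

```python
import shutil
from pathlib import Path

OWNER_MARKER = ".pytest-cache-owner"  # hypothetical marker file name


def clear_cache(cache_dir):
    """Delete the contents of cache_dir, but only if it is marked as ours."""
    cache_dir = Path(cache_dir)
    if not (cache_dir / OWNER_MARKER).is_file():
        raise RuntimeError(f"refusing to clear {cache_dir}: no {OWNER_MARKER} marker")
    for child in list(cache_dir.iterdir()):
        if child.name == OWNER_MARKER:
            continue
        if child.is_dir() and not child.is_symlink():
            shutil.rmtree(child)
        else:
            child.unlink()
```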

nicoddemus commented 4 years ago

Oh you mean it should have been cache_dir=$XDG_CONFIG_DIR/pytest-cache? I see, bad assumption.

But other than that, what else would a user need to use the $XDG_CONFIG_DIR?

RonnyPfannschmidt commented 4 years ago

Again incorrect, there are implicit defaults in the standard, and the config is NEVER to be used for caches

Additionally there would be confusion about ownership if entirely different projects used the same setup

nicoddemus commented 4 years ago

the config is NEVER to be used for caches

I'm confused... is this issue invalid then?

Additionally there would be confusion about ownership if entirely different projects used the same setup

Oh definitely I was naive on my quick late night reply.

RonnyPfannschmidt commented 4 years ago

The initial comments had the correct var name, the title was wrong, i fixed that

ssbarnea commented 2 years ago

Can we please fix this? It should not be very hard to do the right thing, as most other python tools do that already: cpython, mypy, pip, pylint, pre-commit are just some examples of tools going to ~/.cache.

TBH, I am surprised that after so many years pytest still refuses to use the official cache directories and pollutes our directory trees with .pytest_cache folders.

os.getenv("XDG_CACHE_HOME", os.path.expanduser("~/.cache"))

AFAIK it is usually a one-liner like https://github.com/ansible-community/ansible-lint/blob/08fae7af24de64c1ed4d583765728c21b38643f4/src/ansiblelint/__main__.py#L112

The example above is using a hash key to isolate the cache directories between projects, based on their folder. It is simple but effective.

nicoddemus commented 2 years ago

The example above is using a hash key to isolate the cache directories between projects, based on their folder. It is simple but effective.

Ahh right, good idea.

merwok commented 2 years ago

I don’t think that CPython uses XDG directories for anything, do you have other info @ssbarnea ?

RonnyPfannschmidt commented 2 years ago

@ssbarnea feel free to invest the time to create a proper storage/eviction scheme, which even the named tools only partially implement

ssbarnea commented 2 years ago

It almost does, PYTHONPYCACHEPREFIX=~/.cache/cpython/ is fixing it.

RonnyPfannschmidt commented 2 years ago

@ssbarnea this is about pytest cache, which is different from byte code cache

You need at least one cache per test invocation root, but potentially more than one for virtualenvs/python versions

And you need to be able to cleanup potentially independently of the existing folders
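
To illustrate the "more than one cache per root" point, a key could combine the rootdir with the interpreter environment. This is a sketch only: the function name and the choice of `sys.prefix` plus major.minor version as the distinguishing parts are assumptions.

```python
import hashlib
import sys
from pathlib import Path


def cache_key(rootdir, prefix=sys.prefix, version=sys.version_info[:2]):
    """Build a key that differs per rootdir, per virtualenv, and per Python version."""
    parts = [
        str(Path(rootdir).resolve()),
        str(prefix),  # differs between virtualenvs
        "{}.{}".format(*version),  # e.g. "3.11"
    ]
    # NUL cannot occur in paths, so it is a safe separator between the parts.
    return hashlib.sha256("\0".join(parts).encode("utf-8")).hexdigest()[:16]
```

Every extra component multiplies the number of cache directories, which makes the independent cleanup requirement more pressing.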

ssbarnea commented 2 years ago

I see two possible approaches here:

feluxe commented 11 months ago

pytest is the only tool in my python stack which still pollutes my source directories. python, poetry, pipenv, etc all use ~/.cache/... by default or allow me to configure conveniently via env var.