Closed bdklahn closed 5 months ago
Hi Brian, Thanks for reporting this, which definitely should be documented.
If you set the environment variable XDG_DATA_HOME
to point to your custom shared location, user_data_dir()
should notice and use it instead of the OS-specific default path. A subdir named hostile
will be created containing downloaded indexes. I've just verified this, and it seems to work on Linux, but not MacOS (?!), meaning I might need to report an issue to the platformdirs maintainer. But unless using MacOS, this should work. Please let me know how you get on.
The other option is to download the indexes and directly reference them using paths supplied to --index
, which is more clunky.
Hi Brian, Thanks for reporting this, which definitely should be documented.
If you set the environment variable
XDG_DATA_HOME
to point to your custom shared location,user_data_dir()
should notice and use it instead of the OS-specific default path. A subdir namedhostile
will be created containing downloaded indexes. I've just verified this, and it seems to work on Linux, but not MacOS (?!), meaning I might need to report an issue to the platformdirs maintainer. But unless using MacOS, this should work. Please let me know how you get on.The other option is to download the indexes and directly reference them using paths supplied to
--index
, which is more clunky.
Hi Bede,
Thanks for the response. My first thought was to maybe set via os.environ (or export . . . before running). But then I thought, since I've been testing/trying via the Python API (since our tools wrapper is written in Python), maybe I can just import and set via the hostile.utils namespace(?). If possible, that ought to bypass any platformutils calls, right? It doesn't really matter for me, since our server is Linux (not OSX), but . . .
I can play with it in a conda env, and report back. Thanks.
Ah, you're using the Python API, nice. In that case I would suggest setting the environment variable in Python:
import os
os.environ["XDG_DATA_HOME"] = "/data"
If for some reason this doesn't work, I can implement a direct override using e.g. a function argument.
Ah, you're using the Python API, nice. In that case I would suggest setting the environment variable in Python:
import os os.environ["XDG_DATA_HOME"] = "/data"
If for some reason this doesn't work, I can implement a direct override using e.g. a function argument.
Heh. That was my first thought (and what I meant, by "override the XDG_DATA_DIR"), but I seemed to talk myself out of it, thinking that was unnecessarily going out into the calling environment, only to allow loading back in. But maybe that ends up being identical (as part of any memory the Python instance already has loaded). At least it should be simple. Thanks.
I know where you're coming from! Environment variables are not especially pretty. If for some reason this doesn't work properly, let me know and we can sort it.
In the next release, the Right Way to do with will be to use the new HOSTILE_CACHE_DIR
environment variable which should work cross-platform. On Linux, XDG_DATA_HOME
will still be respected so long as HOSTILE_CACHE_DIR
is unset.
https://github.com/bede/hostile/commit/57be277e89204ef8e63ceeaa704d13a82da82dec
In the next release, the Right Way to do with will be to use the new
HOSTILE_CACHE_DIR
environment variable which should work cross-platform. On Linux,XDG_DATA_HOME
will still be respected so long asHOSTILE_CACHE_DIR
is unset.
Nice! Thanks!
BTW, I think os.environ is just a dictionary. So, I think, you should just be able to just use the second argument of get
to set the default fallback, rather than using if
.
setdefault
is another useful one.
It doesn't look like setting . . .
os.environ['XDG_DATA_HOME']=. . .
. . . works for the Python API, for me.
My script reports the correct (overridden) value of . . .
f"os.environ['XDG_DATA_HOME']=}"
. . . but . . .
15:18:25 INFO: Saved human index (~/.local/share/hostile/human-t2t-hla)
Anyway, just thought I'd follow-up and let you know that.
Thanks.
Thank you for this and sorry for delay, I've been travelling this week. Which OS and Python version are you using? I'll look into it.
On Thu, 14 Mar 2024 at 20:22, Brian Klahn @.***> wrote:
It doesn't look like setting . . .
os.environ['XDG_DATA_HOME']=. . .
. . . works for the Python API, for me.
My script reports the correct (overridden) value of . . .
f"os.environ['XDG_DATA_HOME']=}"
. . . but . . . 15:18:25 INFO: Saved human index (~/.local/share/hostile/human-t2t-hla)
Anyway, just thought I'd follow-up and let you know that.
Thanks.
— Reply to this email directly, view it on GitHub https://github.com/bede/hostile/issues/32#issuecomment-1998386004, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHWAABIEHB5HWIDBREI5FLYYIBIZAVCNFSM6AAAAABEF6IJUWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSOJYGM4DMMBQGQ . You are receiving this because you commented.Message ID: @.***>
Thank you for this and sorry for delay, I've been travelling this week. Which OS and Python version are you using? I'll look into it.
No worries. I've been traveling some, lately, as well.
GNU/Linux with Python 3.10 We're in a Singularity container, within a Conda env. Looks like that is CentOS 6.10 (on a CentOS 6 host, I believe). The Conda env appears to be running Python 3.10.13. Thanks.
Hi Brian,
Thank you for your patience with this, I'm home from a second trip now – an unusual month for me. I'm afraid that I am not able to replicate your issue with an Ubuntu 22 VM, and testing the latest unreleased commit mentioned earlier allows the cache location to be overriden either using the legacy now-understood-to-be-linux-only $XDG_DATA_HOME
, or the replacement and cross-platform $HOSTILE_CACHE_DIR
If it's possible for you to follow the instructions in the readme to create a development build with the latest changes, I would be very interested to know if things are working for you. Nevertheless I will probably create a new release soon.
This change has been released in 1.1.0
This change has been released in
1.1.0
I'll just try it out after our service container is updated to the new package. You can close this, if you want. I'll just open another issue, if necessary.
Also, I don't think its necessary/appropriate to transfer the minor version, when bumping the major.
Also, I don't know if you knew a Python dict get method has a default parameter. No need to implement if else logic manually. https://docs.python.org/3.11/library/stdtypes.html#dict.get
Thanks for the package update!
Also, I don't think its necessary/appropriate to transfer the minor version, when bumping the major.
You are mistaken – I neither bumped the major version nor 'transferred' the minor version in this release.
Also, I don't know if you knew a Python dict get method has a default parameter. No need to implement if else logic manually. https://docs.python.org/3.11/library/stdtypes.html#dict.get
Yes, and Hostile relies on it in places, feel free to submit a PR if you can simplify it without failing tests.
Thanks for the package update!
A pleasure! Feel free to reopen etc. if the problem persists.
You are mistaken – I neither bumped the major version nor 'transferred' the minor version in this release.
I guess I must be. For some reason I had the version pinned to 0.1.0 in our environment.yml file (which I thought you had bumped to 1.1.0). Maybe typo on my part, or I had come across that on an old conda web page. Sorry. Thanks.
No problem. Sometimes conda solvers install an ancient package version in trying to satisfy pinned dependency versions, so perhaps something along these lines happened. Do let me know if you spot this happening again.
I'm working on integrating this into a service. If I want to cache the index files to a non-user-specific shared location (e.g.), would the best way to do that be to maybe override the XDG_DATA_DIR?
https://github.com/bede/hostile/blob/82b3a5db22880f5ba11254a15ea846f0f5287427/src/hostile/util.py#L32
I was tracing through the code, and it looks like this isn't exposed as a parameter, like the output dir is. But am I missing something?
Thanks!