huggingface / huggingface_hub

The official Python client for the Huggingface Hub.
https://huggingface.co/docs/huggingface_hub
Apache License 2.0

Support multiple tokens locally #2446

Closed Wauplin closed 1 month ago

Wauplin commented 2 months ago

Originally from @osanseviero on slack (private)

With the big push for fine-grained tokens, I'm wondering if we should have a way for users to have multiple tokens easily in their environment and switch between them rather than overriding each time

Originally from @julien-c in reply

aws-cli has something like this

we could add a flag for --profile token_name or --token-name "My token to list private models"

Originally from @osanseviero as well

On my side I stopped using Colab secret management entirely as it doesn't make sense in the fine-grained tokens world. I'm again signing in each time with a new key


We would need:

Google Colab secret management is a separate topic but still relevant.

lappemic commented 2 months ago

Hey @Wauplin, I would love to work on this one! To be aligned, this would be my approach:

Does this sound correct in general?

Looking forward to your feedback and to finally having time to work on another hf issue!

Wauplin commented 1 month ago

Hi @lappemic, nice to hear back from you :) Thanks for the suggestions. Unfortunately we've decided internally to move forward on it with @hanouticelina, who joined us last week and will maintain core parts of the library. She's already done some comparison work with aws-cli, gh, etc. and has started to work on a PR. Maybe @hanouticelina you could share here or on a first PR the direction you are taking? (once you have something consolidated).

@lappemic in the meantime, would you be interested in working on the huggingface-cli delete-cache command instead? It's a command-line tool to clean your HF cache locally that was implemented ~18 months ago. It has become quite practical in the ecosystem but can still be improved. We've got great feedback in https://github.com/huggingface/huggingface_hub/issues/1997 and also discussed improvements in https://github.com/huggingface/huggingface_hub/issues/1065. The goal is not to change everything but to add a few improvements to save users some time when they clean up their cache. Typically:

Would you like to look into it and work on some improvements? Ofc, not everything has to be done at once. If yes, let's continue the discussion in #1997 :)

hanouticelina commented 1 month ago

Hi 👋 As discussed internally, we drafted a first version of the CLI usage documentation. The idea is to keep using HF_TOKEN and HF_TOKEN_PATH (i.e. ~/.cache/huggingface/token) when switching between tokens.

Usage

Login

huggingface-cli login [--token TOKEN] [--profile PROFILE_NAME]

List Profiles

huggingface-cli auth list

Logout

huggingface-cli logout [--profile PROFILE_NAME] [--all]

Switching the current active profile

huggingface-cli auth switch PROFILE_NAME
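
To make the proposed command surface concrete, here is a minimal `argparse` sketch that mirrors the four commands above. It is only an illustration of the proposed usage, not the actual `huggingface-cli` implementation; `build_parser` and the `dest` names are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the proposed login/logout/auth commands."""
    parser = argparse.ArgumentParser(prog="huggingface-cli")
    commands = parser.add_subparsers(dest="command", required=True)

    # huggingface-cli login [--token TOKEN] [--profile PROFILE_NAME]
    login = commands.add_parser("login")
    login.add_argument("--token")
    login.add_argument("--profile")

    # huggingface-cli logout [--profile PROFILE_NAME] [--all]
    logout = commands.add_parser("logout")
    logout.add_argument("--profile")
    logout.add_argument("--all", action="store_true")

    # huggingface-cli auth list / huggingface-cli auth switch PROFILE_NAME
    auth = commands.add_parser("auth")
    auth_commands = auth.add_subparsers(dest="auth_command", required=True)
    auth_commands.add_parser("list")
    switch = auth_commands.add_parser("switch")
    switch.add_argument("profile")

    return parser


args = build_parser().parse_args(["auth", "switch", "profile1"])
print(args.command, args.auth_command, args.profile)  # auth switch profile1
```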

How it Works

Token storage

Tokens will be stored in ~/.cache/huggingface/profiles (or the path given by the HF_PROFILES_PATH env variable, if set), which is an INI-format file:

[default] 
hf_token = hf_XXXXXXX 
[profile1] 
hf_token = hf_YYYYYYY
[profile2] 
hf_token = hf_ZZZZZZZ
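
A file in this layout can be read and written with Python's standard `configparser`. The sketch below is one possible way to do it, not the code from the PR: the helper names `save_profile_token` and `load_profile_token` are hypothetical, and it writes to a temporary directory rather than the real cache. Note that a lowercase `[default]` section is an ordinary section for `configparser`; only the uppercase `DEFAULT` name is special.

```python
import configparser
import tempfile
from pathlib import Path


def save_profile_token(profiles_path: Path, profile_name: str, token: str) -> None:
    """Create `profile_name` in the INI file, or update its token if it exists."""
    config = configparser.ConfigParser(interpolation=None)
    config.read(profiles_path)  # a missing file is silently skipped
    if not config.has_section(profile_name):
        config.add_section(profile_name)
    config.set(profile_name, "hf_token", token)
    profiles_path.parent.mkdir(parents=True, exist_ok=True)
    with profiles_path.open("w") as f:
        config.write(f)


def load_profile_token(profiles_path: Path, profile_name: str) -> str:
    """Return the token stored for `profile_name`, or raise KeyError."""
    config = configparser.ConfigParser(interpolation=None)
    config.read(profiles_path)
    if not config.has_option(profile_name, "hf_token"):
        raise KeyError(f"Profile '{profile_name}' not found in {profiles_path}")
    return config.get(profile_name, "hf_token")


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "profiles"
    save_profile_token(path, "default", "hf_XXXXXXX")
    save_profile_token(path, "profile1", "hf_YYYYYYY")
    print(load_profile_token(path, "profile1"))  # hf_YYYYYYY
```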

Token Selection Priority

We will keep the same token retrieval priority order

  1. HF_TOKEN env variable.
  2. Token file ~/.cache/huggingface/token. The token stored in this file will be overwritten when switching between profiles.
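
The two-step priority above can be sketched as a small lookup function. This is a simplified illustration covering only the two sources listed here; the library's own token retrieval may consult additional sources. `resolve_token` is a hypothetical name, and the environment is passed in as a mapping so the example does not touch the real `HF_TOKEN`.

```python
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def resolve_token(env: Mapping[str, str], token_path: Path) -> Optional[str]:
    """Return the HF_TOKEN value if set, else the token file content, else None."""
    token = env.get("HF_TOKEN", "").strip()
    if token:
        return token
    if token_path.is_file():
        return token_path.read_text().strip() or None
    return None


with tempfile.TemporaryDirectory() as tmp:
    token_file = Path(tmp) / "token"
    token_file.write_text("hf_from_file\n")
    print(resolve_token({"HF_TOKEN": "hf_from_env"}, token_file))  # hf_from_env
    print(resolve_token({}, token_file))  # hf_from_file
```

In real use one would pass `os.environ` as `env`. Because the environment variable wins, switching profiles only changes the effective token when `HF_TOKEN` is unset.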

Implementation Plan

I already started some experimentation locally with the following implementation (it still needs to be refined and discussed in the PR):

```python
from pathlib import Path

from huggingface_hub import constants


def _save_token_to_profiles(token: str, profile_name: str) -> None:
    """
    Save the token to the profile.

    Parses the INI file. If the profile does not exist, it is created.
    Otherwise, its token is updated.
    """
    profiles_path = Path(constants.HF_PROFILES_PATH)
    ...


def _get_token_from_profiles(profile_name: str) -> str:
    """
    Get the token from the profile.

    Parses the INI file and returns the token.
    Raises an error if the token is not found.
    """
    ...


def _set_active_profile(profile_name: str) -> None:
    token = _get_token_from_profiles(profile_name)
    # Write token to HF_TOKEN_PATH
    path = Path(constants.HF_TOKEN_PATH)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(token)
    print(f"Your token has been saved to {constants.HF_TOKEN_PATH}")
```
Other function helpers will be added to write/parse the profiles file.
- [ ] Update logout() to add a `profile` and an `all` boolean parameter:
```python
import warnings
from typing import Optional

from huggingface_hub import get_token


def logout(profile_name: Optional[str] = None, all: bool = False) -> None:
    profiles = ...  # Read profiles from file
    profile_name = profile_name or "default"
    if all:
        # Delete profiles file and token file
        return

    if profile_name in profiles:
        # Remove profile from profiles file.
        # If the active token matches this profile's token, delete it too
        # and warn the user that the active token will be deleted.
        if profiles[profile_name] == get_token():
            warnings.warn(f"Active profile '{profile_name}' will be deleted.")
        print(f"Profile '{profile_name}' has been deleted.")
    else:
        raise ValueError(f"Profile '{profile_name}' does not exist.")
```

Wauplin commented 1 month ago

Logout

If --profile is not specified, logs out from the default one.

I'd say, if --profile is not specified, logs out from the current one.
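
One way to sketch "log out from the current one": if no profile is given, look up which stored profile holds the active token and target that one. This assumes the current profile is identified by matching tokens, which the thread does not specify; `find_active_profile` and `logout_target` are hypothetical names.

```python
from typing import Dict, Optional


def find_active_profile(profiles: Dict[str, str], active_token: Optional[str]) -> Optional[str]:
    """Return the name of the first profile whose token equals the active token."""
    if active_token is None:
        return None
    for name, token in profiles.items():
        if token == active_token:
            return name
    return None


def logout_target(
    profile_name: Optional[str], profiles: Dict[str, str], active_token: Optional[str]
) -> Optional[str]:
    """An explicit --profile wins; otherwise fall back to the current profile."""
    if profile_name is not None:
        return profile_name
    return find_active_profile(profiles, active_token)


profiles = {"default": "hf_XXXXXXX", "profile1": "hf_YYYYYYY"}
print(logout_target(None, profiles, "hf_YYYYYYY"))  # profile1
print(logout_target("default", profiles, "hf_YYYYYYY"))  # default
```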

Wauplin commented 1 month ago

Otherwise code looks good but next time don't hesitate to open the PR directly even if the code is not complete (draft PRs exist for that^^). It makes it easier to try things and suggest small changes.