cvat-ai / cvat

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
https://cvat.ai
MIT License

How to use cvat-sdk with token #7439

Open liudaolunboluo opened 8 months ago

liudaolunboluo commented 8 months ago

Is your feature request related to a problem? Please describe.

When creating a client in cvat-sdk, you need to enter a username and password. I now have multiple users whose passwords I don't know, but I can get their tokens. Can I use such a token directly with the cvat-sdk client?

Describe the solution you'd like

I would like to use the token directly with the cvat-sdk client.

Describe alternatives you've considered

No response

Additional context

No response

zhiltsov-max commented 8 months ago

Hi, there is no "official way", but please check this answer and this draft PR.

liudaolunboluo commented 8 months ago

> Hi, there is no "official way", but please check this answer and this draft PR.

Are sessionid and csrftoken necessary? In my scenario, I deployed a private CVAT instance myself and integrated it into my own system, and I want to automate dataset import in one of my system's features, so I won't be able to get the cross-domain cookie information from the browser.

zhiltsov-max commented 8 months ago

If you're working as a local admin, you should be able to download annotations from other users. Consider using this approach instead. Would a special service account for these purposes work for you?

liudaolunboluo commented 8 months ago

> If you're working as a local admin, you should be able to download annotations from other users. Consider using this approach instead. Would a special service account for these purposes work for you?

I only want to upload a dataset to a project. I now think I can have a normal user create the project and use the admin user to upload the dataset, because I have the admin user's password. The problem is that the high-level SDK doesn't seem to have an import-only dataset API; it only has a create-project-from-dataset API. What can I do?

liudaolunboluo commented 8 months ago

> If you're working as a local admin, you should be able to download annotations from other users. Consider using this approach instead. Would a special service account for these purposes work for you?

Oh, sorry, I read this manual: https://opencv.github.io/cvat/docs/api_sdk/sdk/highlevel-api/. That's very helpful. It worked: project = client.projects.retrieve(project_id)

liudaolunboluo commented 8 months ago

I'm having a little problem again. My code:

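from cvat_sdk import make_client

with make_client(
    host="my local server url",
    credentials=('admin', 'admin password'),
) as client:
    try:
        project = client.projects.retrieve(project_id)
    except Exception as e:
        print('throw exception:', e)
        raise  # re-raise: `project` would be undefined below otherwise
    # `project_id` and `pbar` are assumed to be defined earlier
    project.import_dataset(format_name='CVAT 1.1', filename='dataset file path', pbar=pbar)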

It throws an exception: HTTP response body: b'{"detail":"CSRF Failed: CSRF token missing or incorrect."}'. I've noticed in some issues that uploading on the web page may hit this problem, and that logging out and back in fixes it. Now I'm hitting the same problem in the SDK. How can I fix it? Thank you very much for your reply!

liudaolunboluo commented 8 months ago

I solved it! It looks like a code issue: the X-Csrftoken header was not set anywhere in the code for importing the dataset, so it reported an error (see https://stackoverflow.com/questions/26639169/csrf-failed-csrf-token-missing-or-incorrect). I've now changed it this way and it no longer reports an error:

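# Parse the cookies the client already holds ("name=value; name2=value2")
pairs = client.api_client.get_common_headers()['Cookie'].split('; ')
# Split on the first '=' only, so values that contain '=' stay intact
dictionary = dict(pair.split('=', 1) for pair in pairs)
# Send the CSRF token in a header as well
client.api_client.set_default_header("X-Csrftoken", dictionary['csrftoken'])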
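The cookie-to-header step in this workaround can be sketched with the standard library alone. This is a minimal illustration, not SDK code: the helper names `parse_cookie_header` and `csrf_header` are hypothetical, and the cookie string is made up.

```python
from typing import Dict


def parse_cookie_header(cookie_header: str) -> Dict[str, str]:
    """Parse a 'name=value; name2=value2' Cookie header into a dict."""
    cookies: Dict[str, str] = {}
    for pair in cookie_header.split(";"):
        # partition() splits on the first '=' only, so '=' inside a value survives
        name, sep, value = pair.strip().partition("=")
        if sep and name:
            cookies[name] = value
    return cookies


def csrf_header(cookie_header: str) -> Dict[str, str]:
    """Build the CSRF header from the csrftoken cookie, or return {} if absent."""
    token = parse_cookie_header(cookie_header).get("csrftoken")
    return {"X-CSRFToken": token} if token else {}


# Made-up cookie string, for illustration only
example = "sessionid=abc123; csrftoken=tok42"
headers = csrf_header(example)
```

With a real client, the resulting header would be applied the same way as in the snippet above, through `client.api_client.set_default_header(name, value)`.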

The core idea is to get the csrftoken from the cookie and set it as the X-CSRFToken header. But now a new error is reported:

  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/cvat_sdk/core/proxies/projects.py", line 57, in import_dataset
    DatasetUploader(self._client).upload_file_and_wait(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/cvat_sdk/core/uploading.py", line 321, in upload_file_and_wait
    rq_id = json.loads(response.data).get("rq_id")
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

The code is: rq_id = json.loads(response.data).get("rq_id"). However, response.data doesn't look like a JSON string but a byte array. The dataset was imported successfully, though.

liudaolunboluo commented 8 months ago

I solved it again, but the SDK API does have problems too. Hopefully they will be fixed soon!

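from pathlib import Path

from cvat_sdk.core.uploading import DatasetUploader

# `client`, `project_id`, `dataset_path` and `pbar` are assumed to be defined earlier
project = client.projects.retrieve(project_id)
filename = Path(dataset_path)
params = {"format": 'CVAT 1.1', "filename": filename.name}
url = client.api_map.make_endpoint_url(project.api.create_dataset_endpoint.path, kwsub={"id": project_id})
# Upload the file directly instead of calling upload_file_and_wait(), which failed while parsing rq_id
response = DatasetUploader(client).upload_file(
    url, filename, pbar=pbar, query_params=params, meta={"filename": params["filename"]}
)

zhiltsov-max commented 8 months ago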

Thank you for reporting the issues.

> But response.data doesn't look like a json string but a byte array

Probably, you're using an older version of CVAT, before the change was introduced. Please make sure SDK and server versions match. The change was introduced in https://github.com/opencv/cvat/pull/5909.
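To illustrate why a version mismatch surfaces as a `JSONDecodeError`, here is a small sketch of tolerant parsing of a response body. It is not part of the SDK: `extract_rq_id` is a hypothetical helper, and it assumes only what the traceback shows, namely that the SDK expects a JSON object with an `rq_id` key.

```python
import json
from typing import Optional, Union


def extract_rq_id(body: Union[bytes, str]) -> Optional[str]:
    """Return rq_id from a JSON response body, or None if the body is empty or not JSON."""
    if isinstance(body, bytes):
        body = body.decode("utf-8", errors="replace")
    if not body.strip():
        # An empty body is what makes json.loads() fail at "line 1 column 1 (char 0)"
        return None
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    if isinstance(payload, dict):
        return payload.get("rq_id")
    return None
```

A `None` result here would be a hint that the server does not return the field the SDK expects, so aligning the SDK and server versions remains the actual fix.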

liudaolunboluo commented 8 months ago

> Thank you for reporting the issues.
>
> > But response.data doesn't look like a json string but a byte array
>
> Probably, you're using an older version of CVAT, before the change was introduced. Please make sure SDK and server versions match. The change was introduced in #5909.

I use SDK v2.3.0 and the import_dataset method, but it throws: raise tus_uploader.TusCommunicationError( tusclient.exceptions.TusCommunicationError: Attempt to retrieve offset failed with status 200

Abo-Omar-74 commented 7 months ago

I am looking forward to contributing to solving this issue through GSOC 24 and I have some questions regarding the scope of this project.

  • Scope Clarification:

    • According to the documentation, there is already a way of Authentication using tokens. Does this mean that the scope of this problem is to make the process of creating these tokens easier with the UI?
  • Integration with auth_api:

    • Do I need to add a new method to auth_api for token generation, or utilize the existing login method since it returns an access token?
  • Display of Access Tokens:

    • Should the access token be directly visible in the account settings, allowing users to copy and manually include it in the request headers using api_client.set_default_header("Authorization", "Token " + {generated token})?
    • Alternatively, should the UI indicate successful addition of the access token and automate its inclusion in environment variables for seamless integration into request headers?
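The header-based approach mentioned in the questions above can be sketched as follows. This is only an illustration under assumptions: `build_token_headers` and `headers_from_env` are hypothetical helpers, and `CVAT_API_KEY` is an assumed environment variable name, not something the SDK reads by itself.

```python
import os
from typing import Dict, Mapping, Optional

API_KEY_VAR = "CVAT_API_KEY"  # assumed variable name, for illustration


def build_token_headers(token: str, csrftoken: Optional[str] = None) -> Dict[str, str]:
    """Build the headers for token authentication, optionally with a CSRF token."""
    if not token:
        raise ValueError("token must be a non-empty string")
    headers = {"Authorization": f"Token {token}"}
    if csrftoken:
        headers["X-CSRFToken"] = csrftoken
    return headers


def headers_from_env(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read the token from the environment; raises KeyError if it is not set."""
    return build_token_headers(env[API_KEY_VAR])
```

Each resulting pair could then be passed to `api_client.set_default_header(name, value)`, as in the call quoted above.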

I would greatly appreciate your guidance in clarifying this. Your expertise would be incredibly valuable. I truly appreciate your support.

@zhiltsov-max @SpecLad @azhavoro

zhiltsov-max commented 7 months ago

@Abo-Omar-74,

Hi, thank you for reaching out to us about this topic. Overall, the scope for GSoC can vary, depending on your skills and interests. This is a complex task that can be split into several elements:

  1. SDK / CLI update to support persistent login. Can be done with any type of keys. Basically, it means integrating this code snippet into our current CLI:

This code allows to record user credentials in the user profile directory. Credentials for each visited CVAT host are recorded, to allow future visits without explicit authentication from the user. This is similar to what you would find in AWS s3 CLI or Google Cloud Storage's CLI.

```python
from __future__ import annotations

import json
import os
from argparse import ArgumentParser
from functools import partial
from pathlib import Path
from types import SimpleNamespace
from typing import Optional, Tuple

import attrs
import cvat_sdk

API_KEY_VAR = "CVAT_API_KEY"
API_SESSIONID_VAR = "CVAT_API_SESSIONID"
API_CSRFTOKEN_VAR = "CVAT_API_CSRFTOKEN"


def add_cli_parser_args(parser: ArgumentParser) -> ArgumentParser:
    parser.add_argument("--org")
    parser.add_argument("--host", default="https://app.cvat.ai")
    parser.add_argument("--port")
    parser.add_argument(
        "--login",
        help=f"A 'login:password' pair. "
        f"Default: use {API_KEY_VAR}, {API_SESSIONID_VAR}, {API_CSRFTOKEN_VAR} env vars",
    )
    parser.add_argument("--profile-dir", default=None, type=Path, help="User profile dir")
    return parser


DEFAULT_PROFILE_FILENAME = "profile.json"
DEFAULT_PROFILE_DIR = "~/cvat/"


@attrs.define
class UserCredentials:
    token: str
    sessionid: str
    csrftoken: str


@attrs.define
class UserProfile:
    _PROFILE_UMASK = 0o600

    @classmethod
    def get_default_path(cls) -> Path:
        return Path(DEFAULT_PROFILE_DIR).expanduser() / DEFAULT_PROFILE_FILENAME

    path: Optional[Path] = attrs.field(factory=partial(get_default_path.__func__, __build_class__))
    credentials: dict[str, UserCredentials] = attrs.field(factory=dict)
    extra: dict[str, str] = attrs.field(factory=dict)

    @classmethod
    def parse(cls, path: Path) -> UserProfile:
        data = json.loads(path.read_text())
        return UserProfile(
            credentials={k: UserCredentials(**v) for k, v in data["credentials"].items()},
            extra=data.get("extra"),
        )

    @classmethod
    def load(cls, path: Optional[Path] = None) -> UserProfile:
        path = path or cls.get_default_path()
        if path.is_file():
            if (mode := path.stat().st_mode) & cls._PROFILE_UMASK != cls._PROFILE_UMASK:
                raise Exception(f"Invalid profile mode. Expected 600 (rw--), got {oct(mode)}")

            profile = cls.parse(path)
        else:
            profile = UserProfile(path=path)
            profile.save(path)

        return profile

    def save(self, path: Optional[Path] = None):
        path = path or self.path or self.get_default_path()

        data = json.dumps(
            attrs.asdict(self, filter=lambda a, v: a.name != "path", recurse=True), indent=2
        )

        if path.absolute() == self.get_default_path().absolute() and not path.parent.is_dir():
            path.parent.mkdir(mode=0o700, parents=True, exist_ok=True)

        with path.open("w", encoding="utf-8") as f:
            f.write(data)

        path.chmod(self._PROFILE_UMASK)

    def update_credentials(self, host: str, *, token: str, sessionid: str, csrftoken: str):
        self.credentials[self._make_host_key(host)] = UserCredentials(
            token=token, sessionid=sessionid, csrftoken=csrftoken
        )

    def has_credentials(self, host: str) -> bool:
        return self._make_host_key(host) in self.credentials

    def get_credentials(self, host: str) -> UserCredentials:
        return self.credentials[self._make_host_key(host)]

    def _make_host_key(self, host: str) -> str:
        return host.split("://", maxsplit=1)[-1]


class UserClient(cvat_sdk.Client):
    def __init__(self, *args, profile: Optional[UserProfile] = None, **kwargs) -> None:
        super().__init__(*args, **kwargs)

        self.profile = profile or UserProfile.load()

    def login(self, credentials: Tuple[str, str]) -> None:
        super().login(credentials)
        self.save_current_credentials()

    def load_credentials(self):
        credentials = self.profile.get_credentials(self.api_map.host)
        self.api_client.set_default_header("Authorization", f"Token {credentials.token}")
        self.api_client.cookies["sessionid"] = credentials.sessionid
        self.api_client.cookies["csrftoken"] = credentials.csrftoken
        self.api_client.set_default_header("X-Csrftoken", credentials.csrftoken)

    def save_current_credentials(self):
        self.profile.update_credentials(
            host=self.api_map.host,
            token=self.api_client.default_headers["Authorization"].split("Token ")[-1],
            sessionid=self.api_client.cookies["sessionid"].value,
            csrftoken=self.api_client.cookies["csrftoken"].value,
        )
        self.profile.save()


def make_client_from_cli(parsed_args: SimpleNamespace) -> UserClient:
    profile = UserProfile.load(parsed_args.profile_dir)

    host = parsed_args.host
    port = parsed_args.port

    url = host.rstrip("/")
    if port:
        url = f"{url}:{port}"

    with UserClient(url, profile=profile) as client:
        if parsed_args.org:
            client.organization_slug = parsed_args.org

        if parsed_args.login:
            client.login(parsed_args.login.split(":", maxsplit=1))
        elif api_key := os.getenv(API_KEY_VAR):
            client.api_client.set_default_header("Authorization", f"Token {api_key}")
            client.api_client.cookies["sessionid"] = os.getenv(API_SESSIONID_VAR)
            client.api_client.cookies["csrftoken"] = os.getenv(API_CSRFTOKEN_VAR)
            client.api_client.set_default_header("X-Csrftoken", os.getenv(API_CSRFTOKEN_VAR))
            client.save_current_credentials()
        elif profile.has_credentials(client.api_map.host):
            client.load_credentials()

        return client
```

I can see it can be extended with an auth CLI command, that allows just to login on a host and store tokens locally for further use. It can also be extended with a command to remove any of the recorded tokens from the local profile.

  2. Server updates to support manageable API token generation for a user. Currently, CVAT also uses tokens, but the difference is that these new tokens could be manageable - i.e. can be created, revoked, have expiration time etc. Current tokens are mostly for the UI to work, so are obtained after the login.
  3. UI updates to support API token management options in the personal account page (basically, requires to add such page first, as there is no such section in CVAT yet). Probably, can work similarly to what's on GitHub.
  4. SDK / CLI updates to support API tokens for auth - similar to the existing login/password pair
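As a rough illustration of what "manageable" could mean for the server-side item in this list (tokens that can be created, revoked, and expire), here is a stdlib-only sketch. It is purely hypothetical: `ApiToken` and `create_token` are made-up names and say nothing about how CVAT stores or validates tokens.

```python
import secrets
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class ApiToken:
    owner: str
    expires_at: Optional[datetime] = None  # None means the token never expires
    revoked: bool = False
    # Random URL-safe secret generated at creation time
    value: str = field(default_factory=lambda: secrets.token_urlsafe(32))

    def is_valid(self, now: Optional[datetime] = None) -> bool:
        """A token is valid if it is not revoked and has not expired."""
        now = now or datetime.now(timezone.utc)
        if self.revoked:
            return False
        return self.expires_at is None or now < self.expires_at


def create_token(owner: str, lifetime: Optional[timedelta] = None) -> ApiToken:
    """Create a token, optionally limited to the given lifetime."""
    expires_at = None
    if lifetime is not None:
        expires_at = datetime.now(timezone.utc) + lifetime
    return ApiToken(owner=owner, expires_at=expires_at)
```

A real implementation would also need to store only a hash of the token value and tie validation into the server's authentication layer; both are out of scope for this sketch.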

Basically, I'd propose to start with the first task from this list. It already has a PoC implementation, doesn't require too many changes, and just needs to be productized - i.e. with tests and convenient user interface.
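One of the extensions mentioned above, a command that removes recorded tokens from the local profile, might look roughly like this. It is a stdlib-only sketch under assumptions: it uses a simplified `{"credentials": {...}}` JSON layout keyed by host, and `save_profile` and `remove_credentials` are hypothetical helpers, not part of the PoC or the SDK.

```python
import json
from pathlib import Path
from typing import Dict


def host_key(host: str) -> str:
    """Drop the URL scheme so 'https://host' and 'host' map to the same key."""
    return host.split("://", maxsplit=1)[-1]


def save_profile(path: Path, credentials: Dict[str, dict]) -> None:
    """Write the profile and restrict it to the owner (mode 600)."""
    path.parent.mkdir(mode=0o700, parents=True, exist_ok=True)
    path.write_text(json.dumps({"credentials": credentials}, indent=2), encoding="utf-8")
    path.chmod(0o600)


def remove_credentials(path: Path, host: str) -> bool:
    """Remove stored credentials for a host. Returns True if something was removed."""
    data = json.loads(path.read_text(encoding="utf-8"))
    credentials = data.get("credentials", {})
    removed = credentials.pop(host_key(host), None) is not None
    if removed:
        save_profile(path, credentials)
    return removed
```

A CLI subcommand (for example a hypothetical `auth logout --host ...`) could call `remove_credentials` with the profile path and report whether anything was deleted.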

ritikraj26 commented 7 months ago

Hi, I'm interested in contributing to this project. My understanding is that the goal is to implement a system within user profiles that allows for the generation of API access tokens. Users should be able to store these tokens locally for persistent authorization, similar to how GitHub handles personal access tokens. Can you confirm if this understanding is correct?

@zhiltsov-max @nmanovic

zhiltsov-max commented 7 months ago

@ritikraj26, hi! Yes, your understanding is correct.

ritikraj26 commented 7 months ago

Currently, the user is obtaining the auth token using the username and password. We need to just replace the current method with the API access tokens generated by the user. The rest of the flow remains the same. Only the method to obtain the auth token is updated? @zhiltsov-max

ritikraj26 commented 7 months ago

@zhiltsov-max Where can I connect with you? Would you be open to some discussion? I am keenly interested in contributing to this project.

zhiltsov-max commented 6 months ago

@ritikraj26, I think it's best to be discussed here for others to see and participate, unless you're going to send your GSoC proposal. Proposals need to be sent via the GSoC site.

> The rest of the flow remains the same. Only the method to obtain the auth token is updated?

Basically, yes, but "the rest of the flow" needs clarification. If you meant point 1 from the comment above, then I think it's quite close to what's expected from the token use point of view. But tokens can also be managed in CVAT, as discussed in that comment.

ritikraj26 commented 2 days ago

Hi, I recently switched from Ubuntu to a Mac, and I am having some trouble setting up.

[Screenshot 2024-10-25 at 6 13 38 PM]

Some help, please. I am looking forward to completing this project.

azhavoro commented 2 days ago

> Hi, I recently switched from Ubuntu to a Mac, and I am having some trouble setting up.
>
> [Screenshot 2024-10-25 at 6 13 38 PM]
>
> Some help, please. I am looking forward to completing this project.

https://stackoverflow.com/questions/69818376/localhost5000-unavailable-in-macos-v12-monterey

ritikraj26 commented 14 hours ago

> > Hi, I recently switched from Ubuntu to a Mac, and I am having some trouble setting up.
> >
> > [Screenshot 2024-10-25 at 6 13 38 PM]
> >
> > Some help, please. I am looking forward to completing this project.
>
> https://stackoverflow.com/questions/69818376/localhost5000-unavailable-in-macos-v12-monterey

This issue is resolved. However, I am still not able to see any requests on the debug server.

[Screenshot 2024-10-27 at 4 02 43 PM] [Screenshot 2024-10-27 at 4 03 20 PM]