ytsaurus / ytsaurus-k8s-operator

Kubernetes operator for YTsaurus.
https://ytsaurus.tech

Support new args for the init_queue_agent_state executable #350

Closed. l0kix2 closed this issue 3 weeks ago

l0kix2 commented 1 month ago
 job/yt-queue-agent-init-job-qa-state
++ export YT_DRIVER_CONFIG_PATH=/config/client.yson
++ YT_DRIVER_CONFIG_PATH=/config/client.yson
+++ /usr/bin/ytserver-all --version
+++ head -c4
++ export YTSAURUS_VERSION=24.1
++ YTSAURUS_VERSION=24.1
++ [[ -f /usr/bin/init_queue_agent_state ]]
++ /usr/bin/init_queue_agent_state --create-registration-table --create-replicated-table-mapping-table --recursive --ignore-existing --proxy http-proxies.nebius-alan.svc.testy.k8s.nebius.yt
usage: init_queue_agent_state [-h] [--proxy PROXY] [--root ROOT]
                              [--override-tablet-cell-bundle OVERRIDE_TABLET_CELL_BUNDLE]
                              [--shard-count SHARD_COUNT] [--force]
                              (--target-version TARGET_VERSION | --latest)
init_queue_agent_state: error: one of the arguments --target-version --latest is required

achulkov2 commented 1 month ago

We need to (easy):

We need to (more complicated):

achulkov2 commented 1 month ago

cc: @ItIsApachee @nadya002

ItIsApachee commented 1 month ago

cc @savnadya

l0kix2 commented 3 weeks ago

Also, it may be a good idea for all scripts to have reasonable defaults, so that in ytop we could call them without options (passing only the proxy). That way we won't end up in such situations.
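
A minimal sketch of what "reasonable defaults" could look like, assuming an argparse-based CLI. The parser below is hypothetical and is not the actual init_queue_agent_state source; only the flag names --proxy, --target-version and --latest come from the usage output above, and the integer type of --target-version is an assumption. The required mutually exclusive group is made optional, and the script falls back to the latest version when neither flag is given.

```python
import argparse


def build_parser():
    # Hypothetical parser, not the real init_queue_agent_state code.
    parser = argparse.ArgumentParser(prog="init_queue_agent_state")
    parser.add_argument("--proxy", default=None)
    # The group is optional: passing neither flag is allowed.
    group = parser.add_mutually_exclusive_group(required=False)
    # Assumption: the target version is an integer.
    group.add_argument("--target-version", type=int, default=None)
    group.add_argument("--latest", action="store_true")
    return parser


def wants_latest(args):
    # With no explicit target version, behave as if --latest was passed.
    return args.latest or args.target_version is None
```

With a parser like this, a caller that passes only --proxy would get the same behavior as one that passes --latest explicitly.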

l0kix2 commented 3 weeks ago

It seems we can't rely on a version check here:

++ export YT_DRIVER_CONFIG_PATH=/config/client.yson
++ YT_DRIVER_CONFIG_PATH=/config/client.yson
+++ /usr/bin/ytserver-all --version
+++ head -c4
++ export YTSAURUS_VERSION=24.1
++ YTSAURUS_VERSION=24.1
++ '[' '!' -f /usr/bin/init_queue_agent_state ']'
++ [[ ! 24.1 < 24.1 ]]
++ /usr/bin/init_queue_agent_state --latest --proxy http-proxies.querytrackeraco.svc.cluster.local
usage: init_queue_agent_state [-h] [--proxy PROXY] [--root ROOT]
                              [--registration-table-path REGISTRATION_TABLE_PATH]
                              [--replicated-table-mapping-table-path REPLICATED_TABLE_MAPPING_TABLE_PATH]
                              [--tablet-cell-bundle TABLET_CELL_BUNDLE]
                              [--skip-queues] [--skip-consumers]
                              [--skip-object-mapping]
                              [--create-registration-table]
                              [--create-replicated-table-mapping-table]
                              [--recursive] [--ignore-existing]

It seems there are 24.1* versions with the old binary. We will have to check whether the script fails and, if so, run it with different args.
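
The "check whether the script fails and run it with different args" idea can be sketched as follows. This is an illustration in Python, not the operator's actual init job; the function name and the injectable `run` parameter are hypothetical. The two argument sets are copied from the traces in this thread. The sketch relies on argparse exiting with status 2 on a usage error, so only that status triggers the fallback.

```python
import subprocess

# Argument sets taken from the traces in this thread.
NEW_ARGS = ["--latest"]
OLD_ARGS = [
    "--create-registration-table",
    "--create-replicated-table-mapping-table",
    "--recursive",
    "--ignore-existing",
]


def init_queue_agent_state(proxy, binary="/usr/bin/init_queue_agent_state",
                           run=subprocess.run):
    """Try the new-style arguments first, then fall back to the old ones."""
    last = None
    for extra in (NEW_ARGS, OLD_ARGS):
        cmd = [binary, *extra, "--proxy", proxy]
        last = run(cmd, capture_output=True, text=True)
        if last.returncode == 0:
            return cmd
        # argparse exits with status 2 on a usage error. Any other status is
        # a real failure and is not retried with different arguments.
        if last.returncode != 2:
            break
    raise RuntimeError(f"init_queue_agent_state failed: {last.stderr}")
```

The function returns the command line that succeeded, which makes it easy to log which variant of the binary was found in the image.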

l0kix2 commented 3 weeks ago

For the newest server version, ghcr.io/ytsaurus/ytsaurus-nightly:dev-2024-10-11-1ab0d5d4b54b30f0a9ea5f55ee32fd5bd6ab1e76, I get an error:

++ YTSAURUS_VERSION=25.1
++ '[' '!' -f /usr/bin/init_queue_agent_state ']'
++ set +e
++ /usr/bin/init_queue_agent_state --latest --proxy http-proxies.querytrackeraco.svc.cluster.local
Traceback (most recent call last):
  File "/usr/bin/init_queue_agent_state", line 322, in <module>
    main()
  File "/usr/bin/init_queue_agent_state", line 312, in main
    migration.run(
  File "/usr/local/lib/python3.8/dist-packages/yt/environment/migrationlib/__init__.py", line 474, in run
    self._initialize_migration(client, tables_path=tables_path, version=current_version)
  File "/usr/local/lib/python3.8/dist-packages/yt/environment/migrationlib/__init__.py", line 331, in _initialize_migration
    self._create_table(client, table_info, ypath_join(tables_path, table_name), shard_count=shard_count)
  File "/usr/local/lib/python3.8/dist-packages/yt/environment/migrationlib/__init__.py", line 316, in _create_table
    table_info.create_dynamic_table(client, table_path)
  File "/usr/local/lib/python3.8/dist-packages/yt/environment/migrationlib/__init__.py", line 145, in create_dynamic_table
    client.create("table", path, recursive=True, attributes=attributes)
  File "/usr/local/lib/python3.8/dist-packages/yt/wrapper/client_impl.py", line 521, in create
    return client_api.create(
  File "/usr/local/lib/python3.8/dist-packages/yt/wrapper/cypress_commands.py", line 480, in create
    result = make_formatted_request("create", params, format=None, client=client)
  File "/usr/local/lib/python3.8/dist-packages/yt/wrapper/driver.py", line 174, in make_formatted_request
    result = make_request(command_name, params,
  File "/usr/local/lib/python3.8/dist-packages/yt/wrapper/driver.py", line 109, in make_request
    result = http_driver.make_request(
  File "<decorator-gen-3>", line 2, in make_request
  File "/usr/local/lib/python3.8/dist-packages/yt/wrapper/common.py", line 482, in forbidden_inside_job
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/yt/wrapper/http_driver.py", line 288, in make_request
    response = make_request_with_retries(
  File "/usr/local/lib/python3.8/dist-packages/yt/wrapper/http_helpers.py", line 459, in make_request_with_retries
    return RequestRetrier(method=method, url=url, **kwargs).run()
  File "/usr/local/lib/python3.8/dist-packages/yt/wrapper/retries.py", line 89, in run
    return self.action()
  File "/usr/local/lib/python3.8/dist-packages/yt/wrapper/http_helpers.py", line 414, in action
    _raise_for_status(response, request_info)
  File "/usr/local/lib/python3.8/dist-packages/yt/wrapper/http_helpers.py", line 294, in _raise_for_status
    raise error_exc
yt.common.YtResponseError: Node //sys/queue_agents/queues already exists

This should be fixed in https://github.com/ytsaurus/ytsaurus/issues/886

l0kix2 commented 3 weeks ago

tl;dr for future research

// This 24.1 image has the old QA init script, so we can't rely on the version check:
// "ytsaurus/ytsaurus-nightly:dev-24.1-70487-f6622682d3810dd8972be1739e678821541ae80e"
// This 24.1 image has the new QA init script and works with `--latest` as the only arg:
// "ghcr.io/ytsaurus/ytsaurus-nightly:dev-24.1-2024-10-07-dd30c05044499bddd66c26077d0e65d92866f5f2"
// These ones (24.2, 25.1) fail on an existing table (since the operator creates `//sys/queue_agents`):
// "ghcr.io/ytsaurus/ytsaurus-nightly:dev-24.2-2024-10-11-3217ec1e392876a973127d102481e5f6745f42a2"
// "ghcr.io/ytsaurus/ytsaurus-nightly:dev-2024-10-11-1ab0d5d4b54b30f0a9ea5f55ee32fd5bd6ab1e76"

achulkov2 commented 3 weeks ago

IMO we can consider 24.1 versions before the cherry-pick "deprecated" and stick with checking versions. There is no public release of 24.1 yet anyway.

l0kix2 commented 3 weeks ago

> IMO we can consider 24.1 versions before the cherry-pick "deprecated" and stick with checking versions. There is no public release of 24.1 yet anyway.

I implemented the fix as described, but yes, we can simplify things and just check the version.
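
For reference, a version check of this kind could look like the sketch below. It is an illustration, not the operator's code: the helper names are hypothetical, and the (24, 1) threshold is an assumption based on the `[[ ! 24.1 < 24.1 ]]` line in the trace above. Comparing parsed integers avoids a pitfall of the shell's `[[ a < b ]]`, which compares strings lexicographically.

```python
import re


def parse_version(text):
    """Extract (major, minor) from a string such as '24.1' or '24.1.0-dev'."""
    match = re.match(r"\s*(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"cannot parse version from {text!r}")
    return int(match.group(1)), int(match.group(2))


def uses_new_init_args(version_text, first_new=(24, 1)):
    # Assumption: every build at or above first_new ships the new init script
    # (i.e. older 24.1 builds are treated as deprecated, as suggested above).
    return parse_version(version_text) >= first_new
```

Tuples of integers compare numerically, so a hypothetical "9.2" sorts before "24.1" here, whereas the plain strings would compare the other way round.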