golemfactory / ya-relay

GNU General Public License v3.0

Approved agreement not changing state and cannot be deleted #343

Closed stan7123 closed 8 months ago

stan7123 commented 8 months ago

I have an agreement on yagna that was approved and has not changed its state for a few days now, even though yagna was disconnected and restarted multiple times. I don't know whether it should change its state in this situation, but when I try to terminate it I receive an error.

AgreementId: da5eb50ede2744456bdc551358ee7afd4b16ea50315ac7b6d91a5d343e00e1c8
Yagna version: yagna 0.14.0 (9f9be57b 2023-12-07 build #365)

Yagna logs from creating the agreement (there have been no other mentions of this agreement since then):

[2024-02-26T19:23:22.842+0000 INFO  ya_relay_client::session] ReverseConnection - got session with node: [0x21d9bfa212014d9bf5cfcef2345e40447a602748]
[2024-02-26T19:23:22.904+0000 INFO  ya_market::negotiation::common] Received counter Proposal [R-ba9ac55f2e4bafb791b79cbb1387bedbad27930eb659b35ec8bd13d21ac67d49] for Proposal [R-531467cae16090b8d86dd639d0d40a71f70c838ce3c2df627ede5507e503239c] from [0x21d9bfa212014d9bf5cfcef2345e40447a602748].
[2024-02-26T19:23:22.945+0000 INFO  ya_market::negotiation::requestor] Requestor 'autoconfigured' [0x078de4ffddde3b129de4f3861c392ee96bc77197] created Agreement [R-da5eb50ede2744456bdc551358ee7afd4b16ea50315ac7b6d91a5d343e00e1c8] from Proposal [R-ba9ac55f2e4bafb791b79cbb1387bedbad27930eb659b35ec8bd13d21ac67d49].
[2024-02-26T19:23:22.965+0000 INFO  ya_market::negotiation::requestor] Requestor 'autoconfigured' [0x078de4ffddde3b129de4f3861c392ee96bc77197] confirmed Agreement [R-da5eb50ede2744456bdc551358ee7afd4b16ea50315ac7b6d91a5d343e00e1c8] and sent to Provider.
[2024-02-26T19:23:22.965+0000 INFO  ya_market::negotiation::requestor] AppSession id [718def7baddf40219f2ea2570df687f8] set for Agreement [R-da5eb50ede2744456bdc551358ee7afd4b16ea50315ac7b6d91a5d343e00e1c8].
[2024-02-26T19:23:22.969+0000 INFO  ya_market::negotiation::requestor] Agreement [R-da5eb50ede2744456bdc551358ee7afd4b16ea50315ac7b6d91a5d343e00e1c8] approved by [0x21d9bfa212014d9bf5cfcef2345e40447a602748]. Committing...
[2024-02-26T19:23:22.977+0000 INFO  ya_market::negotiation::requestor] Agreement [R-da5eb50ede2744456bdc551358ee7afd4b16ea50315ac7b6d91a5d343e00e1c8] committed (approved) by [0x21d9bfa212014d9bf5cfcef2345e40447a602748].
[2024-02-26T19:23:23.753+0000 INFO  ya_activity::requestor::control] Created Activity [0502b4ca1cd44abf85118b56cc23d6ae] for Agreement [da5eb50ede2744456bdc551358ee7afd4b16ea50315ac7b6d91a5d343e00e1c8]
[2024-02-26T19:23:23.755+0000 INFO  ya_vpn::network] Creating network: e3aa6160085d4967b4685e524a6d3ef7 (10.1.195.145/31)
[2024-02-26T19:23:23.755+0000 INFO  ya_vpn::network] VPN e3aa6160085d4967b4685e524a6d3ef7 started
[2024-02-26T19:23:23.755+0000 INFO  ya_vpn::network] Network: e3aa6160085d4967b4685e524a6d3ef7 assigning new ip address: 10.1.195.144 for identity: 0x078de4ffddde3b129de4f3861c392ee96bc77197
[2024-02-26T19:23:23.756+0000 INFO  ya_vpn::network] Adding Node: 10.1.195.145 to network: e3aa6160085d4967b4685e524a6d3ef7

Agreement details:

root@db7296ed83c3:~/.local/share/yagna# yagna market agreements get --id da5eb50ede2744456bdc551358ee7afd4b16ea50315ac7b6d91a5d343e00e1c8 --role Requestor
agreementId: da5eb50ede2744456bdc551358ee7afd4b16ea50315ac7b6d91a5d343e00e1c8
appSessionId: 718def7baddf40219f2ea2570df687f8
approvedDate: 2024-02-26T19:23:22.964128218Z
approvedSignature: NoSignature
committedSignature: NoSignature
demand:
  constraints: "(&(golem.node.debug.subnet=gpu-test2)\n\t(golem.com.payment.platform.erc20next-goerli-tglm.address=*)\n\t(golem.runtime.name=vm-nvidia)\n\t(&(golem.runtime.capabilities=vpn)\n\t(golem.runtime.capabilities=inet)\n\t(golem.runtime.capabilities=manifest-support))\n\t(golem.inf.mem.gib>=16)\n\t(golem.inf.storage.gib>=20)\n\t(golem.inf.cpu.threads>=2))"
  demandId: 1db3f26e073345459934bfa3070ce5eb-41b1ab1e48daf06cd81ce6675a5ce358030299c39e0f8f4def7d4e8f3dbedebc
  properties:
    golem.com.payment.chosen-platform: erc20next-goerli-tglm
    golem.com.payment.debit-notes.accept-timeout?: 120
    golem.com.payment.platform.erc20next-goerli-tglm.address: 0x078de4ffddde3b129de4f3861c392ee96bc77197
    golem.node.debug.subnet: gpu-test2
    golem.srv.caps.multi-activity: true
    golem.srv.comp.expiration: 1709011402631
    golem.srv.comp.payload: ewogICJ2ZXJzaW9uIjogIjAuMS4wIiwKICAiY3JlYXRlZEF0IjogIjIwMjMtMTAtMjBUMTI6NTE6MDAuMDAwMDAwWiIsCiAgImV4cGlyZXNBdCI6ICIyMTAwLTAxLTAxVDAwOjAxOjAwLjAwMDAwMFoiLAogICJtZXRhZGF0YSI6IHsKICAgICJuYW1lIjogInNjYWxlcG9pbnRhaS9hdXRvbWF0aWMxMTExOjQiLAogICAgImRlc2NyaXB0aW9uIjogInNjYWxlcG9pbnRhaS9hdXRvbWF0aWMxMTExOjQiLAogICAgInZlcnNpb24iOiAiMC4xLjAiCiAgfSwKICAicGF5bG9hZCI6IFsKICAgIHsKICAgICAgInBsYXRmb3JtIjogewogICAgICAgICJhcmNoIjogIng4Nl82NCIsCiAgICAgICAgIm9zIjogImxpbnV4IgogICAgICB9LAogICAgICAidXJscyI6IFsKICAgICAgICAiaHR0cDovL3JlZ2lzdHJ5LmdvbGVtLm5ldHdvcmsvZG93bmxvYWQvNjg4ZDVlMzhjNmJmYWEyODRiYzVkNGM5YzQyMmJiNzk0MjNiNDI0NDQyZGQ3NzUwYWQ5YmYyYTkyZTBlODJkNyIKICAgICAgXSwKICAgICAgImhhc2giOiAic2hhMzpkOGQ0N2NjY2NmYzI0MmQ0NmI4NjA4MTllZmNkZDRkZDE3YjRmYjUxMDM0MjAwM2ZlOTlkNzQ1OCIKICAgIH0KICBdLAogICJjb21wTWFuaWZlc3QiOiB7CiAgICAidmVyc2lvbiI6ICIwLjEuMCIsCiAgICAic2NyaXB0IjogewogICAgICAiY29tbWFuZHMiOiBbCiAgICAgICAgInJ1biAuKiIsCiAgICAgICAgInRyYW5zZmVyIC4qIgogICAgICBdLAogICAgICAibWF0Y2giOiAicmVnZXgiCiAgICB9LAogICAgIm5ldCI6IHsKICAgICAgImluZXQiOiB7CiAgICAgICAgIm91dCI6IHsKICAgICAgICAgICJwcm90b2NvbHMiOiBbImh0dHBzIl0sCiAgICAgICAgICAidXJscyI6IFsiaHR0cHM6Ly9odWdnaW5nZmFjZS5jbyIsImh0dHBzOi8vY2RuLWxmcy5odWdnaW5nZmFjZS5jbyIsImh0dHBzOi8vY2RuLWxmcy11cy0xLmh1Z2dpbmdmYWNlLmNvIiwiaHR0cHM6Ly9ncHUtcHJvdmlkZXIuZGV2LmdvbGVtLm5ldHdvcmsiXQogICAgICAgIH0KICAgICAgfQogICAgfQogIH0KfQo=
    golem.srv.comp.vm.package_format: gvmkit-squash
    maintenance_port: 8001
    service_port: 8000
  requestorId: 0x078de4ffddde3b129de4f3861c392ee96bc77197
  timestamp: 2024-02-26T19:23:22.944773310Z
offer:
  constraints: |-
    (&
      (golem.srv.comp.expiration>1708975336624)
      (golem.node.debug.subnet=gpu-test2)
    )
  offerId: ea393f1f8baf41df86b9a2bc642992d1-f3a22e9b90e4a77df99c9a53409fe911ee10a5f6a0ed7211f48f7f90d5135d17
  properties:
    golem.!exp.gap-35.v1.inf.gpu.clocks.graphics.mhz: 2115
    golem.!exp.gap-35.v1.inf.gpu.clocks.memory.mhz: 9751
    golem.!exp.gap-35.v1.inf.gpu.clocks.sm.mhz: 2115
    golem.!exp.gap-35.v1.inf.gpu.clocks.video.mhz: 1950
    golem.!exp.gap-35.v1.inf.gpu.cuda.cores: 10496
    golem.!exp.gap-35.v1.inf.gpu.cuda.enabled: true
    golem.!exp.gap-35.v1.inf.gpu.cuda.version: '12.0'
    golem.!exp.gap-35.v1.inf.gpu.memory.bandwidth.gib: 936
    golem.!exp.gap-35.v1.inf.gpu.memory.total.gib: 24.0
    golem.!exp.gap-35.v1.inf.gpu.model: NVIDIA GeForce RTX 3090
    golem.activity.caps.transfer.protocol:
    - https
    - http
    - gftp
    golem.com.payment.debit-notes.accept-timeout?: 120
    golem.com.payment.platform.erc20-goerli-tglm.address: 0x10e6c5ae67b8c12c30b4139ad7cebecae1917e7c
    golem.com.payment.platform.erc20-holesky-tglm.address: 0x10e6c5ae67b8c12c30b4139ad7cebecae1917e7c
    golem.com.payment.platform.erc20-mumbai-tglm.address: 0x10e6c5ae67b8c12c30b4139ad7cebecae1917e7c
    golem.com.payment.platform.erc20next-goerli-tglm.address: 0x10e6c5ae67b8c12c30b4139ad7cebecae1917e7c
    golem.com.payment.platform.erc20next-holesky-tglm.address: 0x10e6c5ae67b8c12c30b4139ad7cebecae1917e7c
    golem.com.payment.platform.erc20next-mumbai-tglm.address: 0x10e6c5ae67b8c12c30b4139ad7cebecae1917e7c
    golem.com.payment.platform.erc20next-rinkeby-tglm.address: 0x10e6c5ae67b8c12c30b4139ad7cebecae1917e7c
    golem.com.pricing.model: linear
    golem.com.pricing.model.linear.coeffs:
    - 0.0
    - 0.0002777777777777778
    - 0.0
    golem.com.scheme: payu
    golem.com.usage.vector:
    - golem.usage.cpu_sec
    - golem.usage.duration_sec
    golem.inf.cpu.architecture: x86_64
    golem.inf.cpu.brand: 13th Gen Intel(R) Core(TM) i5-13400F
    golem.inf.cpu.capabilities:
    - sse3
    - pclmulqdq
    - dtes64
    - monitor
    - dscpl
    - vmx
    - eist
    - tm2
    - ssse3
    - fma
    - cmpxchg16b
    - pdcm
    - pcid
    - sse41
    - sse42
    - x2apic
    - movbe
    - popcnt
    - tsc_deadline
    - aesni
    - xsave
    - osxsave
    - avx
    - f16c
    - rdrand
    - fpu
    - vme
    - de
    - pse
    - tsc
    - msr
    - pae
    - mce
    - cx8
    - apic
    - sep
    - mtrr
    - pge
    - mca
    - cmov
    - pat
    - pse36
    - clfsh
    - ds
    - acpi
    - mmx
    - fxsr
    - sse
    - sse2
    - ss
    - htt
    - tm
    - pbe
    - fsgsbase
    - adjust_msr
    - bmi1
    - avx2
    - fdp
    - smep
    - bmi2
    - rep_movsb_stosb
    - invpcid
    - deprecate_fpu_cs_ds
    - rdseed
    - adx
    - smap
    - clflushopt
    - processor_trace
    - sha
    - clwb
    - umip
    - pku
    - ospke
    - rdpid
    golem.inf.cpu.cores: 10
    golem.inf.cpu.model: Stepping 2 Family 6 Model 367
    golem.inf.cpu.threads: 15
    golem.inf.cpu.vendor: GenuineIntel
    golem.inf.mem.gib: 57.81822070479393
    golem.inf.storage.gib: 708.7353454589844
    golem.node.debug.subnet: gpu-test2
    golem.node.id.name: g4-3090
    golem.node.net.is-public: false
    golem.runtime.capabilities:
    - inet
    - vpn
    - manifest-support
    - start-entrypoint
    - '!exp:gpu'
    golem.runtime.name: vm-nvidia
    golem.runtime.version: 0.1.3-rc10
    golem.srv.caps.multi-activity: true
    golem.srv.caps.payload-manifest: true
  providerId: 0x21d9bfa212014d9bf5cfcef2345e40447a602748
  timestamp: 2024-02-26T19:23:22.944773310Z
proposedSignature: NoSignature
state: Approved
timestamp: 2024-02-26T19:23:22.944773310Z
validTo: 2024-02-26T19:24:22.944263Z
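
The `golem.srv.comp.payload` property in the demand above appears to be a base64-encoded JSON manifest. For anyone who wants to inspect it while debugging, a minimal sketch of decoding it with the Python standard library follows. The helper name `decode_manifest` is hypothetical and is not part of yagna or golem-core-python.

```python
import base64
import json


def decode_manifest(payload_b64: str) -> dict:
    """Decode a base64-encoded JSON manifest into a Python dict.

    Hypothetical helper for inspection only; assumes the payload is
    standard base64 wrapping UTF-8 JSON.
    """
    raw = base64.b64decode(payload_b64)
    return json.loads(raw.decode("utf-8"))


if __name__ == "__main__":
    # Stand-in payload built locally; paste the real property value here instead.
    sample = base64.b64encode(
        json.dumps({"version": "0.1.0"}).encode("utf-8")
    ).decode("ascii")
    print(json.dumps(decode_manifest(sample), indent=2))
```

Passing the full `golem.srv.comp.payload` value from the agreement details to `decode_manifest` should yield the manifest as a dict, assuming the value was copied without truncation.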

Error when trying to terminate it using golem-core-python:

golem_worker_api-1  | 2024-02-28 12:22:09,064 - uvicorn.error - ERROR - Exception in ASGI application
golem_worker_api-1  | Traceback (most recent call last):
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 404, in run_asgi
golem_worker_api-1  |     result = await app(  # type: ignore[func-returns-value]
golem_worker_api-1  |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
golem_worker_api-1  |     return await self.app(scope, receive, send)
golem_worker_api-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
golem_worker_api-1  |     await super().__call__(scope, receive, send)
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/applications.py", line 123, in __call__
golem_worker_api-1  |     await self.middleware_stack(scope, receive, send)
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
golem_worker_api-1  |     raise exc
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
golem_worker_api-1  |     await self.app(scope, receive, _send)
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
golem_worker_api-1  |     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
golem_worker_api-1  |     raise exc
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
golem_worker_api-1  |     await app(scope, receive, sender)
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 758, in __call__
golem_worker_api-1  |     await self.middleware_stack(scope, receive, send)
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 778, in app
golem_worker_api-1  |     await route.handle(scope, receive, send)
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 299, in handle
golem_worker_api-1  |     await self.app(scope, receive, send)
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 79, in app
golem_worker_api-1  |     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
golem_worker_api-1  |     raise exc
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
golem_worker_api-1  |     await app(scope, receive, sender)
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 74, in app
golem_worker_api-1  |     response = await func(request)
golem_worker_api-1  |                ^^^^^^^^^^^^^^^^^^^
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 299, in app
golem_worker_api-1  |     raise e
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 294, in app
golem_worker_api-1  |     raw_response = await run_endpoint_function(
golem_worker_api-1  |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
golem_worker_api-1  |     return await dependant.call(**values)
golem_worker_api-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
golem_worker_api-1  |   File "/code/workers/routes.py", line 79, in delete_worker
golem_worker_api-1  |     output_dto = await DeleteWorker.execute(input_dto)
golem_worker_api-1  |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
golem_worker_api-1  |   File "/code/workers/use_cases/delete_worker.py", line 33, in execute
golem_worker_api-1  |     await agreement.terminate()
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/golem/resources/base.py", line 60, in wrapper
golem_worker_api-1  |     return await f(*args, **kwargs)
golem_worker_api-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/golem/resources/agreement/agreement.py", line 89, in terminate
golem_worker_api-1  |     await self.api.terminate_agreement(self.id, request_body={"message": reason})
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/ya_market/api_client.py", line 205, in __call_api
golem_worker_api-1  |     raise e
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/ya_market/api_client.py", line 193, in __call_api
golem_worker_api-1  |     response_data = await self.request(
golem_worker_api-1  |                     ^^^^^^^^^^^^^^^^^^^
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/ya_market/rest.py", line 268, in POST
golem_worker_api-1  |     return await self.request(
golem_worker_api-1  |            ^^^^^^^^^^^^^^^^^^^
golem_worker_api-1  |   File "/usr/local/lib/python3.11/site-packages/ya_market/rest.py", line 180, in request
golem_worker_api-1  |     raise ApiException(http_resp=r)
golem_worker_api-1  | ya_market.exceptions.ApiException: (500)
golem_worker_api-1  | Reason: Internal Server Error
golem_worker_api-1  | HTTP response headers: <CIMultiDictProxy('Content-Length': '236', 'Vary': 'Origin, Access-Control-Request-Method, Access-Control-Request-Headers', 'Content-Type': 'application/json', 'Date': 'Wed, 28 Feb 2024 12:22:08 GMT')>
golem_worker_api-1  | HTTP response body: {"message":"Protocol error while terminating: Terminate Agreement [R-da5eb50ede2744456bdc551358ee7afd4b16ea50315ac7b6d91a5d343e00e1c8] GSB error: GSB failure: Net: error forwarding message: Establishing Tcp connection: Socket closed.."}
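
The 500 response above carries a JSON body whose `message` points at a network-layer problem ("error forwarding message", "Socket closed") rather than at the agreement itself. A minimal sketch of how a caller might tell such failures apart from other errors is shown below. The helper `is_network_failure` and the marker list are assumptions made for illustration; they are not part of golem-core-python or ya_market, and the markers are simply substrings copied from the body quoted above.

```python
import json

# Substrings taken from the error body quoted above; treating them as
# markers of a transport problem is an assumption, not documented behavior.
NETWORK_MARKERS = ("GSB failure", "error forwarding message", "Socket closed")


def is_network_failure(body: str) -> bool:
    """Return True if a JSON error body looks like a network-layer failure."""
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        return False
    # Only a JSON object with a string "message" field is inspected.
    message = parsed.get("message") if isinstance(parsed, dict) else None
    if not isinstance(message, str):
        return False
    return any(marker in message for marker in NETWORK_MARKERS)
```

A caller could use a check like this to decide whether retrying the termination later is worthwhile, since the other node may simply be unreachable at that moment.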

In the yagna logs I keep seeing:

[2024-02-28T09:06:51.715+0000 ERROR ya_market::protocol::discovery] Error broadcasting offers: GSB failure: Broadcast failed: Unable to query neighbors: Request failed: Request timed out after 3000 ms
[2024-02-28T09:06:52.547+0000 ERROR ya_market::protocol::discovery] Error broadcasting unsubscribed offers: GSB failure: Broadcast failed: Unable to query neighbors: Request failed: Request timed out after 3000 ms
stan7123 commented 8 months ago

Wrong repository