This is hard to replicate. Also, I've seen this only on the `public-beta` subnet.

I know there is a problem because `yacat.py` sometimes leaves activities in the "stopping" state forever, where the state is changed here:

So, `Agreement.close_all()` sometimes hangs forever, and I don't know why and where.

Also, there is a wider topic to consider here. We discussed with @prekucki some time ago that agreement termination should just always succeed, provided that the request reached `yagna` (i.e. `yagna` should notice our termination request and ensure the agreement is terminated asap). This is pretty important: we shouldn't be left with a hanging agreement because of some weird error (e.g. on the provider side). Disclaimer: I don't really know how this is currently handled in `yagna`.

Edit: a little more data in https://github.com/golemfactory/golem-core-python/issues/47

Edit 2: the current solution (`Agreement.close_all()`) is probably not very good. We can be stuck forever in a loop where `yagna` says

[2023-01-03T10:25:40.435+0100 ERROR ya_activity::error] Activity API server error: GSB error: Remote service at `/net/0x379c1f8c7f55929c7e5c491b08894159b8c96f15/activity/DestroyActivity` error: Bad request: endpoint address not found

in every iteration.

Edit 3: There might not be a perfect solution available here. There is just no way of ensuring that the agreement was terminated (or the activity destroyed), because the provider might never respond.
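
Since a guaranteed termination does not seem possible, one option on our side might be to at least bound the cleanup loop so it cannot spin forever. Below is a minimal sketch of that idea using plain `asyncio`. It is not the actual `Agreement.close_all()` code: `close_once`, `close_with_deadline` and all parameters are hypothetical names made up for illustration, and `close_once` stands for whatever single termination/destroy attempt we would really make.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def close_with_deadline(
    close_once: Callable[[], Awaitable[bool]],
    *,
    max_attempts: int = 5,
    attempt_timeout: float = 10.0,
    retry_delay: float = 1.0,
) -> bool:
    """Try to close something a bounded number of times, then give up.

    `close_once` is a hypothetical coroutine function that performs one
    close attempt and returns True once the close is confirmed.
    Returns True on success, False if every attempt failed or timed out.
    """
    last_error: Optional[BaseException] = None
    for attempt in range(1, max_attempts + 1):
        try:
            # Bound every single attempt, so an unresponsive peer cannot
            # block us indefinitely inside one call.
            if await asyncio.wait_for(close_once(), timeout=attempt_timeout):
                return True
        except asyncio.TimeoutError as exc:
            last_error = exc  # the attempt hung; try again
        except Exception as exc:
            # Assumption: any other error (e.g. "endpoint address not found")
            # is treated as retryable in this sketch.
            last_error = exc
        if attempt < max_attempts:
            await asyncio.sleep(retry_delay)
    # Out of attempts: report failure instead of looping forever.
    # `last_error` could be logged here for debugging.
    return False
```

With something like this the caller gets `False` back after `max_attempts` tries and can decide what to do (log it, mark the agreement as "possibly not terminated", etc.), instead of hanging. It obviously does not solve the underlying problem that we cannot confirm termination when the provider is gone.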