TheThingsNetwork / lorawan-stack

The Things Stack, an Open Source LoRaWAN Network Server
https://www.thethingsindustries.com/stack/
Apache License 2.0
932 stars 301 forks source link

Ttn-lw-cli applications subscribe HANG #5534

Open V999TEC opened 2 years ago

V999TEC commented 2 years ago

Summary

When I execute ttn-lw-cli events --gateway_id [whatever] the cli will (as expected) run until the access token expires and returns to the caller.

When I’m not authenicated I can just do ttn-lw-cli login to get a new access token. That all works fine :smiley:

However, when I execute ttn-lw-cli applications subscribe --application-id [whatever] the cli (unexpectedly) will not return after the access token has expired :frowning:

Steps to Reproduce

The symptom I see is that I get the desired application output data while the access token is valid but no clue that the token has expired. Visually it is as if the device had stopped sending data to the application, but it has been broadcasting fine, as witnessed by the gateway events.

I can use --log.level debug to get an indication of when the token will expire, along the lines of DEBUG Using access token (valid until 8:06AM)

… but I would expect the ttn-lw-cli to actually terminate at 8:06AM and not just hang!

What do you see now?

I feel ths behaviour is inconsistent with ttn-lw-cli events which sympathetically returns when the token expires.

What do you want to see instead?

If I have a script wrapping ttn-lw-cli I don't want it to hang and never return. Once the access token has expired there is no further data retuned from the subscription. However this data stream stops silently :frowning:

Environment

The Things Network Command-line Interface: ttn-lw-cli Version: 3.19.2 Build date: 2022-05-25T07:14:30Z Git commit: 2dfab4a75 Go version: go1.18.2 OS/Arch: windows/amd64

...

How do you propose to implement this?

I'm not in a position to fix it myself. I currently have a workaround that has another process that monitors the script running ttn-lw-cli When the token has expired I kill the hung ttn-lw-cli I determine the expiry timestamp based on scanning the debug output

...

How do you propose to test this?

It should be simple to determine if the ttn-lw-cli unblocks after (typically) 1 hour allowing the wrapper script to continue. All being well I would not need for another process to kill ttn-lw-cli

...

Can you do this yourself and submit a Pull Request?

I'm not in a position to fix it myself.

happyRip commented 1 year ago

Are You able to consistently reproduce the encountered issue or does it only happens occasionally?

V999TEC commented 1 year ago

I'll run a test again on 3.21.1 and get back to you

V999TEC commented 1 year ago

I can confim that this is still an issue. At the expiry of the access token, the ttn-lw-cli does not return to the caller

{"level":"debug","msg":"Using access token (valid until 5:22PM)"}
{"level":"debug","msg":"Connecting to gRPC server...","target":"eu1.cloud.thethings.network:8884"}
{"level":"debug","msg":"[core]parsed scheme: \"\""}
{"level":"debug","msg":"[core]scheme \"\" not registered, fallback to default scheme"}
{"level":"debug","msg":"[core]ccResolverWrapper: sending update to cc: {[{eu1.cloud.thethings.network:8884  <nil> 0 <nil>}] <nil> <nil>}"}
{"level":"debug","msg":"[core]ClientConn switching balancer to \"pick_first\""}
{"level":"debug","msg":"[core]Channel switches to new LB policy \"pick_first\""}
{"level":"debug","msg":"[core]Subchannel Connectivity change to CONNECTING"}
{"level":"debug","msg":"[core]Subchannel picks a new address \"eu1.cloud.thethings.network:8884\" to connect"}
{"level":"debug","msg":"[core]pickfirstBalancer: UpdateSubConnState: 0xc000a92960, {CONNECTING <nil>}"}
{"level":"debug","msg":"[core]Channel Connectivity change to CONNECTING"}
{"level":"debug","msg":"[core]CPU time info is unavailable on non-linux or appengine environment."}
{"level":"debug","msg":"No new version available"}
{"level":"debug","msg":"[core]Subchannel Connectivity change to READY"}
{"level":"debug","msg":"[core]pickfirstBalancer: UpdateSubConnState: 0xc000a92960, {READY <nil>}"}
{"level":"debug","msg":"[core]Channel Connectivity change to READY"}
{"level":"debug","msg":"Connected to gRPC server","duration":0.2976,"target":"eu1.cloud.thethings.network:8884"}

In the example above after 5:22 pm the script that calls ttn-lw-cli will hang. Events related to the application will be lost. Alternatively, if authentication via --api-key is passed to ttn-lw-cli login (instead of relying upon the browser OAUTH mechanism) then the access token is valid indefinitely and no application events are lost.

The test I have just repeated is using the following version:

The Things Network Command-line Interface: ttn-lw-cli
Version:             3.21.1
Build date:          2022-08-24T16:05:59Z
Git commit:          6cf0efd63
Go version:          go1.18.5
OS/Arch:             windows/amd64
happyRip commented 1 year ago

Issue closed automatically despite being referenced only #5887