chip-tool: subscribe command exits immediately, it needs timed run option

project-chip / connectedhomeip

Matter (formerly Project CHIP) creates more connections between more objects, simplifying development for manufacturers and increasing compatibility for consumers, guided by the Connectivity Standards Alliance.

https://buildwithmatter.com

Apache License 2.0

7.56k stars 2.04k forks source link

chip-tool: subscribe command exits immediately, it needs timed run option #18800

Open bluebin14 opened 2 years ago

bluebin14 commented 2 years ago

Problem

chip-too subscribe command exits immediately, making it problematic to use in automated tests of all existing test plans that relate to subscription. Having an interactive mode does not solve the issue of automation.

Proposed Solution

Add option to run the tool for the duration specified. E.g. to run it for 30 seconds: ./chip-tool booleanstate subscribe state-value 1 10 30 12344321 1

bzbarsky-apple commented 2 years ago

"timeout" is how long the caller of the command will wait before killing it off if it has not complete.

The "subscribe" command, in non-interactive mode, completes immediately. This has nothing to do with the timeout argument.

Is the actual request here for the "subscribe" command to not complete immediately? What would it do in automation exactly? When would it complete?

@bluebin14

bluebin14 commented 2 years ago

@bzbarsky-apple yes the request is for the subscribe command to not complete immediately but instead wait for a specified duration and exit. Without that the subscribe command is meaningless in non interactive mode. Since the timeout argument already exists and means "wait for at most x duration before exiting", indeed a different argument would be more appropriate that means "wait for at least x duration before exiting". Note that chip-tool in TE8 already had the option of "do not exit" although you could not specify a run duration.

bzbarsky-apple commented 2 years ago

I believe @vivien-apple purposefully removed the "do not exit" bit when interactive mode was implemented.... It really did not work well.

That said, adding a "wait N seconds after the subscription is established before exiting" option for subscriptions might not be too difficult....

doublemis1 commented 2 years ago

Also before interactive mode, the subscription keeps chip tool process until SIGTERM, using the following command: ./chip-tool basic subscribe node-label 1 100 1 0 --keepSubscriptions 1 or ./chip-tool basic subscribe-event leave 1 100 1 0 --keepSubscriptions 1

Can we restore this non-interactive chip tool feature?

jnsgsr commented 2 years ago

This is impediment for us to get results from preSVE testing in timely matter. Blocks automation of significant amount of test cases. Please increase the priority for a resolution of this.

bzbarsky-apple commented 2 years ago

@jnsgsr Can you please describe how you would automate your test cases if subscribe did not quit immediately?

And can you not achieve the same effect by writing data to the stdin of chip-tool interactive start from your automation?

The reason this functionality was removed is that it was fundamentally pretty broken. If we're going to try to restore it, we need to understand how it would get used in practice, so we are sure it supports those uses, as well as whether those use cases are in fact unaddressed by interactive mode.

jnsgsr commented 2 years ago

@bzbarsky-apple We have several tests automated that are currently disabled because of this issue (some with high criticality) and it is blocking automation for many test cases where we have to test if an accessory sends events. We were using it as intended, spawning a chip-tool process with subscribe or subscribe-event commands to work in the background and gather events, and we parse chip-tool output with specific patterns to get event parameters and to verify those in following test steps. Interactive mode does not suit here, there is no user interaction required beside calling the initial command. For any other command I would just open another shell with chip-tool to have clear output from both commands. When using interactive mode and after calling subscribe-event, between lines of output there is a prompt i.e. >>> and a bunch of redundant bytes around it, and there is no strict pattern as there are some lines that do not have those. This makes it more problematic to automate parsing. If you decide not to fix it, let us know right away, as in our current plan we assumed we can wait for this to be fixed, and just re-enable the tests.

bluebin14 commented 2 years ago

stdin input handling could indeed use some tweaks, commands randomly either do not show up (all of them should be clearly echoed) or show up on every line even though issued exactly once.

>>> quit()    [1654586300470] [54089:614038] CHIP: [CTL] Shutting down the controller
>>> quit()    [1654586300470] [54089:614038] CHIP: [CTL] Shutting down the commissioner
>>> quit()    [1654586300470] [54089:614038] CHIP: [CTL] Shutting down the controller
>>> quit()    [1654586300470] [54089:614038] CHIP: [IN] Expiring all connections for fabric 1!!

vivien-apple commented 2 years ago

So if my understanding is correct, you are interested in gatherings events received by chip-tool and you have some log parsing scripts that are checking against your expected results ?

It is possible to do something like ?

$ mkfifo /tmp/chip-tool-stdin
$ nohup ./out/debug/standalone/chip-tool interactive start --trace_file /tmp/subscribe.interactive.trace > /tmp/interactive.log.txt < /tmp/chip-tool-stdin &
$ BACKGROUND_TASK_PID=$!
$ echo "onoff subscribe on-off 1 5 0x12344321 1" > /tmp/chip-tool-stdin
# Wait for the specific duration you are interested in
$ kill $BACKGROUND_TASK_PID
# Now there is a log of the messages that have been emitted/received by chip-tool in /tmp/subscribe.onoff.trace. The payload are base64 TLV bytes. To get an easier to understand version it can be converted with another command
$ chip-trace-decoder --source /tmp/subscribe.onoff.trace > /tmp/subscribe.onoff.decoded.trace
# run your script on the decoded version

There is a little issue with the trace-decoder-program and status report at the moment (those are not decoded, I'm going to fix that)... (update: I have opened #19307 for it)

bluebin14 commented 2 years ago

This approach can get very awkward when e.g. doing subscriptions from multiple controllers (let's say 5) then stopping controller 3 after 10 seconds etc. With timed run you have 5 lines and one script. With interactive mode... still counting.There is nothing wrong in having an interactive mode but maybe its usage should not be forced.

woody-apple commented 2 years ago

SDK Review: Removing sve label, @cjandhyala to follow up to seed if this is still an issue.

woody-apple commented 2 years ago

SDK Review: @bluebin14 can you confirm to see if, or if not there are any known issues blocking SVE, or testing at this point?

bluebin14 commented 2 years ago

@woody-apple there are no known issues blocking SVE, the non-interactive mode timed run option for subscription test cases is a feature request for better automation capability outside the test harness. The test harness can have its own solution for those cases using interactive mode (not tested though).

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.