project-chip / rs-matter

Rust implementation of the Matter protocol. Status: Experimental
Apache License 2.0
310 stars 43 forks source link

#102, #103 #107

Closed ivmarkov closed 7 months ago

ivmarkov commented 11 months ago

@kedars Really weird - the (latest?) Google Matter Controller fails provisioning if querying attribute UniqueId from the Basic Info cluster fails.

It is weird because in the Matter 1.0 spec, the attribute is marked as "optional". Perhaps they made it mandatory in Matter - 1.1? (Need to check this in the latest spec.)

This PR also addresses #103 at least to some extent, by reporting with a warn level all cluster reads/writes that return a non-OK status.

ivmarkov commented 11 months ago

@kedars Approval is not enough you need to merge it too, I guess. :)

Also short update - the UniqueId attr is not mandatory also in latest Matter 1.1 spec. Well I guess we still need a fix, as provisioning fails otherwise.

kedars commented 11 months ago

@ivmarkov that is why I haven’t merged :-).

It seems there are devices that can get commissioned without this attribute too, needs some further investigation…

ivmarkov commented 11 months ago

Hmm, you mean you can still provision with a Google Controller even without this fix?...

ivmarkov commented 11 months ago

... on a second thought, we might be returning the wrong error / an error where we should be silent. Let me double check that...

jasta commented 11 months ago

I'm trying to help how I can and reading through the spec as well as the official source repo, I think the requirement for optional fields is to respond with the default value, as quoted from matter 1.1 core spec, "7.17.2. Optional or Deprecated"

If the specification text of a cluster depends on the value of an optional or deprecated data field of the same cluster, then the data field SHALL have a well-defined default value that SHALL be used when the data field is not implemented.

In this case, UniqueId (and looks like we're missing quite a few with this interpretation of the spec), should return the default of manufacturer specified (MS), which should mean undefined! The official matter repo encodes UniqueId and many other missing fields using an empty string, but technically providing UNSUPPORTED_ATTRIBUTE should meet the definition for manufacturer specified as well. Unfortunately it means a client like Google Home would be equally correct in rejecting this value. Boooo.

What is puzzling though is what is the difference between mandatory and optional given that both seem to have default values selected. It seems like the specification is confusing spec and implementation, that is, a mandatory field with a default requires that the API be built in such a way that it will fail if the value is not user provided. Worse, the way that section "7.17.4. Default Column" is worded doesn't specify if the manufacturer specific behaviour is meant to be client side or server side. That is, is the server required to provide some default value if the user didn't give us one, or is the client supposed to treat UNSUPPORTED_ATTRIBUTE as if it had been the default value the client would have used? This would explain why Google Home isn't working as expected here, they probably have their own custom implementation and "differently understood" this part of the spec...

jasta commented 11 months ago

Curiously the default implementation for the basic info cluster just encodes nothing and returns CHIP_NO_ERROR when the attribute is not known (https://github.com/project-chip/connectedhomeip/blob/master/src/app/clusters/basic-information/basic-information.cpp#L80). That seems...fishy.

ivmarkov commented 11 months ago

TBH, I'm having difficulties parsing this paragraph you quoted:

If the specification text of a cluster depends on the value of an optional or deprecated data field of the same cluster

What do they even mean by "specification text of a cluster"?

The official matter repo encodes UniqueId and many other missing fields using an empty string, but technically providing UNSUPPORTED_ATTRIBUTE should meet the definition for manufacturer specified as well

I'm not so sure. Unsupported_attribute is an error (non-ok status), not really a value, where by value I understand really something which is part of the attribute domain (as in an empty string).

jasta commented 11 months ago

Frankly, I don't know if we're going to get any resolution on this confusion without posting an issue to Matter upstream since the documentation is just clearly ambiguous on this topic. Hopefully I'm not jumping the gun here but I propose we resolve the intermediate issue as follows:

  1. Submit clarification request to Matter upstream (I can tackle this, just lmk)
  2. Copy Matter upstream implementation (i.e. use empty strings for spec-defined fields, but also default to a successful reply with a missing value field for reads of unsupported attributes) for at least the BasicInfoCluster (though they seem to do this for all clusters AFAICT!). I've done this locally and it seems to work but more testing is needed before I can submit a PR.
  3. After (1) resolves, adjust implementation to match the intention of the standard iff Google Home still works with this new interpretation.
  4. If Google Home doesn't work with the clarified solution, file an upstream bug with them referencing the answer we get in (3) as evidence of a faulty implementation. Wait until they respond to make changes to matter-rs.

Thoughts?

jasta commented 11 months ago

Correction: I had this wrong. I misread upstream matter code (and rs-matter, actually). I didn't realize that the attributes are filtered before reaching the handler. In rs-matter, the ErrorCode::AttributeNotFound that's yielded from the handler feels wrong to me, it's more like a severe internal configuration error indicating the CLUSTER has an attribute defined that the Attributes enum doesn't contain, most likely caused by an id mismatch due to the manual copy/pasting that happens when adding attributes.

So it looks like for rs-matter today, an unknown attribute isn't sending UNSUPPORTED_ATTRIBUTE at all, it's ignoring the read request by filtering it out here:

https://github.com/project-chip/rs-matter/blob/e39fd18b73b77218e60594df4dea8359062c0dac/rs-matter/src/data_model/objects/node.rs#L125

I'm still checking what upstream matter's behaviour is in this case. That might provide a more general clue for how to work around this issue that doesn't require adding a ton of weird default values for optional attributes...

In either case, it's still quite unclear what the spec is trying to tell us about the case where we don't know about an attribute that's marked optional.

jasta commented 11 months ago

Update on matter upstream behaviour: I'm 99% sure that if the attribute is truly not known by the server, UNSUPPORTED_ATTRIBUTE will be yielded which is not the behaviour that rs-matter has today. Evidence of this is found here:

https://github.com/project-chip/connectedhomeip/blob/master/src/app/util/attribute-storage.cpp#L667C35-L667C35

This is the big lookup operation that uses the generated attributes data structure that came from the input .zap file.

Unfortunately, modifying rs-matter to use this behaviour (that is, yield UNSUPPORTED_ATTRIBUTE instead of ignoring the read request) still sees Google Home fail to add the device. I think we're kind of stuck here for the short term with such ambiguity in the spec. We have to just modify the behaviour to supply a bunch of default values for attributes defined in the spec.

ivmarkov commented 11 months ago

@jasta Are you in this room? If not you might want to join - I'm trying to schedule a meeting where you might want to participate.

ivmarkov commented 11 months ago

Correction: I had this wrong. I misread upstream matter code (and rs-matter, actually). I didn't realize that the attributes are filtered before reaching the handler. So it looks like for rs-matter today, an unknown attribute isn't sending UNSUPPORTED_ATTRIBUTE at all, it's ignoring the read request by filtering it out here:

https://github.com/project-chip/rs-matter/blob/e39fd18b73b77218e60594df4dea8359062c0dac/rs-matter/src/data_model/objects/node.rs#L125

You are looking at the wildcard code path (i.e. client wants to read a bunch of attributes). Only during wildcard reads we are silently dropping unknown attributes or attributes for which the client does not have permissions. Which is as per the spec.

For non-wildcard queries, we return errors (status errors, that is), as the spec mandates. Otherwise, I would not have even noticed that this read is failing.

In rs-matter, the ErrorCode::AttributeNotFound that's yielded from the handler feels wrong to me, it's more like a severe internal configuration error indicating the CLUSTER has an attribute defined that the Attributes enum doesn't contain, most likely caused by an id mismatch due to the manual copy/pasting that happens when adding attributes.

Not all errors are severe, and in general, most of the errors can be converted to IMStatusCodes and sent back. For example, AttributeNotFound is converted to IMStatusCode::UnsupportedAttriute here and then sent; it is that status code which I noticed in the output log. Note also that this status code generation does not even happen during cluster invocation but earlier - during the read request traversal, here which arrives here.

jasta commented 11 months ago

Woof, this protocol is a beast :) Thanks though, these clarifications help a ton. Btw, from the meeting notes I can confirm that simply supporting UniqueId by sending back an empty string works to be added to Google Home. Happy to have independent confirmation but that's the basis for my analysis that concludes that optional attributes really are meant to be known and supported but to return some default value.

ivmarkov commented 11 months ago

Woof, this protocol is a beast :) Thanks though, these clarifications help a ton.

Yes, it is the most complex protocol I have ever seen. And it is supposed to work on embedded, with just a few tens of kilos of RAM. :)

Btw, from the meeting notes I can confirm that simply supporting UniqueId by sending back an empty string works to be added to Google Home. Happy to have independent confirmation but that's the basis for my analysis that concludes that optional attributes really are meant to be known and supported but to return some default value.

What is still unclear (IMO) is whether the code in the Matter C++ SDK you stumbled upon returns an optional value for ALL unknown attributes - across all clusters - or for some unknown reason they implemented this defaulting hack just for the BasicInfo cluster.

kedars commented 11 months ago

I confirmed that the chip-tool read on BasicInfo::UniqueId indeed return UnsupportedAttribute if explicitly read; and correctly ignores the listing, if all attributes are read through a wildcard read.

ivmarkov commented 11 months ago

I confirmed that the chip-tool read on BasicInfo::UniqueId indeed return UnsupportedAttribute if explicitly read; and correctly ignores the listing, if all attributes are read through a wildcard read.

You mean when chiptool is reading from an rs-matter-based device impl, or when chip-tool is reading from a Matter C++ SDK-based device impl?

ivmarkov commented 11 months ago

... because we kind of already know (or hope to be the case) that matter-rs would do exactly what you confirmed. The big question was/is, what the Matter C++ SDK does.

ivmarkov commented 7 months ago

@kedars I don't know what had changed in the past months, but provisioning with a Google Controller works for me again without these changes. So I'll close this...