home-assistant / architecture

Repo to discuss Home Assistant architecture
315 stars 99 forks source link

Allow contributors to (optionally) add YAML configuration on device Integrations #399

Closed nitobuendia closed 3 years ago

nitobuendia commented 4 years ago

Context

After ADR-0010, which introduces a forced deprecation of YAML for integrations which interacts with devices, a lot of critical user flows are broken:

For more details and detailed explanations on these issues, you can check this entry.

Proposal

Set UI as the mandatory configuration, but allow any Contributors to (optionally) add YAML configuration support if desired on Core components.

Consequences

Jc2k commented 4 years ago

Config flow originally existed for devices that cannot be easily configured in yaml. Devices that aren't static or are awkward to configure manually. For example, where you need to perform an oauth flow. Or home homekit_controller needs to a 6 step cryptographic exchange. How do you see that working in YAML?

I think this approach will be annoying for end users when a maintainer doesn't have capacity or willingness to implement yaml support. I think it will be distressing for volunteers when a user gets upset on a ticket (or angry and aggressive and threatening). And it might be confusing when triaging tickets - is the config flow in play or is this an integration that implements yaml as well?

One of the things that was touched on in https://github.com/home-assistant/architecture/issues/143 and https://github.com/home-assistant/architecture/issues/370 was the possibility of using the very same API's that the front end uses and provide a CLI tool that could apply yaml configuration via those API's. In those tickets i suggested something that worked a bit like kubernetes manifests and kubectl apply.

It seems like the community could even build a tool like this without needing a large ADR to pass, and many if not all of the needed API's must exist for the front end to make these devices work in the first place.

This would allow you to put your manifests in Git, to back them up, to share them. Different sections could be restored individually. The manifests could contain comments. You could run a sed over all your manifests and apply them with your cli tool to do bulk updates.

There are some compelling advantages to this CLI approach too:

nitobuendia commented 4 years ago

Hi @Jc2k, thanks for your answer and time.

First of all, my proposal here is to simply approve support YAML configuration. I am not trying to recommend in what way it should be done. For that we can open other issues. CLI is one option (I will reply to it soon). This one could be another one.

Config flow originally existed for devices that cannot be easily configured in yaml. Devices that aren't static or are awkward to configure manually. For example, where you need to perform an oauth flow. Or home homekit_controller needs to a 6 step cryptographic exchange. How do you see that working in YAML?

UI is a good way to generate those. Once generated, they can be stored back in YAML. See this proposal.

I think this approach will be annoying for end users when a maintainer doesn't have capacity or willingness to implement yaml support.

It is annoying. This is why we are complaining. My position is that having some is better than having none. Also, if the maintainer doesn't have capacity, maybe other contributors will. That's how Open Source is meant to work. Forbidding it will still annoy the end users, and also some maintainers who may want to support it.

I think it will be distressing for volunteers when a user gets upset on a ticket (or angry and aggressive and threatening).

People should be allowed to file feature requests. People threatening should be blocked. We should not take product decisions based on threats. I think the position here was that by making it forbidden officially people cannot threat. 1) They may threat for other reasons and 2) this position still states that YAML is not a requirement.

And it might be confusing when triaging tickets - is the config flow in play or is this an integration that implements yaml as well?

How was dealt with until now? One simple idea: can that be added to the bug request template? That's how other confusions are handled (like which installation method, etc).

In addition, not having YAML will also harm troubleshooting. You can read about it on this section.

This would allow you to put your manifests in Git, to back them up, to share them. Different sections could be restored individually. The manifests could contain comments. You could run a sed over all your manifests and apply them with your cli tool to do bulk updates.

This sounds like a fair solution. If I can keep my configuration.yaml and a system takes care of translating it to the format Home-Assistant needs, that's what we are asking.

As stated before, I am not proposing using method A or B, I am proposing to allow YAML to exist for all integrations. The HOW can be discussed later. CLI might be a good solution.

However, I do not think all would be solved:

  1. If CLI needs to be manually run. All configurations will still eventually break. If you acknowledge that there's a way to translate configuration.yaml into entries via the same APIs: why drop YAML support? Make this part of core, run it at start up and translate configuration.yaml into .storage.

  2. Reduced documentation for users and developers. If YAML is not officially supported, what's the schema of the YAML that should be followed? How does the CLI know what to translate into what? In other words, how does the user know what to put in all parameters?

Jc2k commented 4 years ago

The idea is that the CLI is manually run, and is like using the UI but on the command line. Remember this is using the same REST API as the UI. It is in fact a UI. This is interesting because:

I don't agree with (2). It might be that people don't write good enough documentation any more but there isn't a technical reason for bringing back yaml. Maybe there needs to be more requirements when writing docs?

Anyway, if there was a community project or official project to make a yaml tool it would obviously need documentation. That documentation needs writing for either approach.

In the CLI case remember that there is a web UI for this. It's already programmatic. What's to stop the CLI leveraging it too? E.g. It's possible that we can use the (translated) strings.json and the validation data already in the configflow forms to provide a schema.

nitobuendia commented 4 years ago

Hi @Jc2k

Is there an issue request for this CLI approach? If yes, could you link it so we can continue discussion there? If not, do you mind creating it?

Once again, here I am just proposing supporting YAML, not the how. One of the rules is "not (to) go off-topic, (and) create a new topic instead". I am very interested in debating this. (I actually have the answer to your message prepared).

Jc2k commented 4 years ago

No there isn't - the CLI idea doesn't require an ADR or even a HA issue as it's built using existing API's and can exist externally to the HA project. Whilst i did start to prototype something along these lines, it's not something i'm strongly intrested in. I am interested in empowering the people who keenly support a more traditional configuration experience, I don't have the capacity to drive it.

I think we are speaking at cross purposes, though. I don't consider that I am arguing about the how at all. Such a CLI could be built regardless of the outcome of this ADR, and even if your request was implemented. Instead, as i see it, this issue is a request to weaken ADR-0010. I strongly disagree with weakening the language in ADR-0010. I brought up the CLI not because I want to support YAML, and not because I want to argue about how to support YAML, I brought it up because it's an example of how most of your objectives could be met without weakening ADR-0010. There may be other ways, and thats fine.

I also think an approach that uses HA API's is an example of a model where many integrations could be supported with no changes to the integrations themselves.

Do you think there are any risks or challenges or things made harder by your request? You only provided positives in "Consequences".

nitobuendia commented 4 years ago

Hi @Jc2k

Thanks for replying.

If I understand your position:

  1. You acknowledge that those use cases are currently broken because of ADR-0010.
  2. Despite that, you still think that ADR-0010 is the right decision and do not think that changes to it are needed.
  3. You think that Home-Assistant should not be fixing those issues, but the Community should instead.

Is this correct?

My answer to that is (1) I think it's technically feasible, but not viable for the Community to create such a utility without contributions or collaborations of the owners of the components; and (2) even if such utility is created, it would not solve all the current user flows. I've expanded and explained this on #401 and super happy to discuss there (although @balloob has already closed it without discussion, replies are possible).

I strongly disagree with weakening the language in ADR-0010

Why? What's the risk in doing so? And what are the benefits of doing so?

Do you think there are any risks or challenges or things made harder by your request? You only provided positives in "Consequences".

Thanks for asking. This is a very good question, and I can take it a bit more broadly.

First, I recognize that there are flows that are hard without UI; and the some configurations that are hard without YAML. I think we are all aligned on this as YAML is not deprecated from all the components.

Second, I recognize that UI has many benefits and it's a good design. However, I am also aware of many user flows that are broken and many user flows which became significantly harder (see first post). The same is true for YAML only. As such, both have limits and trade-offs and implementing only one of them would have impact on the Community.

Therefore, what I am proposing is a Hybrid approach where UI is still first (and allows to follow the direction that you have for the project); while allowing YAML for those use cases where it makes more sense. A Hybrid approach complements each other and has less risks, challenges and impact to the users. Not both will be used by everyone, but having both enriches the product and allows all users to work better with the product.

As such, what I take from your question is: what's the risk or challenge of maintaining both. I have touched a bit on the topic and counter argumentation on this post. Not to deviate the conversation, I think there's only two drawbacks to a hybrid approach:

  1. It could lead to higher cost of maintenance. My perspective here is that this does not need to be the case if we share as much infrastructure as possible. This is where proposal #400 is relevant, but there could be many other approaches, including your CLI #401 proposal. To make it even more so, this proposal is saying that this would be optional. Finally, I think making it easier for Developers while hard for Users is a bad approach to programming.

  2. It could lead to confusion if some components has and some do not. My understanding is that it will still be truth for components that do not interact with devices. In addition, this could be solve in 2 other ways:

    • UI by default for everyone; this makes it easy for newcomers that are not exposed to YAML. Only advanced users, who are aware of the ecosystem and want to have advanced controls would enable YAML and deal with the pros and cons that it carries.

    • Documentation. The integrations page has today whether components are Cloud or Local and Polling vs Pushing. The same could be done for configuration: UI, Discovery, YAML, Both, etc...

All in all, I think these risks are not significant enough and can be resolved easily and the cost of not doing so (explained on the first post) are significantly higher for the community.

Could I revert the question? What are the risks that you see of optionally adding YAML support?

thomasloven commented 4 years ago
  • All current configurations will eventually break, forcing all users to migrate their existing configurations as the UI-only configuration is rolled out to more integrations.

This is a good point, assuming yaml configurations never break, which - if history is any indication is not even remotely true. Otherwise it's completely moot.

  • No easy bulk addition or support. (e.g. How do I edit my devices IP after changing a subnet or usernames after a branding change or marriage?)

This is a good point, assuming you have enough different accounts that it's actually a lot of work to change them... every time you marry... and assuming your new username is not taken for any one of them already.

I have read your other examples of events where this might be relevant in the linked article, and they all boil down to that whenever you decide to restructure your entire setup, you should be able to do so without also putting a bit of work into your configuration(?) - all the while assuming that it would work immediately just because you type the names in yourself and that this isn't actually made easier by automated discovery.

Bulk addition, sure. But I don't think that's worth the trouble, and personally I strongly discourage adding lots and lots of devices at once without testing in between anyway.

  • No partial versioning of components. (e.g. How do I see my device configuration 1 month ago, as well as the reasons and dates for the changes thereafter?)

The type of integrations where yaml will not be an options are of the type were you only specify things like an ip-address and a username/password combination or an API key or a /dev device or so. Partial versioning would be useless, since it doesn't really make sense to go back on it anyway.

  • No flexible backups and restoring. (e.g. How do I restore my Media Players to 1 week ago?)

Discovery takes care of making sure the media players available on your network are also available in hass. For the few cases where automated discovery is not viable, it has been clearly communicated that exceptions can and will be made.

  • Reduced shareability for integrations and related entities. (e.g. How do I learn from other's configuration?)

The integrations where yaml will not be used are the ones where the setup is entirely unique for exactly your situation an noone elses.

Compare, for example, the hardly inspiring or least bit informative:

deconz:
  host: 127.3.5.278
  api_key: 880a5257dbc241b89e37798df60b9e59

With this quote places in a readme file

I'm using a few IKEA Trådfri-bulbs for my outdoors ligting. The ones at the front of my house are full-color ones, so I can set them according to the current holiday, but the ones at the back are the cheaper white ones (neighbors can't see them anyway).

Or you could add it as a comment in the automations and scripts using the things, together with the inspirational and creative parts.

As I see it, using other peoples integrations configuration for inspiration only makes sense in two cases:

  1. The documentation is unclear and you need to see how someone else set it up. Keeping yaml around is NOT the right solution to that problem.
  2. Seeing how someone has cleverly used node anchors or third-party-tools to work around and avoid the limitations caused by using yaml in the first place. Hardly a compelling argument.
  • Increased difficulty in troubleshooting. (e.g. How do I restore my system when a component has an issue an UI is not loading?)

Quite the opposite. By disconnecting integrations from the configuration file, you eliminate that as a potential single point of failure.

  • Reduced documentation for users and developers.

What? Why? If anything, the (more) self-documenting nature of the GUI configuration makes things easier, because there's less risk of the documentation becoming outdated.


For context. I gave up on Home Assistant entirely twice for two reasons - yaml and the constant restarting.

Then I got back to it, and absolutely loved it. I then became a huge yaml proponent.

I've written several articles on yaml and it's use in Home Assistant (1, 2). As soon as it was mentioned that yaml might be removed as an option for the lovelace configuration, I immediately created lovelace-gen as a third party tool to let those who wanted it keep it. I have made additions to the built-in yaml parser.

Then ADR-0010 came along, and was discussed - at length - among the frequent core contributors. All your points, and many many many more were raised during those discussions and by the end we came to an agreement that this was in fact the best for the future of Home Assistant and - ultimately - for the users.

Having worked with the source, and in the development of integrations, I am fully committed to this approach. The integrations where yaml will not be a configuration alternative are the ones where it does not make sense as a configuration alternative, and where adding it would mean an increased implementation and maintenance burden.

Yes, it does require some changed workflows. Yes, it will cause a few problems in the course of solving lots and lots of others. But in the end, it's just a question about your approach to a new technology and to potential challenges.

I've had problems with e.g. my deconz docker container. Every time I restarted it, it would get a new IP in the internal network, and since the deconz integration doesn't allow setting a host name, but instead auto-discovers an IP it would take a minute for it to find it again. I tried to fight it for a while, then I decided that maybe that was an indication that something wasn't quite right and tweaked my setup a bit to fix the address. As a result, my entire network got mor stable, and me and the deconz integration maintainer are discussing a better discovery solution which will also help others.


Also, I do believe in the idea of a cli utility or script to handle initial bulk setup of components, but I do not think that should be a core part of Home Assistant since that would undermine the vision of the project. Maybe you could also make it auto-install HACS and pull things from there? That would be sweet. But that's OT.

nitobuendia commented 4 years ago

Hi @thomasloven ,

First and foremost, thanks for putting the time in reading, understanding and replying. I think it's the first time that I feel that someone actually replies after fully reading and addressing the concerns; rather than just quickly closing or dismissing the request. Kudos!

I believe that some answers stick too much to the examples, rather than the overall issue that is being communicated. Do note that this may be due to my original post structure, where I may have omitted details compared to the longer version, and not your answer. However, let me address some of those points below, which maybe it helps to bring my point across a bit better in some areas.

I will divide the points by sections to make them easier to navigate.


On the decision process

Then ADR-0010 came along, and was discussed - at length - among the frequent core contributors. All your points, and many many many more were raised during those discussions and by the end we came to an agreement that this was in fact the best for the future of Home Assistant and - ultimately - for the users.

Honestly, I think this is part of the problem. We do not know what you have considered and what you have not, or why those considerations have been dismissed. Should this have all been done in public on this context (as other ADR posts are), there would not be people raising these concerns "again". For us, they were never raised or considered in the first place; or, if they were, the arguments provided felt insufficient.

Note that I am not saying that you should have consulted us, or anyone. Just saying that if those conversations were public, we could just read them and understand. The blog post (and podcast) do not feel enough to bring full transparency on the decision; but this is another topic.

Think of us as users who trust Home-Assistant (and you!) enough to put it in our homes, to control part of our lives. Then, you may understand how worrisome it is to imagine what other decisions might be being taken in "closed doors". But let's stay focused on the topic of discussion.

On the decision

The integrations where yaml will not be a configuration alternative are the ones where it does not make sense as a configuration alternative, and where adding it would mean an increased implementation and maintenance burden.

To be sure I understand this. On ADR-0010, this stated as follows:

Integrations that communicate with devices and/or services are configured via a config flow. These rules apply to all new integrations. Existing integrations that should not have a YAML configuration, are allowed and encouraged to implement a configuration flow and remove YAML support. Changes to existing YAML configuration for these same existing integrations, will no longer be accepted.

For me your wording and this wording do not match; unless you mean that YAML does not make sense in any integration that communicate with devices. If that's the case, I am struggling to understand what's the problem on configurations like this:

cast:
  media_player:
    # Google Home.
    - host: !secret google_home_ip
    - ignore_cec: !secret google_home_ip

What's the problem? Why does it not make sense? I can choose which Cast devices to add and which not. All my IPs are centralized, I can check or change them in bulk; or remove items easily. I have comments telling me which device is which and what set up I have (in this case, not much, in Hue or KNX, a bit more).

If I think for other set ups, where there's interaction with devices like KNX, I cannot imagine how nightmare-ish this would be to configure it in the UI.

On the trade-offs

Yes, it does require some changed workflows. Yes, it will cause a few problems in the course of solving lots and lots of others. But in the end, it's just a question about your approach to a new technology and to potential challenges.

I agree that there are trade-offs with both approaches. However, I am still trying to understand what's the biggest roadblock to support both. Not just YAML, which has a lot of limitations; but both, similar to Lovelace Dashboard.

More importantly, I do not understand why this is not empowering component owners to decide. YAML setup is still needed for other type of integrations where it makes sense. As such, the "core code" will still support the "YAML config flow" (not sure of a better name). Why not allow owners who want to go the extra-mile and spend their time to do it? Obviously, optionally and not required. This may also work in favour of the decision, if no one contributor wants to add it and support it; then it reinforces the point that UI is the way to go anyway.

In any case, I still didn't get a clear direct answer on why not to support both. However, so far the most highlighted reasons seem to be the cost of maintenance and some harassment for lacking YAML in certain components.

The first is what I am trying to figure it out and find solutions. I do not think it needs to be that way. Maybe there are solutions to the cost problem. I've proposed some (e.g. #400), and I am sure that more people could brainstorm on the actual challenges. However, beyond "cost of maintenance", there's no detail on what exactly is costly (is it documentation? is it the schema? is it the set up flow?).

The latter, I think it can still be handled since the direction is "YAML is not a requirement", and blocking people who harass and should not be welcomed. Definitely, a horrible thing that should be addressed, but not one upon which product decisions should be taken, in my opinion.

On the alternatives

Also, I do believe in the idea of a cli utility or script to handle initial bulk setup of components, but I do not think that should be a core part of Home Assistant since that would undermine the vision of the project.

I've opened a thread to discuss the CLI alternative (see #401). I've addressed my view there too. Of course, your view would be appreciated, if you want to add it and discuss more. (Although "discussing" might be not be accurate since the issue is already closed with a decision).

Other proposals and viable alternatives is what we are looking for.

On the breakage of integrations

All current configurations will eventually break, forcing all users to migrate their existing configurations as the UI-only configuration is rolled out to more integrations.

This is a good point, assuming yaml configurations never break, which - if history is any indication is not even remotely true. Otherwise it's completely moot.

I think I may not have expressed myself clearly here.

Yes, YAML breaks; and I expect that to happen. If you do not want it to break the system, use the UI only. YAML is not meant for everyone. The same with Lovelace dashboards. I can make changes and break the UI. If I do not want that, I can stick withe UI flow.

This is not my point at all. I am talking about posts like this one where the user configuration no longer works.

Now, re-listening to the podcast, I think I was wrong on this part of my statements. My understanding was that, if the YAML config is removed from a component, the people who has that component will have to re-configure it in the UI.

Apparently, we are doing migrations from YAML to entities in the first place. Is that correct? If yes, why are we not permanently doing this? If not, then the YAML-configured components that do not get migrated will break for all the users (i.e. forced migration for everyone).

On bulk editing

I have read your other examples of events where this might be relevant in the linked article, and they all boil down to that whenever you decide to restructure your entire setup.

That's mostly correct. It's bulk edit. It doesn't need to be the entire set up (e.g. not all my components use username or IPs, sticking with the same examples).

In any case, the IP and username changes is just a subset of all the potential bulk changes that you may want to do. I could keep adding use cases: renaming all my entities (because I want a new standard), moving houses, security issue and having to change passwords (although I understand that passwords are no longer stored; in any component at all?). I cannot think of all use cases, but they exist.

you should be able to do so without also putting a bit of work into your configuration(?)

Example: How do you suggest today that I find all the places where my username is X and I need to change it?

all the while assuming that it would work immediately just because you type the names in yourself and that this isn't actually made easier by automated discovery.

This would depend on how the automated discovery works. For example, if I change my PS4 network to a different IP; would it detect the new IP, understand that it is the same system, auto-update the infra without any feedback from my side? I have read the component and I didn't see code to handle that, but I might be wrong.

However, on my component (where I added YAML), I have host field, I can simply change it and since it's the same PS4, it uses the same PIN and Region, it will just work.

Also, automated discovery cannot get data that needs to be done by the user. An auto-discovery entity may pick up the new IP and work it out. However, what about the usernames?

On partial versioning and restoring

I am combining this two as, for me, they go a bit hand by hand in terms of the similarity of root cause.

The type of integrations where yaml will not be an options are of the type were you only specify things like an ip-address and a username/password combination or an API key or a /dev device or so.

That's not my understanding and I am probably wrong and would love to understand better. I've addressed it on the "On the decision" section and perhaps getting more clarity would help. Nonetheless, I do feel that a change of API key, or IP is still relevant to document and be able to revert.

Additionally, the versioning is not just about the set up itself, but also about which components are present.

Discovery takes care of making sure the media players available on your network are also available in hass.

This is solution for a different problem. This helps making sure that a component keeps working and the devices are set up easily and brought back. It does not solve partial versioning or restoring.

For example, imagine I am building my Automation system and I add 1-2 components per week over the period of a month in which I am working on this. At some point, the system starts being slow; I can check what components were added over this period and could be causing it. I can go back and restore (i.e. remove or disable components) without having to do a full backup that maybe will remove some automations, or template sensors that I added in the meantime.

In other words, when talking about versioning and restoring; it's not about knowing what API key or IP was there a week ago. It is about knowing exactly what components I had, what I changed, when I did it and why I did it. This way, I can restore the components to a week ago, except this new one component that I know is fully safe; while I keep my new automations, and scripts, which are not related to these components.

All this is taken away today. If my system breaks, or I start having issues, or conflicting components; I do not know what I changed (related to these storage entities), when I changed it and my only solution is a full restore to the past without knowing to when.

On shareability

The integrations where yaml will not be used are the ones where the setup is entirely unique for exactly your situation an noone elses.

Similar to the point before. It's not about knowing that your deconz API is 127.3.5.278. It's about knowing that you have deconz. Then searching what deconz is. Realizing that "wow! that's a cool one, I also want that!". Then buying my "ConBee", integrating it in my system and solving a problem.

It is also about me seeing an automation rule that looks interesting, but not having any visibility on where that binary_sensor that is being used comes from. If I had the YAML I can see that this comes from your whatever component and I can infer that it's a motion sensor device and thinking "wow! I could do the same on my living room".

The only way to do this today is if someone goes the extra mile to showcase and manually explain what systems they are using; which definitely will be way lesser than the people who will just simply "share" openly their configuration.yaml.

On troubleshooting

Increased difficulty in troubleshooting. (e.g. How do I restore my system when a component has an issue an UI is not loading?)

Quite the opposite. By disconnecting integrations from the configuration file, you eliminate that as a potential single point of failure.

You're right in that we would be eliminating issues coming from bad configurations. This is really good.

However, that's not all the issues that you may encounter. Let me expand on the nmap use cases a bit more. I've personally mentioned two on the argumentation. Other people would have others. As such, take this just as an example of the myriad of possibilities that can arise to a system that's UI only and can eventually have a bug.

nmap blocking start up.

I have this problem where one day after updating, my system UI would not start up. I didn't do any changes on my configuration at all. I checked all the breaking changes, none of them affected my components. Yet, I could not start my system.

While the bug is still open, this is most likely not a configuration issue. It seems a problem with either the component or my network set up, not the configuration. I think so, because (1) there were no configuration changes for a month before breaking, (2) it was just one minor version update with no breaking changes and (3) I had added a new switch about 1-2 weeks before.

What did I do to fix it? Binary search: I disabled half of my components and see if the system starts. If it did, the faulty component would be on that half. So I re-enabled it and disable half of that half. This way, I identified that nmap was blocking the start up. Therefore, I disabled it and reported an issue to the component owner.

If YAML was not an option, I would have been stuck with a broken system. I cannot disable components if my UI is not loading! My only way to fix it would be formatting the whole system and restoring a snapshot. The last snapshot would not work and I would need to restore to an older one, but that might take a few rounds as I have 0 knowledge on what's causing the issue.

Now, nmap might be one of those components with YAML, but this could happen in any other component in the future. Bugs happen. No matter how hard we test, they happen and this will happen again in the future with UI or YAML set up.

If my UI breaks, what's the expected workflow to solve problems?

On documentation

If anything, the (more) self-documenting nature of the GUI configuration makes things easier, because there's less risk of the documentation becoming outdated.

Yes, this is true. Indeed, one of my proposals based on experiments is to use the UI to actually do the flows. Trust me. I really think the UI is the best way to configure components for the first time. On this experiment, I actually use the UI to generate the data needed, and then spit out YAML or simply the schema values that you can put on your configuration.yaml (if that's what you want).

However, after that initial configuration; once you know what information you need, or once you just need to make changes, UI is slower.

We are not saying UI is bad. I love the UI. It really is progress and great. I am complaining about UI only!

Reduced documentation for users and developers.

What? Why?

Imagine for a moment, I want to create the CLI that we are discussing on #401 . I need to know what data I must pass for configuration. Today, on YAML this exists as schemas on the validation. Tomorrow, this would not be there.

Let me give an example for developers and an example for users.

For developers: PS4 component/device requires host/ip, region, pin (and name). How I am supposed to know this information? I am not talking about adding it, this will be guided through the process. It's in case I want to create a solution (e.g. CLI) to add this component.

It's not documented on the integration doc. The component has no top level schemas. The only place where this information might be inferred is from the data schema on the form. Or I may need to pick up from the device definition, but then realize that entry_id is also part of it (auto-generated obviously). I also have no sense of what those fields mean.

This may work for PS4. But then, you go to Hue component and that schema is not exactly there. (Obviously here is fine, because YAML is still supported if only encouraged to be removed).

And if we go to other components we may find other ways to understand the schema that needs to be passed. With YAML, all these fields were documented, there were clear schemas with optional and required parameters, etc.

For users this will be fine (actually, better), but for developers this documentation feels less clear. This applies both to create custom solutions (e.g. CLI) or contribute to the component.

For users: When setting up Hue via config_flow, it sets allow_hue_groups to False by default. After going through the flow, I was never asked about it. Moreover, I actually have no idea about what this parameter does.

Now, if today I read the documentation, I know exactly what it means and what it does. Moreover, I can change it and configure during my set up. UI might be simpler, but documentation seems worsening based on what I see.

there's less risk of the documentation becoming outdated.

Or, you define a schema, that works in the UI and can also work for YAML. Or you show the data to the user to confirm it. And then, you do not need to manually document (with the risk of becoming outdated); but also you are enabling YAML configurations preventing all the problems mentioned above.


Thanks for reading and taking the time!

Jc2k commented 4 years ago

Thank you for all the thought you have put into your answers.

First of all, please take this next bit with a grain of salt. Whilst I have been contributing to Home Assistant for a while and have contributed to the core config entries code I am currently short on time to write a response as throughly as you have. In particular, it's been a while since I worked with the legacy component setup entrypoints.

Putting aside your use cases for a second, I wanted to make sure we were all on the same page regarding what this request actaully means behind the scenes.

The preferred way for a component to be setup these days is via async_setup_entry. This takes a config entry as its second parameter. There isn't currently a config entry in the case of YAML configuration, so the older async_setup_platform method is used. I believe you are using setup_platform currently?

I imagine, though I must be very clear that I do not speak for the HA team, that the long term goal made possible by ADR-0010 is probably to remove the older entrypoints (or at least stop creating entities that aren't attached to a config flow from there) and simplify the code accordingly. AIUI if that were to happen the current method for supporting YAML configuration would not work. To my knoweldge this hasn't happened yet, but it's obviously one benefit of having dropped YAML configuration for this part of Home Assistant.

You may protest that this hasn't happened yet, so is not a good reason. So let's assume that supporting both entrypoints is free and there is no overhead for the devs in supporting them both going forward.

While they continue to exist these component setup methods are still not equivalent. If a config entry does not exist then the component opts out of some Home Assistant APIs, such as the device registry. And i believe some features of the entity registry are tied to the device registry. Not sure if the entity registry does partially work without the device registry?

In some cases it may be difficult for some integrations to share code between the 2 entrypoint functions. For example the async_forward_entry method which must be called for each entity type your component supports should only be called from the current async_setup_entry method. So we potentially end up with each component maintainer having to maintain 2 different entrypoints for their integration. Even if there can be some code sared, they both still need test coverage (unit and actually end to end testing).

Then the user will have a different experience depending on whether they are on teaml YAML or team UI.

There are probably others, I have not had time to look explicitly.

Some of the differences you likely don't care about youself. That's fine, but you may have trouble getting traction for this request with team members if thats the case.

I always do a pair/unpair test when working on my integration before submitting a PR. This proposal would double that test cycle for me and any other vigilant developer.

I think we would have to make it clear to users which features they were opting out of. As if the message in the team YAML community was that manual configuration was better, they would be surprised to find some functionality was different, disabled or missing.

I think one consideration the team will have is for the size of the test matrix. Hopefully you can see that because of the way the difference between these 2 entrypoints ripple between componets we do end up with a sizeable test matrix by supporting YAML via the existing mechanism. While this test matrix does exist today to some extent it is slowly being reduced as integrations are moved to config flow.

My personal opinion is that any attempt to allow YAML configuration to be available for setting up integrations must work by driving the config flow machinery so that there is only the async_setup_entry method and this split version of Home Assistant that we currently have must go away.

nitobuendia commented 4 years ago

Hi @Jc2k

I am currently short on time to write a response as throughly as you have

I, for one, really appreciate your comments and thoughts and do find them instructive. Thanks for spending the time.

regarding what this request actually means behind the scenes.

In general, the request is to simply optionally allow adding YAML. The how to do it best and how to support it, it's a different question. Different solutions might be best suited. My top proposal is #400, but would love to understand caveats or alternative approaches like the one you described (and I copied to #401).

The preferred way for a component to be setup these days is via async_setup_entry. This takes a config entry as its second parameter. There isn't currently a config entry in the case of YAML configuration, so the older async_setup_platform method is used. I believe you are using setup_platform currently?

As some people have pointed out, I have not contribute code to the core or the components there. Hence, I will reply from my experience playing with custom components and the experiments to add YAML support. As such, if something is wrong, I hope that someone from HA collaborators points it out.

My understanding is that setup_platform and async_setup_platform are the ways to implement YAML config entries. The latter being the async version of the first; and, therefore, preferred way (reference).

As such, this conversation is only async_setup_entry for UI entities vs async_setup_platform for YAML config.

I imagine, though I must be very clear that I do not speak for the HA team, that the long term goal made possible by ADR-0010 is probably to remove the older entrypoints (or at least stop creating entities that aren't attached to a config flow from there) and simplify the code accordingly. AIUI if that were to happen the current method for supporting YAML configuration would not work. To my knoweldge this hasn't happened yet, but it's obviously one benefit of having dropped YAML configuration for this part of Home Assistant.

I mostly agree, but that's not my understanding.

Deprecating YAML would make sense if the objective was simplifying the code base as well (i.e. remove async_setup_platform from the core). However, that's not what I understand that will happen in the short run.

My understanding is that (1) there are components that do not interact with devices and/or services which may still be configured via YAML, (2) for the ones which do, there might be exceptions made (unclear, but mentioned) and (3) there are custom components which are not required.

As such, it doesn't seem like the core logic will deprecate async_setup_platform. A clarification from HA "team" (core contributors?) would be appreciated.

This being said, this is one of the fears I have (and I think some do). In the past, we said that "YAML will not be deprecated". Then we said "YAML is not deprecated, but" (components with devices can't use it). The direction seems to be towards deprecating YAML from any place where UI is at par or above. However, this is speculation, but part of the problem and lack of transparency (debating in closed doors).

In some cases it may be difficult for some integrations to share code between the 2 entrypoint functions. For example the async_forward_entry method which must be called for each entity type your component supports should only be called from the current async_setup_entry method.

Could you elaborate on this and add an actual example? I would love to learn more of those and how we can address them.

So we potentially end up with each component maintainer having to maintain 2 different entrypoints for their integration.

This is where I think we need better solutions. Cast is actually doing that today; and I was able to replicate it on PS4 as well. This has been expanded on #400.

Even if there can be some code shared, they both still need test coverage (unit and actually end to end testing).

Unit Tests should be self contained. If the only functionality of async_setup_platform is to structure the data and call async_setup_entry, then you should only test that async_setup_entry is called with the right parameters. You should not be testing that the other party is doing its job. This is a super minimal check. The Unit Tests of async_setup_entry would already apply to async_setup_platform.

If this is not the case, and both async_setup_platform and async_setup_entry rely on the same internal function and flow; then, yes, you need to have exactly the same test twice. However, one would be a copy of the other and only the input data would change.

Alternatively, you could make sure that async_setup_platform and async_setup_entry both call your_helper_config_flow with the right data structure, and then simply test your_helper_config_flow. However, testing private methods is not a good unit test practice and I would not encourage to do this unless it's the last resort. Private methods should be tested through public ones, which would be async_setup_platform and async_setup_entry.

PS: I am basing this on my experience as a developer and as best practices that we use. Rules may change from company to company or project to project, but some of these Best Practices apply throughout.

Then the user will have a different experience depending on whether they are on teaml YAML or team UI.

My question would be: can we avoid that in a way that it's not costly for the Developers or confusing for the Users? I hope the answer is yes.

As the device registry is unavailable some components will provide a degraded experience. E.g. HomeKit will export less information about devices to iOS devices as it cannot fetch it a device is configured using YAML. Another example is HomeKit won't automatically link a battery and a battery charging status to an entity when a device is configured in YAML, even though it would be able to do it automatically for the config flow case.

I am not familiar with HomeKit integration. Why cannot it be fetched? Also, could this work like Hue + YAML today?

HomeKit is storing certain data on JSON format, what's the data that cannot be stored on YAML and passed the same way? If it's dynamic data, I think it's fine to store that in other ways (i.e. .storage) as kinda happened in the past with tokens and similar information. At least the static information is there.

The device registry won't be available in the UI for YAML integrations.

Is the device registry: Configuration > Devices? If yes, all my components today are YAML configured, and I can see all of them in the device registry.

Screenshot 2020-06-23 at 19 42 11

Am I missing something? For me, this would be a minor issue compared to the broken user flows. If you are an Advanced User and decide to go with YAML, you should not expect to be able to manage your integrations and devices in the UI. If that's your preference, use UI.

Device automations won't be available in the UI for YAML integrations.

The users has opted-in for YAML. They should be using YMAL for the automations.

This being said, same as with the previous point, I see all my devices in Automations.

Screenshot 2020-06-23 at 19 45 29

Moreover, I can also see my PS4 device. This PS4 device comes from this custom_component where I literally call async_setup_entry from async_setup_platform:


async def async_setup_platform(
        hass, config, async_add_entities, discovery_info=None):
  """Loads configuration and delegates to official integration."""
  # Load configuration.yaml
  host = config.get(const.CONF_HOST)
  name = config.get(const.CONF_NAME, const.DEFAULT_NAME)
  region = config.get(const.CONF_REGION)
  token = config.get(const.CONF_TOKEN)

  # Format it in the new config_entry.yaml format
  config_entry = Config(
      util.slugify(name),
      {
          const.CONF_TOKEN: token,
          const.CONF_DEVICES: [
              {
                  const.CONF_HOST: host,
                  const.CONF_NAME: name,
                  const.CONF_REGION: region,
              },
          ],
      }
  )

  await ps4_media_player.async_setup_entry(
      hass, config_entry, async_add_entities)

  return True
Screenshot 2020-06-23 at 19 46 11

Am I missing something?

You don't get to see the list of integrations you have configured in the integrations list.

Checked. All my integrations are configured via YAML, and I see them all in Integrations. I do not see the PS4 one, though, so maybe I am missing some setup, but it seems feasible as it's happening now.

UI features like "Related" won't be able to use the device registry to find related devices.

I am not familiar with this feature. Could you kindly elaborate? In any case, I think it would be the same: if you're configuring things via YAML, I think you're no expected to benefit from UI.

I always do a pair/unpair test when working on my integration before submitting a PR. This proposal would double that test cycle for me and any other vigilant developer.

Fair. Unit tests are automated, so there's no difference there. Integration tests is a different story. Not sure what would be a good solutions to Integration Tests, but happy to discuss.

I think we would have to make it clear to users which features they were opting out of. As if the message in the team YAML community was that manual configuration was better, they would be surprised to find some functionality was different, disabled or missing.

All in for this. Similar to Lovelace Dashboards. If you want easy and UI, use UI configuration. If you want to use YAML, this is what you're opting in for.

I think one consideration the team will have is for the size of the test matrix. Hopefully you can see that because of the way the difference between these 2 entrypoints ripple between componets we do end up with a sizeable test matrix by supporting YAML via the existing mechanism. While this test matrix does exist today to some extent it is slowly being reduced as integrations are moved to config flow.

What do you mean by "test matrix"? We've touched on Unit Tests and Integration Tests; is this something different?

My personal opinion is that any attempt to allow YAML configuration to be available for setting up integrations must work by driving the config flow machinery so that there is only the async_setup_entry method and this split version of Home Assistant that we currently have must go away.

Aligned. While this is about the if, and not the what, in an ideal scenario, for me this would work like this:

For users

For developers

This is what I was trying to accomplish on #400.

Jc2k commented 4 years ago

I’m away from the computer today - wedding anniversary trumps GitHub exchanges I’m afraid, as much as I enjoy your replies.

You are very closing to my starting point. Remember I’m not trying to debate you about this. I’m trying to make sure we have the same understanding of the config entry machinery to facilitate a solution. I am not trying to win internet points, I’m trying to convince myself we both understand the relevant HA machinery adequately and if you don’t fill in the gaps. You’ve spent a lot of time thinking about this so I’m not opposed to learning a thing or too.

I think if you have another little look you’ll get it this time. If not I can try and improve my explanation in a few days. I don’t have time to go into detail about it but the 2 things I can say are:

The big clue is that most show on the integrations page. That page shows config flows. This implies that that they the yaml is parsed into a config entry. Remember I’m trying to make sure that we are talking about the same thing. An entity created directly from setup_platform does not have a config flow. That is the old behaviour. But if it instead creates a config entry then the entity is created from the config flow machinery. That means it gets all the APIs I mentioned. That is a very different thing and we can talk about that after we have finished this discussion.

Apologies for using confusing terms like a test matrix. At a previous job I hung out with a really good QA team socially. Testing is a multi dimensional matrix and you need to consider requests that cause an explosion in the amount of testing that you need to do. If your request is implemented badly it will. If it leverages the config flow API the adds very little extra testing.

OnFreund commented 4 years ago

Sorry to interject here and not opine on the bigger topic at hand, but just wanted to clear what I think is a misunderstanding - @nitobuendia I think what you're seeing in your environment is a result of:

nitobuendia commented 4 years ago

Hi @OnFreund

Thanks for the input. Aligned on both:

Some of your YAML configured integrations have been ported over to config entries (e.g. Harmony). While you still have the YAML, I believe it's ignored, and changes to it won't be picked up (unless you change the host, in which case a new entry will be created but the old one will still be around).

I think this is the case. If we have a porting mechanism, can it be reused?

You also seem to be right on the changes. I ran an experiment with Harmony and delay_secs param. Integration had 0.4, which is the default value. Changed YAML to delay_secs: 1 and UI did not update (still shows 0.4). So either the UI is not reflecting the change or (obviously more likely), as you stated, the YAML changes are being ignored. I wasn't aware of this.

Unrelated, but: is this working as intended? Harmony still supports YAML as per documentation and code. I also have not seen any notifications on my system that this is deprecated or the entity has been migrated (I could have missed them). This does not feel as working as intended. What am I missing?

With the PS4 integration, you're passing a ConfigEntry-like object to async_setup_entry. While this works in your case, there's no guarantee that it will continue to, or that it'll work in other cases (for example, if the code will try to update the config entry, etc...)

Completely aligned on that. That's an experiment to showcase that it's technically feasible. However, making this in a way that makes sense and will work (not break), it's more the realm of #400 .


Hi @Jc2k

Apologies for using confusing terms like a test matrix. At a previous job I hung out with a really good QA team socially. Testing is a multi dimensional matrix and you need to consider requests that cause an explosion in the amount of testing that you need to do. If your request is implemented badly it will. If it leverages the config flow API the adds very little extra testing.

We've touched on Unit Tests and how to keep them under control on the previous message.

The big clue is that most show on the integrations page. That page shows config flows. This implies that that they the yaml is parsed into a config entry. Remember I’m trying to make sure that we are talking about the same thing. An entity created directly from setup_platform does not have a config flow. That is the old behaviour.

... Clue? I understood as much. My questions were on the HomeKit use cases.

But if it instead creates a config entry then the entity is created from the config flow machinery. That means it gets all the APIs I mentioned.

This similar to what I am proposing in #400. Parse YAML and pass it to the current entry points (API-like, but using the methods directly). Final result is the same, way to reach it slightly different.

I know your proposal is using a CLI #401. I've opened it for discussions, but because I've opened without your consent you were not willing to engage. In any case, my summary reply is that this not work because (1) the approach will not work if UI breaks (which is one of the concerns to be able to disable components); (2) the parameters to pass to each component are not documented (for non-YAML entries) and (3) since you are suggesting this as a Community effort, since it's not an official solution, that data can change at any point breaking the CLI without notification.

I think if you have another little look you’ll get it this time.

Or you can wait until you have time and actually explain what your point is or where you think I am not discussing (ref).

Jc2k commented 4 years ago

Hi i do not know when I will have sufficient free time to ever reply with the same amount as care and depth as you (and I must thank you for your continued patience). It's unlikely to be in the next few weeks. But I do understand that this topic is very important to you and I don't want it to stall because I am too busy gardening with my family.

I do think we are still at cross purposes, and thats probably hampering our discourse somewhat. So I will try to be explicit: If at all possible, please for the time being ignore my initial objection and please ignore #401. I understand that you may be keen to explore these, but at the moment they are beside the point. I am not currently trying to debate you. I am not trying to win internet points. Right now I am trying to act in good faith ref to help you present your idea in the depth that will likely be required to be convincing. I simply want to explore your idea in full collabaratively. Like any sufficiently senior developer I can put my misgivings to one side and look at the facts objectively.

Both of your supporting tickets are now closed, but unfortunately I do think we need to cover the "how" for something like this to be an ADR. Supporting YAML by the legacy mechanism or the config entry mechansim is very different. So one piece of consensus I was trying to achieve was that the legacy mechanism was out of scope.

The last paragraph of my recent replies was:

My personal opinion is that any attempt to allow YAML configuration to be available for setting up integrations must work by driving the config flow machinery so that there is only the async_setup_entry method and this split version of Home Assistant that we currently have must go away.

That was the most important paragraph in the entire piece. Everything else was really just flowery context leading up to it. You did seem to agree with this, and that's exactly what I expected from you because you seem to be an experienced developer like me and have invested in this quite a lot of your time into this. But you also spent quite a bit of time refuing most of my examples of what happens when there is no config entry. Unfortuntely this lead to a massive misunderstanding - I thought you had misunderstood how config entries worked and you must have thought I was either an idiot or lying about currentlYAML limitations.

So to be clear. The examples I gave are of what happens when there is no config entry. Both HomeKit examples I gave are things that rely on the device registry, and so it will not work without a config entry.

With that misunderstanding cleared up hopefully now we can simply agree or disagree with the statement - any yaml configuration mechanism must be via config entries.

Btw, part of your last answer to me was to reiterate why you think #401 is a bad idea. I am unsure why you felt the need to bring it up again right now. I know you are keen to debate it, and I hope that is the only reason. But as you yourself have said repeated;y this ticket is not an appropriate place. It remains the case that I don't think #401 is even an implementation of #399, and so let me reiterate that I won't be discussing it in that context.

OnFreund commented 4 years ago

So either the UI is not reflecting the change or (obviously more likely), as you stated, the YAML changes are being ignored.

If I'm reading the code correctly, the latter is correct.

Harmony still supports YAML as per documentation

Technically speaking, the documentation mentions "manually configure the device", but nothing about updating that config (contrast with the section describing the UI configuration case, where it specifically mentions that you can later adjust some options). I think we can say it's technically correct, but could be misleading, and would encourage you to submit a PR to the docs.

I also have not seen any notifications on my system that this is deprecated or the entity has been migrated

Some integrations specifically mention that the configuration was imported from YAML, but in any case, the existence of a the integration on the integrations configurations page is an indication that it was imported (for integrations you haven't explicitly configured through the UI).

There is a common pattern for importing the YAML configuration into a config entry (you can search for async_step_import and will find quite a lot of integrations doing that), but AFAIK it still requires potentially meaningful work on the integration side, especially for integrations that can have multiple instances and need some identification mechanism (and in some cases, YAML configurations can't have unique IDs at all).

nitobuendia commented 4 years ago

Hi @Jc2k

Thanks for the clarifications.

I think we are aligned in that any YAML-like solution would be best implemented by relying on the "UI" config entry mechanisms. Both because it might be better off for users; as well as lower maintenance for developers.

I also personally think that those who opt-in for YAML are not interested in UI. As such, I do consider a fair trade-off that if a user configures a component via YAML, they should not expect to have those integrations/devices in the UI, or able to use them on Automations on the UI. As a user, you would need to make a decision between the easiness of the UI or the flexibility of YAML for each of your components (and assuming you can do integration A in UI and B in YAML). This trade-off is better than not having YAML option at all; and the costs described in the original post. (Of course, if we can come up with a YAML system that supports devices in UI, etc, it would still be better than if it doesn't; and therefore reinforces the same point of re-using the UI config flow mechanisms).

I also do think that the details on how such a thing would work are very important; mostly for developers and technical feasibility, but also partially to ensure that the user flows are fixed. For example, implementing something that must rely on an HTTP endpoint would not work for me, as it won't work if there's a bug and my UI breaks. In any case, because the details matter, I keep bringing #400 and #401 (and will bring any others, if anyone brings other suggestions). While I think the topic is to be had separately, I do feel it's a discussion that must happen before or in parallel to approving #399.

I am not trying to win internet points either. I just want to be happy with a system to whom I've trusted a big chunk of my daily life management; and, for me, that includes YAML configurations. And I am sure it does for other people too, based on the Community discussions that I had.


Hi @OnFreund

Super insightful. Learning a lot. Thanks.

If I'm reading the code correctly, the latter is correct.

Yes. That makes sense.

So, I guess it's time for me to explore creating another custom component that would update the config entry if there are discrepancies between both systems (as opposed to a return) :)

I think we can say it's technically correct, but could be misleading, and would encourage you to submit a PR to the docs.

Aligned on this. For two main reasons: (1) my understanding is that historically those Edits have worked, and (2) for components where UI config flow is not available, this would still be the case. So unless I am an active user checking the Blogs and Community, this would be extremely confusing for some people. (Why component A updates with YAML changes, but B does not?)

In some components, there are logged messages saying that the platform_setup has been deprecated. Maybe in these cases we should add a Logging message saying that the entity has been configured/migrated and no YAML changes would be applied.

If you could point me where to file the issue, and whether you think I should file particularly to Harmony or overall, I am happy to create it. Although, I have a strong feeling that it might be more welcomed if someone else does it (ref).

There is a common pattern for importing the YAML configuration into a config entry (you can search for async_step_import and will find quite a lot of integrations doing that)

Thank you. Will check a few to better understand what's the situation there and whether I could think a good way to leverage this mechanism for the problem at hand.

AFAIK it still requires potentially meaningful work on the integration side, especially for integrations that can have multiple instances and need some identification mechanism (and in some cases, YAML configurations can't have unique IDs at all).

Do you have more details on what exactly drives that cost? Trying to understand what we should aim at solving / minimizing.

Regardless, if I recall correctly from the podcast, this is something that was being done to all (most?) components when the transition from YAML to UI entries happens. Additionally, the UI entries should not break in transitioning versions, this is why we have upgrade mechanisms. As such, I understand that the high cost is invested anyway, right?

If this is the case, the question should be whether there's a big cost of keeping this system say "forever" and to also make updates. I can foresee two main issues with re-using this for a long-term purpose:

  1. How to match YAML configs to UI config entries, especially to detect and apply changes (since entry_id is not part of YAML).

  2. We would potentially be unable to remove all the migration code (say from VERSION = 1) to current version since a YAML user may have the old configuration (say v1) and we would still need to be able to migrate from v1 to current version (say v5) all throughout. However, UI entities could potentially be migrated progressively, so perhaps we only need from v3 to v5 on the codebase.

Is my understanding correct? Do you foresee other issues there? Would love to take a stab at this approach, even if it may closed equally fast as the others.

OnFreund commented 4 years ago

If you could point me where to file the issue

The easiest way is to edit the relevant page and submit as a PR. I'm not sure what other integrations it applies to (but searching the code for async_step_import should help you find them).

Do you have more details on what exactly drives that cost?

From my very limited experience, there are two main factors:

  1. As I mentioned in my previous comment - unique identification is such a cause of friction, especially in cases where there's no unique ID. The outcome of https://github.com/home-assistant/architecture/issues/333 basically means that integrations where multiple instances are supported but there's no unique ID can only work through the UI.
  2. Validating text based configuration can be different than the stepwise approach that the UI is taking. The result of one step can determine which step to show next, or what to show in it. A text based configuration can have more options and requires additional validation, and subsequently different tests.
nitobuendia commented 4 years ago

Hi @OnFreund

The easiest way is to edit the relevant page and submit as a PR. I'm not sure what other integrations it applies to (but searching the code for async_step_import should help you find them).

Makes sense. I was also thinking maybe adding a logging line here.

I need to confirm on my side that it's fine to submit those PRs, but should be doable.

As I mentioned in my previous comment - unique identification is such a cause of friction, especially in cases where there's no unique ID. The outcome of #333 basically means that integrations where multiple instances are supported but there's no unique ID can only work through the UI.

If I understand this correctly, there are two problems:

  1. When a component is configured, it creates entry_ids for config_entries. If we change YAML, there's no way to match these two, because entry_id does not exist in YAML.

  2. Each entity/device should have its own id and be mapped to a config_entries.

333 seems to touch on the second problem. This issue is inherent to the devices, and becomes apparent because the id system of the new config flow. As such, this is not related to YAML; although, of course, it would also affect if we also use the same flows.

The first one is the only inherent problem to YAML configuration. A good solution could be to add entry_ids to our YAML configurations. For example:

This was my configuration:

hue:
  bridges:
    - host: !secret hue_bridge_lan_ip
      allow_unreachable: false
      allow_hue_groups: false

Which was ported as a config_entry

            {
                "connection_class": "local_poll",
                "data": {
                    "bridge_id": "001-redacted-B07",
                    "host": "192.168.1.105",
                    "username": "zA2k-redacted-aXnzEO80g1KYpNao"
                },
                "domain": "hue",
                "entry_id": "dd73b58e701f4b0486be84a80c18d592",
                "options": {},
                "source": "user",
                "system_options": {
                    "disable_new_entities": false
                },
                "title": "Living Room",
                "unique_id": "001-redacted-07",
                "version": 1
            },

We could change the schema to:

hue:
  version: 1
  bridges:
    - entry_id: dd73b58e701f4b0486be84a80c18d592
      host: !secret hue_bridge_lan_ip
      allow_unreachable: false
      allow_hue_groups: false

Now, I should be able to match both configurations. The source parameter ("source": "user",) could be used to know if it's from the UI, migrated from YAML or still relying on YAML.

Now, this entry_id came after the migration. We need to discuss how to generate it if needed to be in the YAML before it gets created on the config_entries.

We could have a Developer Tools section where you can generate new ids. This would basically be a call to uuid, not actually writing things on the system. Advanced users may do that directly on Python on their terminal. Alternatively, the user could input them manually (similar to entity_id), but this would require good documentation to explain that if you change this id, the configuration will be affected. This entry_id would be set in the config_entry when read from YAML, and both are now linked. Therefore, any changes in YAML can be reflected/updated on the respective config_entry.

Validating text based configuration can be different than the stepwise approach that the UI is taking. The result of one step can determine which step to show next, or what to show in it. A text based configuration can have more options and requires additional validation, and subsequently different tests.

It would be easier to talk with one example that has multiple config possibilities. I think KNX might be a good one, but if you have better examples, we can discuss.

KNX supports configuring (1) routing or (2) tunneling. Depending on this option, (1) host is required and port+local_ip are optional; or (2) just local_ip is available and required. (I am omitting config_file and other parameters for simplification purposes, but the same applies to them).

There's no config flow on KNX, but we can imagine it something like this:

  1. The UI asks for routing or tunneling.
  2. If routing: the UI shows 3 fields. If tunneling: the UI shows one field
  3. If routing, the config_validation schema would be 3 fields, host required. If tunneling, the schema would be local_ip required.
  4. The UI would validate either of these schemas and then store it.

You're right that the UI if/else structure would end up selecting one of these validations and probably just call a show_form with these schema.

How to avoid having to replicate this for YAML and UI? The schema decision could be abstracted into a common function. You could potentially pass the final data that you have (in UI this comes on the last step, in YAML comes all together), and the method would return which schema should be applied. Then, you can pass this schema to show_form or you can simply validate it.

This probably would not work or be detectable on the "Check Configuration" button today, as if I am not mistaken, that one relies on a static schema, though. That logic could be changed centrally; or, anyway, it could be catchable on async_platform_setup.

I do see how this adds a bit of complexity, so probably some more thought would be needed to make sure it doesn't add substantial cost. I think I could try an experiment see if I can get a proof of concept; but knowing how my other requests have been directly closed out, it's probably not worth the time investment.

OnFreund commented 4 years ago

333 seems to touch on the second problem. This issue is inherent to the devices, and becomes apparent because the id system of the new config flow. As such, this is not related to YAML; although, of course, it would also affect if we also use the same flows.

That's one way to phrase it, but I'm not sure the way you broke it up into the two problems is the correct way to look at it. In any case, obviously a lack of unique ids is not an inherent problem with the YAML format, but given the decision that devices without a real unique id must use the config entry id as their unique id, any YAML based solution becomes error prone.

How to avoid having to replicate this for YAML and UI? The schema decision could be abstracted into a common function. You could potentially pass the final data that you have (in UI this comes on the last step, in YAML comes all together), and the method would return which schema should be applied. Then, you can pass this schema to show_form or you can simply validate it.

Yes, but that means you're going with the lowest common denominator, which requires more validation. This isn't necessarily wrong, but it does add significant complexity. You can also have cases where next steps depend on input from the server, after connecting to it, which are even more complicated. This is why I personally, despite my fondness of YAML based configuration, stopped believing we can make this work. Unfortunately, it's a fight we've lost. I do still believe that @Jc2k's suggestions of manifests and a tool that uses the same API as the UI could be amazing and bridge a lot of the gap. I also believe we should make the .storage folder more VCS-friendly, alas, this is also a losing battle.

This probably would not work or be detectable on the "Check Configuration" button today, as if I am not mistaken, that one relies on a static schema, though.

AFAIK "Check Configuration" mostly checks the schema and syntax, not the semantics.

Jc2k commented 4 years ago

Disclaimer: This is not meant as a sales pitch for my approach, merely some ideas about how to work around the unique id problem.

For my approach what i'd hope to do was something like how kubernetes does it. Despite having an internal unique id, when referencing objects in your system you match them using selectors. Sometimes you can just match on a name attribute. Other times you can match on labels.

At the very minimum every config entry has a title and a domain (integration name). I do not know if there is currently a requirement for these to be unique, but regardless of supporting a way to use YAML there is a strong case that in general they should be.

Where a device doesn't have a good enough title i'd propose to match on the data attributes of the config entry like label selectors in kubernetes. That way In the case of homekit_controller I could match on IP address for example.

Right now i have a hue bridge paired by homekit. It's title is Philips hue - 111111 and its domain is homekit_controller. It's pairing code is also static. So I could express it as:

selectors:
    title: Phillips hue - 111111
    domain: homekit_controller
config:
    device: <device id as it appears when doing manual discovery>
    pairing_code: 111-11-111

(Not related, but the idea is that the config key is a bit like how preseeding works on Debian. The CLI would drive the form API automatically and if all the data was supplied the config entry would be created, if it wasn't it would prompt at the terminal).

OnFreund commented 4 years ago

That way In the case of homekit_controller I could match on IP address for example

But IP addresses change, and are specifically prohibited as unique IDs in HA...

Jc2k commented 4 years ago

That’s why I called it a selector and not a unique I’d. I know I have given my devices stable ips. Someone without a stable ip would have to match on something else. Also if it was the difference between yaml and no yaml I think you could do dhcp reservations..

OnFreund commented 4 years ago

But then you're reliant on external environmental traits. I'm not a big fan of that - I'd rather let the user be explicit.

Jc2k commented 4 years ago

Yeah i do get where you are coming from.

My approach was based on something that's entirely external to HA and trying to make as few changes to the existing API's as possible, whilst still trying to be somewhat "ergonomic". I'd hoped to prove it worked, then making changes upstream to make it work better. We'll need to make those changes up front I guess.

I don't think using the actual randomly generated id's is a very ergonomic or pleasant experience. That very much would feel like YAML was hacked on the side rather than thoughtfully designed and integrated.

The title attribute is stable and would work in most cases as a selector (I only refrain from using "all" because i haven't tested every config entry). It's not an external environmental trait so i think meets your requirements. That said some of them do have IP addresses in them. I don't think you can change it via the UI at present. The biggest kicker for me is that it isn't predicatable - each integration gives a different name. So if you wanted idempotent YAML you'd have create the config entry and see what title Home Assistant popped out, then update your YAML with the title. Whilst less ugly than a UUID, the user experience is not great either.

(incidentally, that's why i'd match on ip or email or something in my own environment - better workflow and i know my ip's are stable, but obviously we can't expect that to be the case for everyone).

If we had the green light to implement YAML on top of config entries right now, and it was baked into HA not via the API, what would you do?

I'd probably let the user provide their own id or label for an integration. I'd either use that as a unique id (unlikely to be permitted?) OR i'd store it as an attribute and match on it when trying to find the entry again later. I'd be tempted to accept the risk that the user might do something unexpected (like renaming it) - "if you break it you keep the pieces". I'd make sure this didn't leak into the front end, other than the UI knowing that particular config entry was read only. To convince the devs that this wasn't a PITA for integration dev's i'd suggest a rule saying you need to be able to reproduce the error without using YAML before it was passed on to the integration CODEOWNER.

OnFreund commented 4 years ago

I don't think using the actual randomly generated id's is a very ergonomic or pleasant experience. That very much would feel like YAML was hacked on the side rather than thoughtfully designed and integrated.

Agreed.

I'd probably let the user provide their own id or label for an integration

That's exactly what I meant by "I'd rather let the user be explicit", and what I was aiming for in https://github.com/home-assistant/architecture/issues/333. I think that users, especially ones opting in for YAML, could be trusted with this, along with a simple validation to ensure uniqueness (mostly to avoid copy-paste errors. Additionally, it opens us some cool possibilities - if a config entry had a unique id that the user can obtain, that in the future could be used to target entities that were created by this config entry (so in the Monoprice example that started that arch topic, you could, for example, mute all of the zones for a specific instance). Since I suggested this, config entries gained renaming capabilities, which I think makes more proposal make even more sense - along with renaming the config entry, what if you could set an id (just like the entity id lets the user set both a name and an id), and use that for both addressing the config entry (either in YAML, or in manifests), and for selecting entities for service calls?

nitobuendia commented 4 years ago

Hi @Jc2k @OnFreund -- it was really nice to read your debate. Thanks for putting the time and thought.

I also believe we should make the .storage folder more VCS-friendly, alas, this is also a losing battle.

+1 to #370 - Is there a way we can support you? The team does not look very open to the idea either, but I think it's a start into better documented and better controlled schemas in the new system.

The only thing I find difficult with #370 is that I want that my local environment to be the source of truth for configurations and control of what's get pushed to my git repository. Anything to do with .storage means I need to either push changes from the system itself or download them, vet them, and then upload them. This might be just me. Aside this, I think it's a really great idea.

Sometimes you can just match on a name attribute. Other times you can match on labels. The title attribute is stable and would work in most cases as a selector Whilst less ugly than a UUID, the user experience is not great either. I'd probably let the user provide their own id or label for an integration. I'd either use that as a unique id (unlikely to be permitted?) OR i'd store it as an attribute and match on it when trying to find the entry again later

In all fairness, I am not fond of uuids either. I prefer to use arbitrary names and store them as a matching key rather than the actual unique id.

Arbitrary names (i.e. entity_id) have been deemed not useful on the podcast because the user can change them and break the link. However, maybe it would be different as long as it's not the main id and it's not used for anything else other than connecting both configs.

The idea of using a uuid was to indicate it's not something "you should change" and make sure it's unique. It's also used in some places to generate ids (1, 2). I was just trying to propose an easy way for the user to come up with something "unique enough".

At the end, I agree that we "just" need a way to match a YAML config to a UI config.

"if you break it you keep the pieces". I'd make sure this

I also think that if you changed the unique id (or matching key, or whatever), you should expect things to not work as intended as you are now editing or creating a different entity. Partially, allowing the user to input anything, makes me think that at some point they may no longer be happy with what they've input and change it.

"along with a simple validation to ensure uniqueness (mostly to avoid copy-paste errors"

Indeed. Uniqueness can be checked both on "Check Config" as well as at startup.

Next steps

Thanks @OnFreund and @Jc2k -- I think one experiment I want to do is link my Harmony YAML to my Harmony .storage, add a unique id to link both and see if I can get it working on the UI, but editable with YAML. Of course, this is not safe now, but I want to see if something like this could work.

Unfortunately, it's a fight we've lost.

Hopefully not :)

OnFreund commented 4 years ago

+1 to #370 - Is there a way we can support you? The team does not look very open to the idea either, but I think it's a start into better documented and better controlled schemas in the new system.

I guess the best way would be contributing PRs that get us there. The core team was not strictly opposed to most of what I proposed there, but didn't think it was a high priority to be worth the effort. If we could show good implementations, they might change their minds.

frenck commented 3 years ago

Look at this topic quite sometime later, we can safely conclude this doesn't carry the support from most of our members (based on 👍 reactions).

I think the time is come to close this proposal.

nitobuendia commented 3 years ago

@frenck There are 10 👍 on the initial proposal vs 11 👍 on this response or 12 👍 on this one. I wouldn't say that showcase much of a difference.

In any case, I also do not think that 👍 might be the best way to determine support, feasibility of what's best for the user; and I also do not think that your opinion is going to change even if the original response had 50 👍 ; so yes, let's close it.

frenck commented 3 years ago

The difference is in the people giving the response here @nitobuendia. This proposal is rejected by most members of the project members, does not carry consensus in building or maintaining the proposal.

nibblerrick commented 3 years ago

Still would liked to have more yaml-configuration back. If there is something in the future that makes more yaml-configuration again maybe have in the back of your mind that there are people who like this. Thanks for all the effort.

frenck commented 3 years ago

@nibblerrick That ship has sailed and has been discussed a lot. The decision on this has been made and the path is documented clearly in the ADR's.