home-assistant / architecture

Repo to discuss Home Assistant architecture

Support a read-only filesystem deployment #490

Closed gsdevme closed 3 years ago

gsdevme commented 3 years ago

Context

As the recent incident (https://www.home-assistant.io/blog/2021/01/22/security-disclosure/) has shown, 3rd party plugins can contain mistakes, as all developers make them. It's wise to assume that an exploit allowing someone to write to the filesystem can occur. Allowing a deployment where the container engine restricts the filesystem to read-only would sandbox the whole environment from this kind of attack.

Removing the mutation of .storage would enable a much more secure deployment, made possible by runtime engines like Docker and abstracted by Kubernetes.

This solution would seem to provide an industry-standard sandbox without any kind of custom implementation for plugins, so it wouldn't require any further thought.

Rationale: Enabling this option forces containers at runtime to explicitly define their data writing strategy to persist or not persist their data. This also reduces security attack vectors since the container instance’s filesystem cannot be tampered with or written to unless it has explicit read-write permissions on its filesystem folder and directories.

https://www.projectatomic.io/blog/2015/12/making-docker-images-write-only-in-production/ https://kubernetes.io/docs/concepts/policy/pod-security-policy/#volumes-and-file-systems

Proposal

Remove mutation of the filesystem (specifically in .storage); people would opt out of the SQLite recorder to achieve this.

Consequences

No further mutation of the filesystem would be required.


Somewhat related to https://github.com/home-assistant/architecture/issues/472

Adminiuga commented 3 years ago

I concur. Most config files should not be writable by the HA Core process, especially the "/config/custom_components" folder.

frenck commented 3 years ago

I concur. Most config files should not be writable by the HA Core process, especially the "/config/custom_components" folder.

That is unfortunately not true. Most parts of HA rely on being able to write to them.

As for the suggestion, it seems to be a feature request (which is not what we use architecture issues for).

Adminiuga commented 3 years ago

That is unfortunately not true. Most parts of HA rely on being able to write to them.

Yes, most parts of HA rely on writing to them, and IMO that's a problem. Find and exploit a single integration to drop a payload into /config/custom_components, and on the next restart HA would load that integration, which then gets full access to HASS data, tokens etc. With that said, I accept living with the current approach :) We can close this discussion.
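The risk described above depends on the Core process being able to create files under /config/custom_components. A minimal sketch of a startup probe for that condition follows; the function names are hypothetical and this is not part of Home Assistant. It tries to create a temporary file rather than inspecting permission bits, since permission bits can be misleading when the process runs as root or the mount itself is read-only.

```python
import os
import tempfile
from typing import Iterable, List


def is_writable(directory: str) -> bool:
    """Return True if a file can actually be created inside `directory`."""
    try:
        # The temporary file is removed automatically when the block exits.
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:
        # Covers read-only mounts, missing directories and permission errors.
        return False


def writable_paths(paths: Iterable[str]) -> List[str]:
    """Return the subset of `paths` that the current process can write to."""
    return [path for path in paths if is_writable(path)]


if __name__ == "__main__":
    # Hypothetical paths; adjust to the actual deployment layout.
    for path in writable_paths(["/config/custom_components", "/config/.storage"]):
        print(f"warning: {path} is writable by this process")
```

On a deployment where the container filesystem is mounted read-only, such a probe would be expected to report nothing for those paths.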

gsdevme commented 3 years ago

@frenck I'm not sure I'd class this as a feature request, as it fundamentally changes how the core needs to operate and "think".

That is unfortunately not true. Most parts of HA rely on being able to write to them.

This would be the suggestion: for the architecture to not work like this. There isn't a specific reason, it's just how the current system works, but changing something like this would be a mindset change and therefore architectural, I'd say.


A lot of 3rd party code currently around could utilise API calls over HTTP rather than the plugin API exposed directly in Python; the 3rd party code would then run as a sidecar container to Home Assistant (i.e. completely sandboxed).
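As a rough illustration of the sidecar idea, the sketch below builds a service call against Home Assistant's REST API (`POST /api/services/<domain>/<service>` with a bearer token) using only the standard library. The host name `homeassistant`, the token value and the entity ID are assumptions made for the example, and the helper name is hypothetical.

```python
import json
import urllib.request


def build_service_call(
    base_url: str, token: str, domain: str, service: str, data: dict
) -> urllib.request.Request:
    """Build (but do not send) a REST request that calls a service."""
    url = f"{base_url.rstrip('/')}/api/services/{domain}/{service}"
    return urllib.request.Request(
        url,
        data=json.dumps(data).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Assumed values: a sidecar reaching Home Assistant over the container network.
    request = build_service_call(
        "http://homeassistant:8123",
        "LONG_LIVED_ACCESS_TOKEN",
        "light",
        "turn_on",
        {"entity_id": "light.kitchen"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        print(response.status)
```

Code running this way needs only network access and a token, not a writable view of the Home Assistant configuration directory.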

frenck commented 3 years ago

This would be the suggestion: for the architecture to not work like this. There isn't a specific reason, it's just how the current system works, but changing something like this would be a mindset change and therefore architectural, I'd say.

Sure, but how are you planning to implement and approach this?

gsdevme commented 3 years ago

Sure, but how are you planning to implement and approach this?

@frenck Ah perhaps I've misunderstood, are issues only raised here if the implementation details are fully worked out?


Unfortunately for me, Python isn't my strongest language (far from it). Implementation-wise I'd think there would be some quick wins:

  1. Allow the logger to write to stdout/stderr rather than a filesystem .log file (standard container practice)
  2. Implement https://github.com/home-assistant/architecture/issues/472 to remove the usage of a flat file database inside .storage/
  3. Accept the limitation that the Lovelace UI cannot be edited within the GUI (in fairness, already the case in mode: yaml)
  4. Perhaps revert the decision to ban YAML integrations for new things given the hindsight (although I know that's a sore subject... but it might ease https://github.com/home-assistant/architecture/issues/472)
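The first item in the list above can be sketched with Python's standard `logging` module. This is a generic illustration under the assumption that logging goes through the root logger; it is not Home Assistant's actual logging setup, and the function name is hypothetical.

```python
import logging
import sys


def configure_stdout_logging(level: int = logging.INFO) -> logging.Logger:
    """Route root-logger output to stdout and drop any file-based handlers."""
    root = logging.getLogger()

    # Remove handlers that write to disk so nothing touches the filesystem.
    for existing in list(root.handlers):
        if isinstance(existing, logging.FileHandler):
            root.removeHandler(existing)
            existing.close()

    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s [%(name)s] %(message)s")
    )
    root.addHandler(handler)
    root.setLevel(level)
    return root


if __name__ == "__main__":
    configure_stdout_logging()
    logging.getLogger("example").info("logging to stdout only")
```

The container runtime then collects the stream, so no .log file is needed inside the container.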

After this point the "core" of Home Assistant likely isn't writing to the filesystem. However, 3rd party code becomes a failure point, so I'd likely need to understand more about the system, but I would guess two implementations:

  1. Implement more of the native Python API via the HTTP API
  2. Implement more of a contract-based broker for actions to allow communication between core and "insert 3rd party code"

I.e. treat 3rd party code as 3rd party: don't execute it alongside our secrets and passwords, but deploy it in another sandboxed container with interactions limited by APIs and communication protocols.
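One way to picture the contract-based broker idea is sketched below. Every name here (`ActionRequest`, `Broker`, the action string, the plugin names) is hypothetical and nothing in it reflects an existing Home Assistant API. The point is only that 3rd party code submits a request naming an action, and the broker runs it only if that caller was explicitly granted it.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Set


@dataclass(frozen=True)
class ActionRequest:
    """The contract: who is asking, for which action, with what payload."""
    caller: str
    action: str
    payload: Dict[str, Any] = field(default_factory=dict)


class Broker:
    """Dispatches requests to core handlers, enforcing per-caller grants."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[Dict[str, Any]], Any]] = {}
        self._grants: Dict[str, Set[str]] = {}

    def register(self, action: str, handler: Callable[[Dict[str, Any]], Any]) -> None:
        """Core registers the function that implements an action."""
        self._handlers[action] = handler

    def grant(self, caller: str, action: str) -> None:
        """Allow a specific 3rd party caller to invoke a specific action."""
        self._grants.setdefault(caller, set()).add(action)

    def dispatch(self, request: ActionRequest) -> Any:
        """Run the action if, and only if, the caller was granted it."""
        if request.action not in self._grants.get(request.caller, set()):
            raise PermissionError(f"{request.caller} may not call {request.action}")
        handler = self._handlers.get(request.action)
        if handler is None:
            raise LookupError(f"no handler registered for {request.action}")
        return handler(request.payload)


if __name__ == "__main__":
    broker = Broker()
    broker.register("light.turn_on", lambda payload: f"on:{payload['entity_id']}")
    broker.grant("weather_plugin", "light.turn_on")
    request = ActionRequest("weather_plugin", "light.turn_on", {"entity_id": "light.kitchen"})
    print(broker.dispatch(request))
```

In a real deployment the request would arrive over a network protocol rather than as an in-process call, so the 3rd party code never shares memory or a filesystem with the core.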

One wonderful thing this might also unlock is hosted 3rd party integrations. Again, I'm sorry I don't have the implementation details fully mapped out here. If there is general agreement I'm happy to help, but given the direction needed here it's probably good to gauge interest first.

frenck commented 3 years ago

The architecture repository is here for checking up with architectural issues/decisions you are implementing and need guidance or approval on or for organizational decision making.

If you are not implementing it, it doesn't belong in this repository but on the community forum, in which we have a feature request section, as it becomes a wish you'd like to have implemented.

gsdevme commented 3 years ago

I see, I guess I'm happy to attempt these in that case. So I need to gain approval here?

frenck commented 3 years ago

Yep. However, I don't see the value of adding all that for a read-only file system that adds more limitations than solutions, with the added result that it will add maintenance burden in our code base.

I think you should first index what would break and define possible solutions for these things. Simply stating: "we have to accept that they won't work", won't cut it in the adoption or getting it into general use. And with that, it only adds (imho) useless code and thus maintenance burden on the project.

gsdevme commented 3 years ago

Agh right, sorry, I've sort of assumed an agreed understanding of the benefit of a read-only filesystem from a security point of view. If that isn't the case I can explain further, although I do feel there are well-documented online resources.

Along with the links in my description, there is a blog post from Red Hat here: https://www.openshift.com/blog/add-a-layer-of-security-to-openshift-kubernetes-with-cri-o-in-read-only-mode

Imagine you are running a containerized application which gets hacked. Hackers often want to put a back door in place, such that if the application gets started a second time, the application will already be running the hackers code. Running your containers in read-only mode prevents the hacker from modifying the application since /usr of the container is immutable. The hacker can’t write an exploit into the application. In the case of script kiddies the lack of places to write and execute code could block the hack altogether.

Further to my above suggestions: because 3rd parties would integrate via HTTP/brokers, people could actually write plugins in different languages (Rust, Go, etc.). That's not to say Python wouldn't be supported, but it would be a nice win.

Simply stating: "we have to accept that they won't work", won't cut it in the adoption or getting it into general use.

I haven't really said things won't work? I did say those that opt into the "read only mode" will basically be running like Lovelace in mode: yaml.

To be clear, I am also saying this should be opt-in, at least from the start, but the benefits could make this a good candidate for the default over time.


What I'd like to get out of this, though, is whether this project wants to achieve this goal, as it's more of a mindset than about the implementation details for me. My comments above roughly outline the basic concept of an implementation, but it would be better to focus on "is this an accepted idea", or whether the realistic truth is that Home Assistant cannot use this security hardening approach.

frenck commented 3 years ago

What I'd like to get out of this, though, is whether this project wants to achieve this goal, as it's more of a mindset than about the implementation details for me,

Not sure how to answer that. This issue is so wide, big, generic and high level, there is nothing one can say about this IMHO. Sure? 🤷

Further to my above suggestions: because 3rd parties would integrate via HTTP/brokers, people could actually write plugins in different languages (Rust, Go, etc.). That's not to say Python wouldn't be supported, but it would be a nice win.

We have multiple integrations for this. For example MQTT. This is not a core architectural issue?

gsdevme commented 3 years ago

We have multiple integrations for this. For example MQTT. This is not a core architectural issue?

Sorry, confusing terms. I don't mean integrations in that sense; I mean 3rd party components registering themselves without natively running alongside in the same container engine, although I guess that's similar to MQTT. However, these integrations are for the purpose of controlling entities and the like. I'm talking more about control of HA.

Anyway, it seems I perhaps can't articulate the benefits here, and it's getting a bit lost in translation and clouded by other concerns that are not strictly about supporting the ability to run Home Assistant on a read-only filesystem.

frenck commented 3 years ago

Further to my above suggestions: because 3rd parties would integrate via HTTP/brokers, people could actually write plugins in different languages (Rust, Go, etc.). That's not to say Python wouldn't be supported, but it would be a nice win.

Yes, MQTT is a good example of this.

gsdevme commented 3 years ago

Yeah, I'm going to close; I think the point is missed.