gramineproject / gsc

Gramine Shielded Containers (Docker integration)
BSD 3-Clause "New" or "Revised" License

RFC: Plugins/templates in GSC #139

Open dimakuv opened 1 year ago

dimakuv commented 1 year ago

What is missing in SGX signing flows, what is needed for "plugins/templates" in GSC?

The context is these PRs and discussions:

Woju's opinion

[the text is combined from Woju's messages in the chat]

The missing part is to fix Intel's idea of how to extend the signing Dockerfile with custom snippets. The signing plugins (PR #1118 proper) are more or less ready, but they're only part of the solution. In real life you might need to install some dependencies (which are not necessarily packaged as pip/Python packages) and even run arbitrary commands (like login/logout to an HSM API, cf. the Azure example, PR #20). And it's not clear to me how this should be done so that it is both flexible enough to cater to real-world use cases and rigid enough to prevent users from shooting themselves in the foot. What's clear to me is that the original proposal in #112 is a footgun; I've talked about this in the meeting and written about it in the thread of #112.

So this probably needs a rethink of how GSC's Jinja templates are organised (directory hierarchy in templates/ and filenames), to enable template inheritance that would be conducive to overriding the relevant parts while staying strict enough to keep the necessary parts, like the FROM device which drops the private key; see: https://github.com/gramineproject/gsc/blob/77b1f70e297a286a82708ec6c76b8e4243a5c1bd/templates/Dockerfile.common.sign.template#L14-L16

#1118 is also relevant to one other PR in GSC (#118), which makes the --key argument optional-but-not-really.

It's true that this argument is flow-specific, i.e. applicable only to a single signing scheme, the one with a private key. This also needs to be rethought and reworked.

So the parsing of arguments to the GSC signing command is either plugin-specific or done some other way. There's a question of packaging: can we package #1118 plugins and GSC plugins together? It would be convenient to do so, and while devising the proper plugin API we need at least to think about how to do it. Like another Python entry point that would return something? Another directory with overriding templates? Some way of parsing CLI arguments?

One way of doing general parsing of CLI arguments was discussed in one of the calls: a generic -D, --define key=value argument, very similar to gramine-manifest. That would dispense with general CLI parsing in GSC plugins, but there will still be a need for templates/ overrides. This can be done relatively easily, because Jinja supports multiple template loaders and loads just the first template found, so we can insert this other template directory before the default one.
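The two mechanisms described above can be sketched together. This is only an illustration under stated assumptions, not GSC code: the `DefineAction` class, the `parse_defines` helper and the template names and contents are made up for the example, and `DictLoader` stands in for the on-disk template directories (a real implementation would more likely use `FileSystemLoader`). The `ChoiceLoader` tries its loaders in order and returns the first template it finds, which is the behaviour referred to above.

```python
import argparse

from jinja2 import ChoiceLoader, DictLoader, Environment


class DefineAction(argparse.Action):
    """Collect repeated `-D key=value` options into one dict (hypothetical helper)."""

    def __call__(self, parser, namespace, values, option_string=None):
        key, sep, value = values.partition("=")
        if not sep or not key:
            parser.error(f"{option_string} expects key=value, got {values!r}")
        # Copy before updating so the shared default dict is never mutated.
        defines = dict(getattr(namespace, self.dest) or {})
        defines[key] = value
        setattr(namespace, self.dest, defines)


def parse_defines(argv):
    """Parse only the generic -D/--define options and return them as a dict."""
    parser = argparse.ArgumentParser(prog="gsc-sign-sketch")
    parser.add_argument("-D", "--define", dest="defines", action=DefineAction,
                        default={}, metavar="KEY=VALUE")
    return parser.parse_args(argv).defines


# Stand-ins for the default templates/ directory and an override directory.
default_templates = DictLoader({
    "sign.template": "default signing step for {{ image }}",
    "build.template": "default build step for {{ image }}",
})
override_templates = DictLoader({
    "sign.template": "custom signing step for {{ image }} using {{ vault }}",
})

# The override loader comes first, so its templates shadow the defaults.
env = Environment(loader=ChoiceLoader([override_templates, default_templates]))

defines = parse_defines(["-D", "image=app", "--define", "vault=my-vault"])
print(env.get_template("sign.template").render(**defines))   # from the override
print(env.get_template("build.template").render(**defines))  # falls back to default
```

Templates that the override directory does not provide are still resolved from the default loader, so a plugin would only need to ship the files it actually changes.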

Also, #1118 signing plugins will be general, while GSC problems with stuff in templates and inserting custom commands right in the middle of arbitrary dockerfiles are more-or-less GSC-specific. So I don't mind a GSC-only solution to such GSC-only problems.

So my idea of how to do it: add --template-dir (or something like that, the exact name TBD) instead of --template as proposed in #112. This is because Jinja likes loaders more than one-off templates. Somewhere in the comments in those PRs' reviews is a snippet of how I would reconfigure the Jinja loaders when encountering this option, to make it possible to inherit from a template that's named the same. Then, rewrite the relevant templates with {% block %}s to allow replacing the respective sections and/or appending to them. This way we'll have the option to later package this as Python/pip packages, by adding an entry point which will return a path to a directory with templates, this directory being shipped as part of the plugin's Python package. Possibly even the same package as the signing plugin.
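One possible way to configure the loaders so that an overriding template can inherit from the built-in template of the same name is sketched below. This is not the snippet from the PR reviews, which is not reproduced in this thread; it is an assumption-laden illustration. The `!builtin` prefix, the template name, the block names and the Dockerfile contents are all hypothetical, and `DictLoader` again stands in for directories on disk.

```python
from jinja2 import ChoiceLoader, DictLoader, Environment, PrefixLoader

# Stand-in for GSC's built-in templates (hypothetical content and block names).
builtin = DictLoader({
    "Dockerfile.sign.template": (
        "FROM {{ image }} AS signing\n"
        "{% block dependencies %}{% endblock %}\n"
        "{% block sign %}RUN echo default-sign{% endblock %}\n"
        "# Fixed part, outside any block: a fresh stage that leaves the key behind.\n"
        "FROM {{ image }}\n"
        "COPY --from=signing /signed /signed\n"
    ),
})

# Stand-in for the directory passed via --template-dir. The template has the
# same name as the built-in one and extends it through the "!builtin/" prefix.
plugin = DictLoader({
    "Dockerfile.sign.template": (
        '{% extends "!builtin/Dockerfile.sign.template" %}\n'
        "{% block dependencies %}RUN echo install-hsm-client{% endblock %}\n"
        "{% block sign %}{{ super() }}\nRUN echo hsm-logout{% endblock %}\n"
    ),
})

env = Environment(loader=ChoiceLoader([
    plugin,                             # overrides win for plain names
    builtin,                            # defaults for everything else
    PrefixLoader({"!builtin": builtin}),  # explicit access to the shadowed original
]))

rendered = env.get_template("Dockerfile.sign.template").render(image="app")
print(rendered)
```

In this sketch the plugin can fill in `dependencies` and append to `sign` via `super()`, but the final stage sits outside every block, so an overriding template has no way to drop it short of not extending the built-in template at all.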

As a side effect, this will expose GSC template names and block names as public APIs (plugin writers will need to override specific, named templates, and specific, named blocks inside those templates). Therefore this is a good opportunity to rewrite template names, move them between directories, etc.
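The packaging idea mentioned above (an entry point that returns the path to a template directory) could look roughly like the following on the consuming side. The group name `gsc.template_dirs` and both helper functions are assumptions made for this sketch; GSC defines no such group today. The demo entry point points at `os.getcwd` purely as a stand-in for a plugin function that returns a directory, so the example runs without installing any package.

```python
from importlib.metadata import EntryPoint, entry_points
from pathlib import Path

# Hypothetical entry point group name, chosen only for this sketch.
TEMPLATE_GROUP = "gsc.template_dirs"


def template_dirs(eps):
    """Load each entry point, call it, and collect the directories it returns."""
    dirs = []
    for ep in eps:
        get_dir = ep.load()  # imports the module and returns the named callable
        dirs.append(Path(get_dir()))
    return dirs


def installed_template_dirs():
    """Directories advertised by installed plugin packages.

    Uses the `entry_points(group=...)` selection API from Python 3.10+.
    """
    return template_dirs(entry_points(group=TEMPLATE_GROUP))


# Demo: `os:getcwd` stands in for a plugin function returning its template dir.
demo = EntryPoint(name="demo", value="os:getcwd", group=TEMPLATE_GROUP)
print(template_dirs([demo]))
```

The returned directories could then be placed in front of the default template directory in the loader chain, in the same way as a directory given on the command line.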

Last point: #1118 is probably mergeable as-is, possibly after ironing out the spelling in the documentation. If you need to validate it against something, https://github.com/woju/gramine-sgx-otk is written to use this template API https://github.com/woju/gramine-sgx-otk/blob/6f75511ca8e9ca1e2c39c9dd5ff73d97cf298b49/setup.py#L16.

What to do about all this?

svenkata9 commented 1 year ago

> What's clear to me is that the original proposal in https://github.com/gramineproject/gsc/pull/112 is a footgun;

I don't think it is. We are dealing with HSMs. We really don't need the snippet below, because the private key never comes out of the HSM. https://github.com/gramineproject/gsc/blob/77b1f70e297a286a82708ec6c76b8e4243a5c1bd/templates/Dockerfile.common.sign.template#L14-L16

dimakuv commented 1 year ago

> I don't think it is. We are dealing with HSMs.

But #112 is supposed to be a generic template/plugin solution, not "only for HSMs".

svenkata9 commented 1 year ago

Well, in that case, all our templates can be modified, and we end up with the same footgun problem even with the existing templates. Also, I see two ways in which one can make any template a footgun, including what we have today: 1) modifying the templates themselves to do some goofy thing to include keys; 2) modifying the GSC template to explicitly copy the keys along with their own template.

With #112, we are only giving the user an option to provide their own Dockerfile as a template. The basic assumption here is that they don't explicitly do some goofy thing as I mentioned in (1) above. And (2) really needs active involvement from the user, who has to modify the GSC code to copy stuff into the Docker image (along with modifying their own template).

And, we are providing the templates for AKV and OpenSSL based engines (which of course, users will need to customize because there may be many different HSMs based on OpenSSL).

My opinion is that proposal in 1118 makes it more complicated, and is much more vulnerable (including functional and security failures) if the user fails to do the right thing.

woju commented 1 year ago

> My opinion is that proposal in 1118 makes it more complicated, and is much more vulnerable (including functional and security failures) if the user fails to do the right thing.

I have seen no evidence to support the statement that "1118 makes it more complicated, and is much more vulnerable (including functional and security failures) if the user fails to do the right thing". In fact it is the other way around: the user just needs to install the plugin and provide basic parameters (like key location and/or credentials). It's not possible to make 1118-style plugins do the wrong thing just by using CLI parameters. This is in contrast to the --template proposal for GSC, which allows the user to write a dockerfile that will ship a private key. It's true that PR 1118 is not the full solution and that some adapter to GSC will be needed because of how GSC is structured, but it won't be "more vulnerable".

The difference here is in packaging/encapsulation: plugins are packaged and are not expected to be modified, only parametrised with a designated set of arguments. Only GSC plugin developers have an opportunity to screw things up (which is unavoidable; and they can only screw things up in GSC, not in Gramine!). But it's certainly possible to write those plugins so that day-to-day users using only -D parameters can't do it wrong.
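A minimal sketch of what "parametrised with a designated set of arguments" might mean in code, assuming the -D values arrive as a plain dict. The `check_defines` function and the parameter names are hypothetical and belong to no existing plugin; the point is only that anything outside the declared set is rejected before any template is rendered.

```python
def check_defines(defines, required, optional=()):
    """Return a copy of `defines`, or raise ValueError if it does not fit the plugin."""
    allowed = set(required) | set(optional)
    unknown = sorted(set(defines) - allowed)
    missing = sorted(set(required) - set(defines))
    if unknown:
        raise ValueError(f"unknown parameter(s): {', '.join(unknown)}")
    if missing:
        raise ValueError(f"missing parameter(s): {', '.join(missing)}")
    return dict(defines)


# Hypothetical parameter names for an imaginary key-vault signing plugin.
REQUIRED = ("vault_url", "key_name")
OPTIONAL = ("key_version",)

params = check_defines(
    {"vault_url": "https://vault.example", "key_name": "signer"},
    REQUIRED,
    OPTIONAL,
)
print(params)
```

With a check like this, the plugin author decides which knobs exist, and a day-to-day user can only set values for them.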

--template is free-form: there's no distinction between plugin developers and users, all its uses are in fact "active involvement" as you put it, and it allows those users to easily do the wrong thing. That's why I call it a footgun.

Now there's a spectrum between those. What you proposed is a complete rewrite of the signing dockerfile. This dockerfile has a complication (the FROM device) for a reason. Did you think about what happens when people remove that complication "for simplicity"? It still "works" (the container runs), but it ships the private key, which is wrong. And the user won't notice. It can be done better: prompt users to overwrite only part of the dockerfile template. This way there won't be anything for the user to "simplify", so there will be less risk. But it requires some work on our part, not just splicing the

Another way to understand it: this is not just a technical problem, it veers into psychology and UX design. Plain --template simply puts the wrong incentives before the user. Yet another way to describe this: I disagree with your assumption that people "don't do explicitly some goofy thing". They will, because most people are clueless or, to put it in a more charitable way, they don't need to know the implementation details of GSC. They need to be provided with the bare minimum, and this FROM device is not among the things that we should require users to understand.

> And, we are providing the templates for AKV and OpenSSL based engines (which of course, users will need to customize because there may be many different HSMs based on OpenSSL).

It's not clear how those will differ, or whether there really won't be a way to support more than one of them with a single plugin. PC/SC is a common interface, so the scope of adaptation between different implementations remains to be seen. Certainly the AKV plugin won't need customisations between different users.

svenkata9 commented 1 year ago

Please go ahead and push the finalized changes as per your view, and also show how it works with GSC, including validation with the AKV and OpenSSL engines.