opencontainers / runtime-spec

OCI Runtime Specification
http://www.opencontainers.org
Apache License 2.0
3.13k stars 535 forks source link

Clarify Intel RDT configuration #1196

Open ipuustin opened 1 year ago

ipuustin commented 1 year ago

First change: make it explicit when l3CacheSchema should have lines with MB: prefix filtered out. Previously it was unclear if this should be done always or only if memBwSchema did exist.

Second change: clarify when the subdirectory needs to be removed after the container is finished (not sure if the config spec is the best place for this?).

utam0k commented 1 year ago

Does this change require a change in runc behavior?

ipuustin commented 1 year ago

Yes it does... I don't think runc filters the MB:-lines at all -- it just concatenates the values. Also for directory removal, runc removes the directory if and only if ClosID is not set. It doesn't take into account whether or not runc has created the directory to begin with.

utam0k commented 1 year ago

I have looked at the major container runtime implementations, and only runc is affected. It would be good to have @AkihiroSuda, @kolyshkin, and @cyphar review this.

@giuseppe Just to be sure, crun doesn't implement Intel RDT, does it?

giuseppe commented 1 year ago

@giuseppe Just to be sure, crun doesn't implement Intel RDT, does it?

correct, it is not implemented in crun

utam0k commented 1 year ago

@giuseppe Thanks for your quick response 🙏

ipuustin commented 1 year ago

I have looked at the major container runtime implementations, and only runc is affected. It would be good to have @AkihiroSuda, @kolyshkin, and @cyphar review this.

@giuseppe Just to be sure, crun doesn't implement Intel RDT, does it?

Thanks for looking at this! I implemented the required runc changes now (in a linked PR).

kolyshkin commented 1 year ago

First change: make it explicit when l3CacheSchema should have lines with MB: prefix filtered out. Previously it was unclear if this should be done always or only if memBwSchema did exist.

This is kind of weird. Runtime is not a tool to groom configs. Why don't we just require that if both l3CacheSchema and memBwSchema exist, the first one should not contain lines with MB: prefix? This would be way easier to implement in config validation (strings.HasPrefix(s, "MB:") || strings.Contains(s, "\nMB:")) than the filtering.

kolyshkin commented 1 year ago

Alternatively, if the kernel is OK with getting multiple MB: strings, we can just write both l3CacheSchema and memBwSchema and the latter will naturally overwrite the former.

ipuustin commented 1 year ago

TBH I'm not sure why this is written the way it is now, but I think this may be due to originally L3 cache being the only supported resource (see https://github.com/opencontainers/runc/issues/433). The reason for l3CacheSchema containing MB: strings is probably for backwards compatibility during the transitioning from l3CacheSchema to having both l3CacheSchema and memBwSchema.

My target with this PR was to clarify the spec from the implementation point of view, but I fully agree that removing the filtering would make things simpler. I would however lean more towards your first option (erroring out if there are conflicting values). Having the "silent override" would probably confuse users and cause hard-to-debug bugs.

Erroring out in case of conflict would be a breaking change though -- I don't know if there are any workloads which currently set MB: strings from both variables. @marquiz since the previous wording is from your PR, do you have any insight or opinion into this?

marquiz commented 1 year ago

Alternatively, if the kernel is OK with getting multiple MB: strings, we can just write both l3CacheSchema and memBwSchema and the latter will naturally overwrite the former.

I think this should actually be true

ipuustin commented 1 year ago

@kolyshkin @marquiz I now updated the PR to say that the schemata changes should be applied in order. This is in line with the -"natural override" proposal and also maybe what the user has in mind. This will be a breaking change however, because now we don't filter anything out. If the MB domains set in l3CacheSchema are different than the domains in memBwSchema, both will be set.

marquiz commented 9 months ago

I now updated the PR to say that the schemata changes should be applied in order. This is in line with the -"natural override" proposal and also maybe what the user has in mind. This will be a breaking change however, because now we don't filter anything out. If the MB domains set in l3CacheSchema are different than the domains in memBwSchema, both will be set.

@ipuustin looking at this now, I'm inclined against a breaking change in the spec. Even though I don't think any runtime actually implements the filtering of data. But LGTM for the additions about directory removal

Related: with a "leave the old mess behind" mentality, I submitted PR #1230.

AkihiroSuda commented 7 months ago

@kolyshkin PTAL