Open kevin-brodsky-arm opened 3 years ago
This sounds good; three further thoughts:
- memcpy(), possibly memcpy_nocap(), and its expected behaviour. We should explicitly identify some use cases -- such as when we intentionally don't want to preserve tags (such as in copyin()/copyout()-style use cases).
- memcpy() synonyms/wrappers, and indicate for each whether they are expected to preserve tags. For example, we might define that strcpy() doesn't preserve tags, but that memmove() and sort() do (subject to suitable alignment/etc).
- A further note from the meeting earlier today: We should also be documenting whether memory-mapping APIs produce tag-enabled mappings, whether by default or as a result of additional flags/arguments/etc. For example, we probably want tags enabled for MAP_ANON mappings by default with mmap(2) (as is the case today), but System V shared memory mappings should not (but we probably want an option/flag to enable it).
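To make the contrast between a tag-preserving copy and a tag-stripping one concrete, here is a minimal sketch on a toy memory model. It is not CHERI code: the model (a byte array with one out-of-band tag bit per granule), the 16-byte CAP_SIZE, and the names toy_mem, toy_memcpy and toy_memcpy_nocap are all assumptions made for illustration, and the behaviour shown is only one plausible reading of what the guide might specify.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CAP_SIZE 16u   /* assumed capability width and alignment, in bytes */
#define MEM_SIZE 256u  /* size of the toy memory, in bytes */

/* Toy model: data bytes plus one out-of-band tag bit per CAP_SIZE granule. */
struct toy_mem {
    unsigned char bytes[MEM_SIZE];
    bool tags[MEM_SIZE / CAP_SIZE];
};

/* A plain data write clears the tag of every granule it touches. */
static void toy_clear_tags(struct toy_mem *m, size_t off, size_t len)
{
    if (len == 0)
        return;
    for (size_t g = off / CAP_SIZE; g <= (off + len - 1) / CAP_SIZE; g++)
        m->tags[g] = false;
}

/* Hypothetical tag-stripping copy: the bytes move, destination tags are cleared. */
void toy_memcpy_nocap(struct toy_mem *m, size_t dst, size_t src, size_t len)
{
    assert(dst + len <= MEM_SIZE && src + len <= MEM_SIZE);
    memmove(&m->bytes[dst], &m->bytes[src], len);
    toy_clear_tags(m, dst, len);
}

/* Tag-preserving copy: every whole granule that is aligned in both the
 * source and the destination carries its tag across. */
void toy_memcpy(struct toy_mem *m, size_t dst, size_t src, size_t len)
{
    bool saved[MEM_SIZE / CAP_SIZE];

    assert(dst + len <= MEM_SIZE && src + len <= MEM_SIZE);
    /* Snapshot the tags first so that overlapping copies stay correct. */
    memcpy(saved, m->tags, sizeof(saved));
    memmove(&m->bytes[dst], &m->bytes[src], len);
    toy_clear_tags(m, dst, len);

    /* Different alignment phase: no granule is aligned on both sides. */
    if (dst % CAP_SIZE != src % CAP_SIZE)
        return;

    /* Walk the aligned source granules that lie entirely inside the copy. */
    size_t s = (src + CAP_SIZE - 1) / CAP_SIZE * CAP_SIZE;
    for (; s + CAP_SIZE <= src + len; s += CAP_SIZE)
        m->tags[(dst + (s - src)) / CAP_SIZE] = saved[s / CAP_SIZE];
}
```

In this model a copyin()/copyout()-style path would call toy_memcpy_nocap(), so that no tag ever crosses the boundary, while ordinary copies would use toy_memcpy().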
Tagging @bsdjhb @brooksdavis @arichardson @jrtc27.
Attempting to answer one part of the question: memcpy and memmove must preserve tags any time they copy sizeof(void * __capability) bytes where the source and destination are aligned. E.g., this needs to work:
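#include <stddef.h>   /* offsetof */
#include <stdint.h>   /* uint64_t */
#include <string.h>   /* memcpy */

struct s {
    uint64_t a;
    uint64_t b;
    void * __capability c;
};

void init_from_other(struct s *dst, struct s *src)
{
    /* Copy b and c in one call; the copy starts at b, not at the start of the struct. */
    memcpy(&dst->b, &src->b, sizeof(struct s) - offsetof(struct s, b));
}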
One could imagine a more restricted C implementation (e.g. with strict sub-object bounds) that didn't preserve tags with unaligned starts, but for existing systems code this probably must work.
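The alignment rule above can be worked through numerically. The sketch below counts how many capability-sized granules of a copy are aligned in both source and destination, which is where a tag could survive. It assumes a 16-byte capability size and alignment, models the capability field as a 16-byte blob so that it compiles on non-CHERI hosts, and uses the hypothetical names s_model and tagged_granules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CAP_SIZE 16u  /* assumed capability size and alignment, in bytes */

/* Layout stand-in for struct s: the capability member is modelled as a
 * 16-byte, 16-byte-aligned blob. */
struct s_model {
    uint64_t a;
    uint64_t b;
    _Alignas(16) unsigned char c[CAP_SIZE];
};

static_assert(offsetof(struct s_model, c) == CAP_SIZE,
    "c is expected to sit on the first capability boundary after a and b");

/* Number of whole granules in a len-byte copy that are CAP_SIZE-aligned in
 * both the source and the destination address. */
size_t tagged_granules(uintptr_t dst, uintptr_t src, size_t len)
{
    /* Different alignment phase: no granule lines up on both sides. */
    if (dst % CAP_SIZE != src % CAP_SIZE)
        return 0;

    /* Bytes to skip before the first aligned granule. */
    size_t skip = (CAP_SIZE - src % CAP_SIZE) % CAP_SIZE;
    if (len < skip)
        return 0;
    return (len - skip) / CAP_SIZE;
}
```

For the init_from_other() copy, the start is at offset 8 and the length is 24 bytes, so the function skips the 8 bytes of b and reports one granule: the one holding c.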
I think I've convinced myself that *sort need only preserve tags for objects that are aligned to sizeof(void * __capability) and whose size is a multiple of sizeof(void * __capability) bytes, but IIRC we preserve as with memcpy today so you can do absurd things if you really want to.
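As a rough illustration of the *sort condition, the sketch below pairs a predicate for "elements of this array may need their tags preserved" with an ordinary qsort() call on elements that contain a pointer. The 16-byte CAP_SIZE and the names sort_elements_may_hold_tags, entry and sort_entries are assumptions for illustration; on a real CHERI target the address would more naturally be a ptraddr_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CAP_SIZE 16u  /* assumed capability size and alignment, in bytes */

/* True when every element of the array starts on a capability boundary and
 * spans a whole number of capability-sized granules. */
bool sort_elements_may_hold_tags(uintptr_t base_addr, size_t elem_size)
{
    return elem_size != 0 &&
        base_addr % CAP_SIZE == 0 &&
        elem_size % CAP_SIZE == 0;
}

/* An element type that carries a pointer alongside its sort key. */
struct entry {
    int key;
    const char *name;
};

static int cmp_entry(const void *a, const void *b)
{
    const struct entry *x = a;
    const struct entry *y = b;

    return (x->key > y->key) - (x->key < y->key);
}

/* Sorting moves whole elements, so each name pointer must stay usable. */
void sort_entries(struct entry *v, size_t n)
{
    assert(v != NULL || n == 0);
    qsort(v, n, sizeof(*v), cmp_entry);
}
```

After sort_entries() the name pointers are dereferenced as usual, which on a capability system only works if the sort's internal element swaps kept the tags.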
A lot of memcpy calls are emitted by the compiler (e.g. for assignments) and those would copy the entire object. For these cases it would make sense to emit a call to a memcpy variant that doesn't preserve tags on unaligned starts.
@arichardson has done some work looking at compiler-generated copies in the context of improved inlining (https://github.com/CTSRD-CHERI/llvm-project/pull/506). We probably do want many of them to be tag-clearing, but de-facto C requires copying in all sorts of awkward places. For example:
struct s {
    uint64_t a;
    uint64_t b;
    char c[16];
} __attribute__((aligned(16)));
requires a tag-preserving memcpy because we can't know what's actually being stored in c since the C language can't differentiate between a string and a bag of bytes. (We likely want an annotation to say a string is actually a string or to push for a byte type as I believe is being discussed). One could implement a C dialect that restricted tag preservation further, but the cost of adaptation would start to climb so I believe it would need to be optional.
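A small sketch of the kind of de-facto C this refers to: a pointer is stashed in the char array, the struct is copied by plain assignment, and the pointer is read back and dereferenced. The helper names stash_pointer, fetch_pointer and roundtrip are hypothetical. The code is portable C with the GNU aligned attribute, so it runs on conventional targets; on a capability system the final dereference would be expected to work only if the struct copy preserved the tag held in c.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct s {
    uint64_t a;
    uint64_t b;
    char c[16];
} __attribute__((aligned(16)));

static_assert(sizeof(int *) <= 16, "a pointer must fit in c");

/* Store a pointer in the byte array; nothing in the type says so. */
void stash_pointer(struct s *p, int *ptr)
{
    memcpy(p->c, &ptr, sizeof(ptr));
}

/* Read the pointer back out of the byte array. */
int *fetch_pointer(const struct s *p)
{
    int *ptr;

    memcpy(&ptr, p->c, sizeof(ptr));
    return ptr;
}

/* Copy the struct by assignment (typically lowered to an inline copy or a
 * memcpy call by the compiler) and then use the pointer from the copy. */
int roundtrip(int value)
{
    struct s src = { 0 };
    struct s dst;

    stash_pointer(&src, &value);
    dst = src;
    return *fetch_pointer(&dst);
}
```

The compiler sees only a char array in struct s, so it has no type information telling it whether the assignment in roundtrip() may drop tags.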
We could try to make this distinction for C++17 and later code (std::byte was introduced in C++17) by only treating std::byte as potentially tag-bearing and assuming that char is actually a string. However, I feel this could be rather risky and it's safer to assume that all of signed char/unsigned char/char/std::byte can potentially hold tags.
As long as a char* is allowed to alias any other pointer (which I presume is still the case in C++17/20), I think we should preserve the assumption that an array of char may hold capabilities, because otherwise it feels like the departure from C/C++ is too great and it could break quite a lot of software. Of course having an optional compiler flag to remove that assumption wouldn't hurt either.
By and large, the current approach used by CHERI LLVM is to preserve capability tags when copying memory if it is valid for the source to contain capabilities. While it is clearly the case in some situations (e.g. during a struct assignment if the struct contains capability types), this is not obvious in general.
It would be very helpful for the guide to document:
memcpy(), struct assignment, implicit copy constructor, etc.