rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
93.93k stars 12.09k forks source link

Use functions from `crt_externs.h` on iOS/tvOS/watchOS/visionOS #125225

Closed madsmtm closed 1 week ago

madsmtm commented 2 weeks ago

Use _NSGetEnviron, _NSGetArgc and _NSGetArgv on iOS/tvOS/watchOS/visionOS, see each commit and the code comments for details. This allows us to unify more code with the macOS implementation, as well as avoiding linking to the Foundation framework (which is good for startup performance).

The biggest problem with doing this would be if it lead to App Store rejections. After doing a bunch of research on this, while it did happen once in 2009, I find it fairly unlikely to happen nowadays, especially considering that Apple has later added crt_externs.h to the iOS/tvOS/watchOS/visionOS SDKs, strongly signifying the functions therein is indeed supported on those platforms (even though they lack an availability attribute).

That we've been overly cautious here has also been noted by @thomcc in https://github.com/rust-lang/rust/pull/117910#issuecomment-1903372350.

r? @workingjubilee

@rustbot label O-apple

workingjubilee commented 2 weeks ago

I probably can't accept this in its current form, the way we handle args on macOS has problems:

We need to make getting args on all platforms Much Less Clever.

workingjubilee commented 2 weeks ago

Technically this diff is not making the situation worse I suppose? Hmm... and apparently things like googletest straight-up mutate the pointees of these, eesh... What do you think?

madsmtm commented 2 weeks ago

I'm not entirely sure I understand your linked issues? How could we implement these in a "Much Less Clever" way on macOS, while still allowing it to work in dynamic libraries?

Technically this diff is not making the situation worse I suppose?

Indeed, there aren't any functional changes on macOS, and [[NSProcessInfo processInfo] arguments] also takes the value from _NSGetArgc/_NSGetArgv (and, if I'm reading the disassembly correctly, stores a copy upon first access).

Hmm... and apparently things like googletest straight-up mutate the pointees of these, eesh... What do you think?

Pretty sure that mutating these in main, while not explicitly defined, should be valid, especially in testing scenarios. While researcing this, I found this 2022 proposal for C2Y, which under "Const correctness and thread safety" mentions:

main() can mutate its arguments. [...] we take the position that argv [h]as always been mutable, and users do rely on that for a few use cases including:

  • Hiding from users args that are specific to a given framework
  • Changing the name of the program as displayed by process monitoring tools
workingjubilee commented 2 weeks ago

Interesting.

...hmm, actually, one thing I noticed through all this. MacOS was never hardened against foolishness the same way that Linux was.

@madsmtm A program can update the pointee of the value of _NSGetArgc to have a higher value than the actual number of arguments available from the perspective of _NSGetArgv, or alternatively move the null terminator we will find from _NSGetArgv earlier. Please fix that by not only bounding our iteration to the argc but also checking for the null terminator and ending the iteration early if these values are made inconsistent. We cannot avoid a host program setting us up for UB in all cases but we can duck the most obvious error that reckless calling code might throw at us.

madsmtm commented 2 weeks ago

A program can update the pointee of the value of _NSGetArgc to have a higher value than the actual number of arguments available from the perspective of _NSGetArgv,

From my reading of the disassembly of -[NSProcessInfo arguments], _NSGetArgc can only contain a smaller value than the number of arguments available in _NSGetArgv.

or alternatively move the null terminator we will find from _NSGetArgv earlier. Please fix that by not only bounding our iteration to the argc but also checking for the null terminator and ending the iteration early if these values are made inconsistent.

And instead, when encountering a NULL argument in _NSGetArgv, -[NSProcessInfo arguments] skips over it, but continues processing the rest of the arguments up until _NSGetArgc. I.e. it's not treated as a NULL terminator, rather just as an ignored value.

I've implemented this behaviour in the latest commit.

workingjubilee commented 2 weeks ago

@madsmtm Hmm. Are you saying this doesn't work?

#include "whatever.h"

int overflow_argc() {
   *_NSGetArgc() = 9001;
    return 0;
}
madsmtm commented 2 weeks ago

Well... It's kinda muddied by the fact that -[NSProcessInfo arguments] is initialized by dyld the moment that Foundation is loaded (learned this just now by experimenting), and then a cached value is used ever since.

But if something else is loaded before Foundation and modifies _NSGetArgc/_NSGetArgv (which I think is how e.g. testing frameworks would usually do this, right? If not, how else would they ensure that the value that -[NSProcessInfo arguments] returns is updated?), then it's UB if _NSGetArgc is larger than the number of entries in _NSGetArgv.

This can be seen with the following:

// File: overflow_argc.c
#import <crt_externs.h>

__attribute__((constructor)) // Run the function upon loading this dylib
void overflow_argc(void) {
    *_NSGetArgc() = 9001;
}
// File: read_args.m
#import <Foundation/Foundation.h>
#import <crt_externs.h>

int main(int argc, const char * argv[]) {
    NSLog(@"argc: %i", argc);
    NSLog(@"*_NSGetArgc(): %i", *_NSGetArgc());
    NSLog(@"-[NSProcessInfo arguments]: %@", [[NSProcessInfo processInfo] arguments]);
    return 0;
}

Compile with:

clang overflow_argc.c -dynamiclib -o liboverflow_argc.dylib
clang read_args.m -L . -framework Foundation -l overflow_argc

Running ./a.out works, but if the -l overflow_argc argument is passed before -framework Foundation, it segfaults (before even running main).

workingjubilee commented 2 weeks ago

Hmm. Based on what you said, -[NSProcessInfo arguments] uses _NSGetArgc and not the other way around, right?

workingjubilee commented 2 weeks ago

For this exact moment I am most concerned about the behavior of Rust code compiled into a cdylib using this version of the standard library, assuming this patch is accepted, and then run with a C host that uses essentially no meaningful library code beyond the barest minimum, so my only question is "Can that C host make us do a UB by being reckless while updating the pointees of these functions in a way we can easily avoid?"

madsmtm commented 1 week ago

Hmm. Based on what you said, -[NSProcessInfo arguments] uses _NSGetArgc and not the other way around, right?

Yes.

For this exact moment I am most concerned about the behavior of Rust code compiled into a cdylib using this version of the standard library, assuming this patch is accepted, and then run with a C host that uses essentially no meaningful library code beyond the barest minimum, so my only question is "Can that C host make us do a UB by being reckless while updating the pointees of these functions in a way we can easily avoid?"

Yeah, I get the concern - and the answer right now is yes, we get UB if some random C code sets the value of _NSGetArgc to something that doesn't match _NSGetArgv.

And we definitely could just do a break in the loop instead of continue, my concern is that that has different behaviour from the standard API that's otherwise available on the platform (-[NSProcessInfo arguments]).

That is, well-behaving C code like the following (at least arguably more well-behaving than setting _NSGetArgc to some crazy value):

#import <crt_externs.h>

__attribute__((constructor)) // Run the function upon loading this dylib
void erase_first_arg(void) {
    *_NSGetArgv()[1] = 0; // Erase first argument
}

Would erase the first argument of -[NSProcessInfo arguments], but would erase all subsequent arguments in an implementation that used break.


So I think the question here is more: What is the semantics of *_NSGetArgv()[1] = 0;? Should it follow the (very implicit) platform convention and erase just that argument, or should it follow the current Linux implementation and "erase" the argument and all following it?

workingjubilee commented 1 week ago

Hm. As far as C is concerned, there is no real question here: the semantics of argv is that it is null-terminated, thus the only sensible behavior when you assign NULL to a value is that everything stops there.

The Objective-C code obeys a different semantic.

While there is no question as to whether or not Objective-C code has a place on macOS and even defines much of the platform's properties, thus we should not simply shrug off this behavior, nonetheless the real question that is hanging in the air is:

...Why did an Apple engineer commit that change?

...Unfortunately, the reputation of the company is that they will not be forthcoming with answers, even if we ask, so we are left to speculate. I feel I can assume their decision is not meritless, but I don't know if they were weighing things the same way.

Hm.

I'll sleep on it.

madsmtm commented 1 week ago

I'll add a bit more context for your deliberations:

So another solution could be to skip (i.e. continue instead of break on) NULL arguments on all platforms. This would also handle e.g. string allocation failure slightly better when doing e.g. argv[i] = strdup("foo"); (however unlikely allocation failure is at the start of program execution).

For myself, I think I slightly prefer the "continue parsing, don't break" option that Apple went with, though I am leaning towards wanting the same behaviour on all (unix) platforms, whatever we end up choosing.

workingjubilee commented 1 week ago

perhaps for historical reasons where argv wasn't originally null-terminated.

We do not support such platforms so I don't think we have to worry about that. A platform either supports at least ANSI C or it doesn't support C at all, to us.

So another solution could be to skip (i.e. continue instead of break on) NULL arguments on all platforms. This would also handle e.g. string allocation failure slightly better when doing e.g. argv[i] = strdup("foo"); (however unlikely allocation failure is at the start of program execution).

Hm.

What I am currently struggling to believe is that there are cases where the filter case actually recovers data more usefully than the take_while in the "well-behaved code" cases. All existing libraries that I'm aware of that permute the argv will also move NULL to the end. In other words, they also respect argv's spec. So it shouldn't matter which we do... unless the code is ill-behaved. In the ill-behaved code case, the heuristic that is simpler... "take minimum", break, take_while... prevents errors we would have otherwise encountered, and there is no reason to believe that the data beyond that first NULL is not corrupt.

workingjubilee commented 1 week ago

...In other words, for all major libraries I am aware of, -[NSProcessInfo arguments] will hit a NULL and then skip over it, and then hit a NULL and then skip over it, etc., for argc - (current_index) arguments.

A major benefit I could see of the filter behavior is to allow lazily iterating the argv since it removes directionality from the argv, instead of preallocating the strings, but

madsmtm commented 1 week ago

perhaps for historical reasons where argv wasn't originally null-terminated.

We do not support such platforms so I don't think we have to worry about that. A platform either supports at least ANSI C or it doesn't support C at all, to us.

Yeah, mostly noted it as a curiosity / an explanation for why argv is both null-terminated and has its length defined by argc (as opposed to just one of those).

So another solution could be to skip (i.e. continue instead of break on) NULL arguments on all platforms. This would also handle e.g. string allocation failure slightly better when doing e.g. argv[i] = strdup("foo"); (however unlikely allocation failure is at the start of program execution).

Hm.

What I am currently struggling to believe is that there are cases where the filter case actually recovers data more usefully than the take_while in the "well-behaved code" cases. All existing libraries that I'm aware of that permute the argv will also move NULL to the end. In other words, they also respect argv's spec. So it shouldn't matter which we do... unless the code is ill-behaved. In the ill-behaved code case, the heuristic that is simpler... "take minimum", break, take_while... prevents errors we would have otherwise encountered, and there is no reason to believe that the data beyond that first NULL is not corrupt.

Fair point.

I've pushed a new commit that merges the implementation on Apple with that of the other unixes, so that we use the "break" strategy on all platforms. I was unsure of exactly what to write as the reasoning, please say so if you have a better suggestion for the wording in my comments!

workingjubilee commented 1 week ago

That commentary looks good to me!

workingjubilee commented 1 week ago

@bors r+

bors commented 1 week ago

:pushpin: Commit 38ad85160314b446a08705974d3b03cc29c338c9 has been approved by workingjubilee

It is now in the queue for this repository.

simlay commented 1 week ago

Weirdly, I think think this uncovered a bug with the CoreFoundation not having the correct target_os annotations for visionOS and watchOS targets and thus failing at link time. https://github.com/servo/core-foundation-rs/pull/679 looks to be the fix. Figured it'd be worth writing it down in case someone else comes across a similar issue.