Open ThePhD opened 1 year ago
It took a while, but the paper was fully written.
Current standard revision: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3199.htm
Thank you for the proposal. I like it very much! I have one question regarding Section 3.8.3: there seems to be no specification of whether `break` or `continue` may be used to jump over a `defer`. Do you think this kind of jump should be allowed or disallowed?
Pros
Consider the following use case:
```c
for (int i = 0; i < 10; i++) {
    void *ptr = malloc(8);
    if (!ptr)
        break;
    defer { free(ptr); }
}
```
It seems this kind of jump should be allowed in this example. Also, `break` is semi-related to `switch`, so allowing `break` to jump over a `defer` seems to make loops and `switch` consistent with respect to the constraint.
Cons
Some common `goto` usages can also be expressed with, or converted to, loops plus `break` or `continue`; hence, if `goto` is banned, `break` and `continue` should be too. The wording of the draft also seems to imply that `return` is an exception. Also, if labeled loops are adopted into the C standard, jumping over a `defer` could be even more error-prone.
Personal opinion
I implemented a C-based polyfill using `__attribute__((cleanup(...)))` and used it in several of my projects.
In my experience, `goto` and `switch` jump-overs are indeed confusing. However, my personal feeling is that the `break`/`continue` use case could be a good thing to have.
```c
#define _DEFER_MERGE(a,b) a##b
#define _DEFER_VARNAME(a) _DEFER_MERGE(____defer_scopevar_, a)
#define _DEFER_FUNCNAME(a) _DEFER_MERGE(____defer_scopefunc_, a)
#define _DEFER(n) \
    auto void _DEFER_FUNCNAME(n)(int *a); \
    __attribute__((cleanup(_DEFER_FUNCNAME(n)))) int _DEFER_VARNAME(n); \
    void _DEFER_FUNCNAME(n)(int *a)
#define defer _DEFER(__COUNTER__)
```
It closely resembles some behaviors defined by the draft (i.e., example 2, 4, 5 from Section 6.4), which can be tested here: https://godbolt.org/z/3vv5dMMqG
Thanks for your support!
Regarding jumps like `goto`, `break`, and `continue`, the latest version of the proposal has the following constraints in the proposed wording:
> Constraints
>
> Jumps by means of `goto` or `switch` into E shall not jump over a `defer` statement in E.
>
> Jumps by means of `goto` or `switch` shall not jump into any `defer` statement.
>
> Jumps by means of `return`, `goto`, `break`, or `continue` shall not exit S.
S here is the scope of the `defer` statement itself. So the only jumps that are banned are ones that jump out of a `defer`, not ones that happen to skip its execution. Your example:
```c
for (int i = 0; i < 10; i++) {
    void *ptr = malloc(8);
    if (!ptr)
        break;
    defer { free(ptr); }
}
```
is perfectly fine and valid. If `ptr` is `NULL` (a null pointer), this code simply exits the loop, and the `defer` does not run because it has not been reached in that scope yet. We do not ban any appropriate use of `goto` or friends; we only ban a jump that passes over a `defer` statement. You can `goto` into some block/scope that contains a `defer`, so long as you don't jump over it:
```c
int handle = get_handle();
goto meow0; // label after defer: using `goto` to jump forwards or backwards over it is illegal
defer { release_handle(handle); }
meow0: ;
meow1: ;
printf("hiiii");
goto meow1; // okay
```
I think that covers the cases you were speaking of. Also, even if someone implements `break` and `continue` like a `goto`, that is not the C Standard's concern. The examples and the informative (not normative) explanatory text that demonstrate it in terms of `goto`s are not a mandate for the implementation to follow: a C compiler (or interpreter, or transpiler, or any C implementation) does not have to care about such hints. It has to uphold the Constraints and Semantics sections, as given. That most flow control can be flattened/rewritten in terms of `goto` is not something the C standard has to care about, so long as the exact use of `continue` or `break` follows what is in the standard before your compiler changes it / reorders it / messes with it.
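The `break` case from this reply can be approximated on current compilers. The following is a minimal sketch, assuming GCC or Clang and their `__attribute__((cleanup))` extension as a stand-in for `defer`; `free_ptr`, `alloc_or_fail`, `run_loop`, and the `frees_run` counter are hypothetical names introduced only to make the behaviour observable. A `break` taken before the guarded declaration is reached triggers no cleanup for that iteration, while every iteration that did reach it is cleaned up when its scope ends.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical counter: how many times the cleanup handler has run. */
static int frees_run = 0;

/* Cleanup handler: the compiler passes the address of the guarded variable. */
static void free_ptr(void **p) {
    assert(p != NULL);
    free(*p);
    frees_run++;
}

/* Hypothetical allocator that simulates a failure on iteration fail_at. */
static void *alloc_or_fail(int i, int fail_at) {
    return i == fail_at ? NULL : malloc(8);
}

/* Runs the loop from the thread and reports how many cleanups executed. */
int run_loop(int fail_at) {
    frees_run = 0;
    for (int i = 0; i < 10; i++) {
        void *raw = alloc_or_fail(i, fail_at);
        if (!raw)
            break; /* exits before the guarded declaration: nothing to clean up */
        /* Stand-in for `defer { free(ptr); }` (GCC/Clang extension, not the proposed syntax). */
        __attribute__((cleanup(free_ptr))) void *ptr = raw;
        (void)ptr;
    } /* each iteration that reached the declaration runs free_ptr here */
    return frees_run;
}
```

Calling `run_loop(3)` performs three cleanups (iterations 0 through 2) and none for the failing iteration, which mirrors the "not reached, so not run" behaviour described in the reply.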
One small organizational issue: the last paragraph of 3.4 (all of which is duplicated in N3198) states that "for the macros we provide almost every single one will be defined and have the value of `1`", which makes sense in the context of N3198 but not N3199. N3198 is linked later, in 3.4.2; if N3198 were mentioned and linked prior to the note about MSVC and SEH, then the note could allude to "the macros provided in N3198."
There are, as far as most reviewers can tell, no errors in the creation or deletion of the various kinds of resources (particularly, repeated memory allocations).
Most reviewers? Each of the `return`s inside the loop will leak `namelist` itself, in addition to every `namelist[k]` for k in [i, n). Great proposal, though!
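The leak pattern described in this comment can be sketched as follows. This is not the proposal's code: `process_entries`, `tracked_alloc`, `tracked_free`, and the `live_blocks` counter are hypothetical stand-ins used to count what remains allocated after an early `return` inside the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical counter of blocks that are currently allocated. */
static int live_blocks = 0;

static void *tracked_alloc(size_t size) {
    void *p = malloc(size);
    if (p)
        live_blocks++;
    return p;
}

static void tracked_free(void *p) {
    if (p) {
        live_blocks--;
        free(p);
    }
}

/*
 * Builds n entries, then consumes them in order. Entry bad_index simulates
 * an error. With cleanup_on_error == 0 the function returns straight away,
 * as in the reported pattern; with 1 it releases the remaining entries first.
 */
int process_entries(int n, int bad_index, int cleanup_on_error) {
    assert(n > 0);
    char **namelist = tracked_alloc((size_t)n * sizeof *namelist);
    if (!namelist)
        return -1;
    for (int k = 0; k < n; k++)
        namelist[k] = tracked_alloc(16); /* NULL is tolerated by tracked_free */

    for (int i = 0; i < n; i++) {
        if (i == bad_index) {
            if (cleanup_on_error) {
                for (int k = i; k < n; k++)
                    tracked_free(namelist[k]);
                tracked_free(namelist);
            }
            return -1; /* without the cleanup: leaks namelist and namelist[i..n) */
        }
        tracked_free(namelist[i]);
    }
    tracked_free(namelist);
    return 0;
}
```

With `n = 5` and `bad_index = 2`, the early return without cleanup leaves four blocks live (the array plus entries 2, 3, and 4), assuming every allocation succeeded.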
Greetings, if implementation experience from an amateur compiler could help get this through, I wrote a little bit here along with the implementation.
Some feedback on C++ compatibility (I hope this is relevant, since the draft devotes a sizable amount of text to it):
The lambda-based scope guard implementation under "4.4. The Polyfill/C++ Fix" may not behave as expected (as I understand the proposed C `defer`) when copy elision is in effect. For example, EXAMPLE 4 converted to a struct may return 5 instead of 4 on all major C++ compilers: https://godbolt.org/z/q41rMo5Ex
My short discussion with LLVM people: https://github.com/llvm/llvm-project/issues/100869
> The lambda based scoped guard implementation under 4.4. The Polyfill/C++ Fix may not behave as expected (as my understanding of the proposed C defer) when copy-elision is in effect, for example EXAMPLE 4 converted to struct may return 5 instead of 4 on all major C++ compilers: https://godbolt.org/z/q41rMo5Ex
Nothing I can do about that, unfortunately. Maybe if C gets a form of applicable NRVO, but otherwise the intent and the effect is still matching and the same.
> Greetings, if implementation experience from an amateur compiler could help get this through, I wrote a little bit here along with the implementation.
Thank you for the implementation. At the moment `defer` is likely headed to a Technical Specification unless we can convince folks that it should just go straight into the next working draft instead. I might do that, but I'm tired of fighting people on this stuff, so I'm just writing a Technical Specification; I will link to and include your work in the next revision of the proposal, however.
Thanks, just keep doing what you're doing. I'll keep stalking wg14 documents for fun stuff to implement.
Hello, I just wanted to ask if any provisions to completely solve the resource leak and dangling resource problems in C were mentioned by the committee during the discussions of this feature. I know this feature is supposed to help with that, and I appreciate it very much, but it doesn't protect the user in a language level manner (you still need "better analysers and linters" and many reviewers to gain confidence about your resource use, and you still cannot be sure that you haven't missed anything). That being said, thank you for working on this, it's really useful.
C cannot fundamentally solve this problem without changing core parts of C.
Have you considered writing Rust?
I'm currently learning Rust, but that is not what this is about. I was just curious whether the committee considers this a part of a broader strategy, or just a one-off feature. Nevertheless, thank you.
If you'd like to know the Committee Strategy, you can see the Charter that was updated for C2y, which added a much higher focus on application security: https://www.open-std.org/JTC1/SC22/WG14/www/docs/n3280.htm
Nevertheless, the C committee cannot cook up features. People, like me, have to write proposals. They can be coordinated, but generally they are done independently. Other people are trying their own things which they have shown interest in standardizing but have written no papers yet, such as: https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854 https://llvm.org/devmtg/2023-05/slides/TechnicalTalks-May11/01-Na-fbounds-safety.pdf https://www.youtube.com/watch?v=RK9bfrsMdAM https://clang.llvm.org/docs/BoundsSafety.html
Thanks for all the work that this must have taken. It's exciting to see that things like this are on the cards.
I'm curious about the rationale for proposing the `defer {<cleanup code>} <protected code>` structure rather than something like `do {<protected code>} finally {<cleanup code>}`.
The `do {...} finally {...}` structure would seem dramatically more intuitive to me as a piece of C syntax, but moreover I think it has some distinct advantages which would make it easier to teach and explain the semantics (especially when it comes to the interactions with `return`/`break`/`continue`/`goto`):
I'm sure this has had consideration, so I'd like to hear your perspective. I think I saw in the minutes of one of the meetings that there had been some debate on this aspect of the proposal (but there was no detail).
This is discussed in the "Implementation Experience" section: https://thephd.dev/_vendor/future_cxx/papers/C%20-%20Improved%20__attribute__((cleanup))%20Through%20defer.html#experience
Specifically:
This means one has to move all variable declarations out of the try, in order to reference them properly in the finally. All in all, this means that while such a movement/changing of how variables are created in that block scope, it encourages unsafe practices such as keeping large amounts of uninitialized data. This encourages setting variables to uninitialized / sentinel values, and then properly initializing them in the try block before deploying a finally. We would rather not pursue such an option in C.
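The trade-off in the quoted passage can be sketched in today's C. This is only an approximation under stated assumptions: a `goto finally` label plays the role of a `finally` block, and the GCC/Clang `__attribute__((cleanup))` extension stands in for `defer`; `finally_style`, `defer_style`, and `free_charp` are hypothetical names. In the first function every resource is declared up front with a sentinel so the shared cleanup region can see it; in the second, each resource is declared and guarded at the point where it is acquired.

```c
#include <assert.h>
#include <stdlib.h>

/* finally-like shape: one cleanup region at the end of the function. */
int finally_style(int fail_second) {
    int rc = -1;
    char *a = NULL; /* hoisted and set to a sentinel so the cleanup region can see it */
    char *b = NULL; /* same for the second resource */

    a = malloc(16);
    if (!a)
        goto finally;
    b = fail_second ? NULL : malloc(16); /* fail_second simulates a failure */
    if (!b)
        goto finally;
    rc = 0;

finally:
    free(b); /* free(NULL) is a no-op */
    free(a);
    return rc;
}

/* Cleanup handler: receives the address of the guarded variable. */
static void free_charp(char **p) {
    assert(p != NULL);
    free(*p);
}

/* defer-like shape: cleanup is attached where each resource is created. */
int defer_style(int fail_second) {
    __attribute__((cleanup(free_charp))) char *a = malloc(16);
    if (!a)
        return -1;
    __attribute__((cleanup(free_charp))) char *b = fail_second ? NULL : malloc(16);
    if (!b)
        return -1; /* a is released on the way out; b is NULL, so its free is a no-op */
    return 0;
}
```

Both functions return the same results; the difference is where the declarations live and whether sentinel values are needed before the resources are actually acquired.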
There is another option, which is `__try { ... __finally: ... }`. This has the same benefits of appearing in the order in which it runs, but still within a limited scope. It is also intuitive because the code flows naturally into the `__finally:` stage unless a `return` already sent it there before returning. Was this considered?
Latest draft: https://thephd.dev/_vendor/future_cxx/papers/C%20-%20Improved%20__attribute__((cleanup))%20Through%20defer.html