Closed pkillarjun closed 1 week ago
What Happened: I was trying to change the commit message in https://github.com/nginx/unit/pull/1292.
Suddenly, GitHub closed my PR. After that, I force-updated it, which corrupted the git history for that branch and uploaded junk in my PR.
I did fix everything, but I can't reopen that branch, thanks github. https://github.com/isaacs/github/issues/361
I did find found one fix, but before that, I deleted my branch and recreated it. Now I can't use this fix, it seems.
I get it. lhq->pool is passed through these functions for unbeknownst reason.
Not unknown actually. It's simply the prototype for these functions
typedef void *(*nxt_lvlhsh_alloc_t)(void *ctx, size_t size);
typedef void (*nxt_lvlhsh_free_t)(void *ctx, void *p);
Also a very simple question: Does this initialization of value break anything in this code-base?
I can't say with 100% certainty, but I'd doubt it.
However, while I'm not against doing things to appease tools, I want to fully understand why (and it needs to be fully explained in the commit message).
There are of course numerous reasons not to just go initialising everything, but one is that it could hide legitimate bugs.
In this case setting lhq.pool
to NULL
also seems dubious, as this parameter, for functions that use it, points to an area of memory for dynamic allocation.
On the other hand setting it to NULL
could simply mean it's not used... I'm really in two minds about this...
So I have a couple of questions
1) You say that the second patch hunk wasn't important, why is that? 2) Similarly, why is it important in the first case? Also I'm curious how exactly we get into the situation where lhq.pool is being used when it shouldn't?
Hmm, is it simply the act of passing lhq.pool (uninitialised) to the alloc() function, even though it isn't used, that it's complaining about?
Actually I'm wondering why we don't get a -Wunused-parameter
warning from
void *
nxt_lvlhsh_alloc(void *data, size_t size)
{
return nxt_memalign(size, size);
}
for the data
parameter...
Oh, damn! we're compiling with -Wno-unused-parameter
, I wonder what hell breaks loose if we don't...
I would have no issues with marking data
as __attribute__((unused))
, if that would be taken into account by the sanitizer?
This does however seem a bit of a misnomer
SUMMARY: MemorySanitizer: use-of-uninitialized-value /src/unit/src/nxt_lvlhsh.c:292:14 in nxt_lvlhsh_new_bucket
As we never actually tried to use it. Like it's valid to get a pointer to 1-past the end of an array, as long to you don't dereference it....
If we had actually tried to use it in the alloc() function, then sure... So in this case, I do think the sanitizer needs to be smarter...
You say that the second patch hunk wasn't important, why is that?
It was overwork. That 'bug' did not affect my fuzzers in any way, shape, or form, so I reverted it.
Similarly, why is it important in the first case? Also I'm curious how exactly we get into the situation where lhq.pool is being used when it shouldn't?
I don't think I understand your question, but I think you are asking about nxt_http_fields_hash;
Hmm, is it simply the act of passing lhq.pool (uninitialised) to the alloc() function, even though it isn't used, that it's complaining about?
Da, Simple sample to reproduce:
#include <stdio.h>
void test_func(int idx);
int
main(void) {
int idx;
test_func(idx);
return 0;
}
void
test_func(int idx) {
idx = 0;
printf("The init is: %d\n", idx);
return;
}
clang \
-O0 -fno-omit-frame-pointer -gline-tables-only -Wno-error=enum-constexpr-conversion \
-Wno-error=incompatible-function-pointer-types -Wno-error=int-conversion -Wno-error=deprecated-declarations \
-Wno-error=implicit-function-declaration -Wno-error=implicit-int -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION \
-fsanitize=memory -fsanitize-memory-track-origins \
file.c -o file
./file
I would have no issues with marking data as attribute((unused)), if that would be taken into account by the sanitizer?
This does however seem a bit of a misnomer
First, it's a bad idea to add it in production because it only works with Clang, and other compilers will probably throw an error. Second, it's:
__attribute__((no_sanitize("memory"))) static nxt_int_t
nxt_lvlhsh_new_bucket(nxt_lvlhsh_query_t *lhq, void **slot)
__attribute__((no_sanitize("memory"))) static nxt_int_t
nxt_lvlhsh_convert_bucket_to_level(nxt_lvlhsh_query_t *lhq, void **slot,
I would have no issues with marking data as attribute((unused)), if that would be taken into account by the sanitizer? This does however seem a bit of a misnomer
First, it's a bad idea to add it in production because it only works with Clang, and other compilers will probably throw an error.
__attribute__((unused))
is a bad idea?
It works perfectly fine in GCC... we already use it in Unit... I've been using it with GCC for years...
Clang and GCC are currently all we care about... (I just recently removed support for all kinds of weird compilers!).
Second, it's:
__attribute__((no_sanitize("memory"))) static nxt_int_t nxt_lvlhsh_new_bucket(nxt_lvlhsh_query_t *lhq, void **slot) __attribute__((no_sanitize("memory"))) static nxt_int_t nxt_lvlhsh_convert_bucket_to_level(nxt_lvlhsh_query_t *lhq, void **slot,
That's a little heavy handed, you only want to avoid the sanitizer for certain alloc() functions...
However I'm starting to lean towards applying this. It certainly needs a full explanation in the commit message however, which I'm happy to write, (including about if the compiler was smart enough it'd give a -Wuninitialized warning)...
For example it will catch this
/* u.c */
struct t {
void *p;
int len;
};
static void test(void *p __attribute__((unused)), int len)
{
(void)len;
}
int main(void)
{
struct t t;
t.len = 42;
test(t.p, t.len);
return 0;
}
$ gcc -Wall -Wextra -O0 u.c
u.c: In function ‘main’:
u.c:16:9: warning: ‘t.p’ is used uninitialized [-Wuninitialized]
16 | test(t.p, t.len);
| ^~~~~~~~~~~~~~~~
u.c:13:18: note: ‘t’ declared here
13 | struct t t;
| ^
$
But not this
/* u2.c */
struct t {
void *p;
int len;
};
static void _test(void *p __attribute__((unused)), int len)
{
(void)len;
}
static void test(struct t *t)
{
_test(t->p, t->len);
}
int main(void)
{
struct t t;
t.len = 42;
test(&t);
return 0;
}
$ gcc -Wall -Wextra -O0 u2.c
$
attribute((unused)) is a bad idea?
It works perfectly fine in GCC... we already use it in Unit... I've been using it with GCC for years...
Clang and GCC are currently all we care about... (I just recently removed support for all kinds of weird compilers!).
Microsoft(I don't care about them either, but it's just a thought.) ?
That's a little heavy handed, you only want to avoid the sanitizer for certain alloc() functions...
The patch would be something like this:
diff --git a/src/nxt_lvlhsh.c b/src/nxt_lvlhsh.c
index 7a8b3dd..5821446 100644
--- a/src/nxt_lvlhsh.c
+++ b/src/nxt_lvlhsh.c
@@ -284,7 +284,7 @@ nxt_lvlhsh_insert(nxt_lvlhsh_t *lh, nxt_lvlhsh_query_t *lhq)
}
-static nxt_int_t
+MSAN static nxt_int_t
nxt_lvlhsh_new_bucket(nxt_lvlhsh_query_t *lhq, void **slot)
{
uint32_t *bucket;
@@ -448,7 +448,7 @@ nxt_lvlhsh_bucket_insert(nxt_lvlhsh_query_t *lhq, void **slot, uint32_t key,
}
-static nxt_int_t
+MSAN static nxt_int_t
nxt_lvlhsh_convert_bucket_to_level(nxt_lvlhsh_query_t *lhq, void **slot,
nxt_uint_t nlvl, uint32_t *bucket)
{
diff --git a/src/nxt_main.h b/src/nxt_main.h
index 7880e55..929a3db 100644
--- a/src/nxt_main.h
+++ b/src/nxt_main.h
@@ -169,4 +169,10 @@ NXT_EXPORT extern nxt_task_t nxt_main_task;
NXT_EXPORT extern nxt_atomic_t nxt_task_ident;
+#if defined(__has_feature) && __has_feature(memory_sanitizer)
+ #define MSAN __attribute__((no_sanitize("memory")))
+#else
+#define MSAN
+#endif
+
#endif /* _NXT_LIB_H_INCLUDED_ */
MSAN
can be replaced with __no_instrument
, __no_instrument
can be used with ASAN
, UBSAN
and TSan
.
However I'm starting to lean towards applying this. It certainly needs a full explanation in the commit message however, which I'm happy to write, (including about if the compiler was smart enough it'd give a -Wuninitialized warning)...
Well __attribute__((unused))
isn't working, it still gives the same error.
attribute((unused)) is a bad idea? It works perfectly fine in GCC... we already use it in Unit... I've been using it with GCC for years... Clang and GCC are currently all we care about... (I just recently removed support for all kinds of weird compilers!).
Microsoft(I don't care about them either, but it's just a thought.) ?
Well, we don't run on Windows and Mircrosofts C compilers have historically at least been pretty terrible from what I hear.
That's a little heavy handed, you only want to avoid the sanitizer for certain alloc() functions...
The patch would be something like this: ...
MSAN
can be replaced with__no_instrument
,__no_instrument
can be used withASAN
,UBSAN
andTSan
.
No I mean disabling sanitization for these functions when it's only some of the alloc() functions that need to be skipped, which doesn't seem possible...
However I'm starting to lean towards applying this. It certainly needs a full explanation in the commit message however, which I'm happy to write, (including about if the compiler was smart enough it'd give a -Wuninitialized warning)...
While
__attribute__((unused))
is working for Msan, it still gives the same error.
No, I mean the setting of lhq.pool to NULL patch...
No I mean disabling sanitization for these functions when it's only some of the alloc() functions that need to be skipped, which doesn't seem possible...
Oh i Misread you. Yes, a line it not possible.
No, I mean the setting of lhq.pool to NULL patch...
Alright.
It certainly needs a full explanation in the commit message however, which I'm happy to write
You can do it, and thanks for the help;
Can you merge this PR and add a good commit message?
On Mon, 03 Jun 2024 23:36:09 -0700 Arjun @.***> wrote:
Can you merge this PR and add a good commit message?
I do plan to, yes...
Added a commit message
$ git range-diff 92739f79...f1b35623
1: 92739f79 ! 1: f1b35623 http: fix use-of-uninitialized-value bug
@@ Metadata
## Commit message ##
http: fix use-of-uninitialized-value bug
+ This was found via MSan.
+
+ In nxt_http_fields_hash() we setup a nxt_lvlhsh_query_t structure and
+ initialise a couple of its members.
+
+ At some point we call
+
+ lhq->proto->alloc(lhq->pool, nxt_lvlhsh_bucket_size(lhq->proto));
+
+ Which in this case is
+
+ void *
+ nxt_lvlhsh_alloc(void *data, size_t size)
+ {
+ return nxt_memalign(size, size);
+ }
+
+ So even though lhq.ppol wasn't previously initialised we don't actually
+ use it in that particular function.
+
+ However MSan triggers on the fact that we are passing an uninitialised
+ value into that function.
+
+ Indeed, compilers will generally complain about such things, e.g
+
+ /* u.c */
+ struct t {
+ void *p;
+ int len;
+ };
+
+ static void test(void *p __attribute__((unused)), int len)
+ {
+ (void)len;
+ }
+
+ int main(void)
+ {
+ struct t t;
+
+ t.len = 42;
+ test(t.p, t.len);
+
+ return 0;
+ }
+
+ GCC and Clang will produce a -Wuninitialized warning.
+
+ But they won't catch the following...
+
+ /* u2.c */
+ struct t {
+ void *p;
+ int len;
+ };
+
+ static void _test(void *p __attribute__((unused)), int len)
+ {
+ (void)len;
+ }
+
+ static void test(struct t *t)
+ {
+ _test(t->p, t->len);
+ }
+
+ int main(void)
+ {
+ struct t t;
+
+ t.len = 42;
+ test(&t);
+
+ return 0;
+ }
+
+ Which is why we don't get a compiler warning about lhq.pool.
+
+ In this case initialising lhg.pool even though we don't use it here
+ seems like the right thing to do and maybe compilers will start being
+ able to catch these in the future.
+
+ Actually GCC with -fanalyzer does catch the above
+
+ $ gcc -Wall -Wextra -O0 -fanalyzer u2.c
+ u2.c: In function ‘test’:
+ u2.c:15:9: warning: use of uninitialized value ‘*t.p’ [CWE-457] [-Wanalyzer-use-of-uninitialized-value]
+ 15 | _test(t->p, t->len);
+ | ^~~~~~~~~~~~~~~~~~~
+ ‘main’: events 1-3
+ |
+ | 18 | int main(void)
+ | | ^~~~
+ | | |
+ | | (1) entry to ‘main’
+ | 19 | {
+ | 20 | struct t t;
+ | | ~
+ | | |
+ | | (2) region created on stack here
+ |......
+ | 23 | test(&t);
+ | | ~~~~~~~~
+ | | |
+ | | (3) calling ‘test’ from ‘main’
+ |
+ +--> ‘test’: events 4-5
+ |
+ | 13 | static void test(struct t *t)
+ | | ^~~~
+ | | |
+ | | (4) entry to ‘test’
+ | 14 | {
+ | 15 | _test(t->p, t->len);
+ | | ~~~~~~~~~~~~~~~~~~~
+ | | |
+ | | (5) use of uninitialized value ‘*t.p’ here
+ |
+
Signed-off-by: Arjun <pkillarjun@protonmail.com>
+ Link: <https://clang.llvm.org/docs/MemorySanitizer.html>
+ [ Commit message - Andrew ]
+ Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
## src/nxt_http_parse.c ##
@@ src/nxt_http_parse.c: nxt_http_fields_hash(nxt_lvlhsh_t *hash,
Rebased with master...
$ git range-diff f1b35623...04a24f61
-: -------- > 1: 4fc50258 ci: Be more specific when to run the main Unit checks
-: -------- > 2: e77a0c16 Tests: explicitly specify 'f' prefix to format string before printing
-: -------- > 3: c9dced37 Tests: print unit.log on unsuccessful unmount
-: -------- > 4: 98983f3f Add a GitHub discussions badge to the README
-: -------- > 5: a7e3686a Tools: improved error handling for unitc
-: -------- > 6: d7ec30c4 ci: Limit when to run checks on pull-requests
1: f1b35623 = 7: 04a24f61 http: fix use-of-uninitialized-value bug
oss-fuzz Issue 68458. My Old PR https://github.com/nginx/unit/pull/1292