Open jprotze opened 1 month ago
Curious whether the compiler removed the branch because rbuf
is uninitialized of because root
is uninitialized in this example? Agreed that both are UB and we should be careful not to promote that.
In our concrete code, we had root fixed to 0 and more code executed next to the malloc. A printf
would even print rank as 0 for all ranks:
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char **argv) {
if (argc != 2) {
printf("No valid size given\n");
return 1;
}
int n = atoi(argv[1]);
MPI_Init(&argc, &argv);
int rank, wsize;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &wsize);
n = n - n % wsize;
int n_per_process = n / wsize;
float *x;
float *x_root;
if (rank == 0) {
printf("From Rank %d: Initializing vectors...\n", rank);
fflush(stdout);
x_root = (float *)malloc(n * sizeof(float));
memset(x_root, 0, n * sizeof(float));
}
x = (float *)malloc(n_per_process * sizeof(float));
memset(x, 0, n_per_process * sizeof(float));
MPI_Scatter(x_root, n_per_process, MPI_FLOAT, x, n_per_process, MPI_FLOAT, 0,
MPI_COMM_WORLD);
MPI_Finalize();
free(x);
return 0;
}
I don't believe any of the examples are intended to be complete working code. Do they even include all the necessary headers to compile?
That is correct, most examples are code snippets. The code check uses LaTeX comments to describe things like headers and declarations needed to see if the code could be compiled - this test has caught many errors, such as misspellings, out of order arguments, and missing semicolons, in the code snippets.
However, we do want the code snippets to be valid and compileable with the additional info, so things like ensure rbuf is initialized is reasonable for the examples.
Here is a semantically equivalent reproducer without MPI to look at the compiler optimization observed above on godbolt. For clang the optimization seems to be introduced with version 14:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int test(int n, char *c) {
if (n == 0)
return 0;
return atoi(c);
}
int main(int argc, char **argv) {
char *buf;
if (argc > 1) {
buf = strdup(argv[1]);
}
int res = test(argc - 1, buf);
return res;
}
Problem
The wording about "significant only at root" is a bit sloppy and at least one example is broken. Although the parameters will not be read by the function on non-root ranks, passing uninitialized variables to the function is UB according to the C standard. Example 6.3 does not initialize
rbuf
on non-root ranks and passes this variable toMPI_Gather
, which is UB. As a result, we observed theif (myrank == root)
to be optimized away by clang 14 or newer and icx 2022.1 or newer.Proposal
Changes to the Text
Change
to
And possibly explain why this initialization is important?
In the introduction of Chapter 6:
Add:
Impact on Implementations
None
Impact on Users
Avoid running into non-obvious UB by misunderstanding “significant only at root”
References and Pull Requests