ryanhaining / cppitertools

Implementation of python itertools and builtin iteration functions for C++17
https://twitter.com/cppitertools
BSD 2-Clause "Simplified" License
1.37k stars 115 forks source link

Internal linkage of iter::chain leads to ODR violation. #104

Closed MaksimIM closed 4 months ago

MaksimIM commented 6 months ago

Having chain in anonymous namespace here makes it internal linkage; when a template instantiates multiple versions of it, an ODR violation occurs.

ryanhaining commented 4 months ago

The cause is easier to see with a repro that uses a mutable variable. A header defines a variable in the anonymous namespace, then modifies it in an inline function. The code and Makefile are available in inlinefuncodr.tar.gz

// example.h
#ifndef EXAMPLE_HDR_
#define EXAMPLE_HDR_

namespace {
  int x;
}

inline void inc_x() {
  ++x;
}

#endif

Then two very similar files:

// ex1.cpp
#include "example.h"

#include <iostream>

void a() {
  inc_x();
  inc_x();
  std::cout << "a(): " << &x << " = " << x <<'\n';
}

and

// ex2.cpp
#include "example.h"

#include <iostream>

void b() {
  inc_x();
  inc_x();
  std::cout << "b(): " << &x << " = " << x <<'\n';
}

and finally a main

// main.cpp
#include "example.h"

#include <iostream>

void a();
void b();

int main() {
  a();  // should inc_x twice and print 2
  b();  // should inc_x twice and print 4
  std::cout << "main: " << &x << " = " << x <<'\n';  // should print 4
}

Expected output would be the same address for x and 2, 4, 4.

The exact output differs by compiler flags but I get either 2, 2, 0; or 2, 0, 0. Always with three different addresses.

The problem is an ODR violation: inc_x() differs across translation units because the anonymous namespace is creating a different x with each #include. The x in ex1.cpp is not the same as the x in ex2.cpp. We get undefined behavior. (UBSan with -fsanitize=undefined doesn't catch it)

Removing the anonymous namespace isn't enough because then we have a different ODR violation (this time with an actual linker warning about the multiple definitions of x). We need to mark x as inline

// FIXED example.h
#ifndef EXAMPLE_HDR_
#define EXAMPLE_HDR_

inline int x;

inline void inc_x() {
  ++x;
}

#endif

The same fix is correct for itertools: remove the anonymous namespace around chain and similar, and mark it as inline. Originally I had used the anonymous namespace to avoid multiple definitions because it was pre-c++-17 so inline variables didn't yet exist.