akheron / jansson

C library for encoding, decoding and manipulating JSON data
http://www.digip.org/jansson/
Other
3.05k stars 811 forks source link

build: Add a symbol version to all exported symbols #540

Closed smcv closed 4 years ago

smcv commented 4 years ago

The --default-symver linker option attaches a default version definition (the SONAME) to every exported symbol. It is supported since at least GNU binutils 2.22 in 2011 (older versions not tested).

With this version definition, newly-linked binaries that depend on the jansson shared library will refer to its symbols in a versioned form, preventing their references from being resolved to a symbol of the same name exported by json-c or json-glib if those libraries appear in dependency search order before jansson, which will usually result in a crash. This is necessary because ELF symbol resolution normally uses a single flat namespace, not a tree like Windows symbol resolution. At least one symbol (json_object_iter_next()) is exported by all three JSON libraries.

Linking with -Bsymbolic is not enough to have this effect in all cases, because -Bsymbolic only affects symbol lookup within a shared object, for example when parse_json() calls json_decref(). It does not affect calls from external code into jansson, unless jansson was statically linked into the external caller.

This change will also not prevent code that depends on json-c or json-glib from finding jansson's symbols and crashing; to prevent that, a corresponding change in json-c or json-glib would be needed.

Adding a symbol-version is a backwards-compatible change, but once added, removing or changing the symbol-version would be an incompatible change that requires a SONAME bump.

Resolves: https://github.com/akheron/jansson/issues/523 (when combined with an equivalent change to json-c).


I am not a jansson expert, please review and test carefully.

coveralls commented 4 years ago

Coverage Status

Coverage increased (+0.7%) to 96.154% when pulling ca80d5127e56dd2ee056fdf3f4204017b78b89e0 on smcv:default-symver into 52dfc3dd4a766fc2f1a19edad5d5cfdd21e0ec27 on akheron:master.

akheron commented 4 years ago

Could you also add the same changes for CMake?

smcv commented 4 years ago

Sorry, I didn't notice the parallel build system. I tried to make the equivalent change in json-c's build system (json-c/json-c#639) but I'm not sure how to do that in a way that is compatible with old CMake. Would a CMake 3.13 dependency be acceptable?

ploxiln commented 4 years ago

another idea for CMake is something like:

if (CMAKE_VERSION VERSION_GREATER_EQUAL 3.13)
    target_link_options(...)
endif()

(not sure though)

smcv commented 4 years ago
if (CMAKE_VERSION VERSION_GREATER_EQUAL 3.13)

That's a bad idea here. Adding symbol versions is a backwards-compatible but not forwards-compatible ABI change (just like adding a new function), so if builds of jansson version x.y.z for Linux + glibc have symbol versions, then all builds of version >= x.y.z need to have them.

(Linux with glibc has supported versioned symbols for a very long time, but I'm not so sure what happens with non-GNU Linux systems like Android and musl. In the Autotools implementation I've only enabled versioned symbols for GNU/Linux, i.e. Linux with glibc, but I'm not sure how you spell that in CMake.)

ploxiln commented 4 years ago

presumably that condition would be true for any linux distro still building updated packages, and false for the old embedded buildsystems that compatibility with old cmake is being kept for

ploxiln commented 4 years ago

(but if that doesn't work with musl libc, I don't know how to detect and avoid that)

mbiebl commented 4 years ago

what happens if you build with musl and that link flag? Does it error out or ignore it?

akheron commented 4 years ago

Jansson is not explicitly retaining compatibility with old cmake. I’m fine requiring a newer version of it.

karelzak commented 4 years ago

The default symbols versioning is a poor man solution. It's better to start with explicit symbols versioning. We use symbols versioning in libblkid, libmount, libfdisk, libsmartcols, libcryptsetup etc for more than 10 years and the number of bug reports related to conflicts or compatibility is zero. (EDIT: I don't think you want to run applications compiled against a new lib version with old lib version. Note also that packaging systems like RPM are able to follow symbols versioning and packages dependencies are maintained on this level rather than on soname).

Copy & past my comment from json-c:

Guys, symbols versioning is the very basic thing you need in an arbitrary serious shared library. Really.

It's the way how to maintain backward compatibility, avoid soname changes, and how to avoid symbols conflicts. And it's very very simple, all you need is one file.

Example: https://github.com/karelzak/util-linux/blob/master/libmount/src/libmount.sym

You also want to read Ulrich's excellent shared libs how-to (3.3. ABI versioning): https://software.intel.com/sites/default/files/m/a/1/e/dsohowto.pdf

smcv commented 4 years ago

If the maintainers of jansson are willing to maintain full symbol-versioning as @karelzak suggests, I can redo this merge request to do that. However, that has an ongoing maintenance requirement (each new symbol added to the library needs to be added to the file listing the symbols), so I am not going to propose that unless the maintainers of jansson will take responsibility for doing so in future releases. Please let me know which.

I deliberately proposed this simpler version in the hope that it would maximize the chance of getting some sort of symbol-versioning, that is at least sufficient to disambiguate between the three JSON libraries, adopted by each of them.

akheron commented 4 years ago

We already have a list of all exported symbols in src/jansson.def, and the test suite checks that the file lists all json_ and jansson_ symbols. Could that be leveraged somehow?

smcv commented 4 years ago

We already have a list of all exported symbols in src/jansson.def, and the test suite checks that the file lists all json and jansson symbols. Could that be leveraged somehow?

Sure, if you want to. If you want to do the form of symbol versioning that @karelzak advocates, then the information it needs is basically a list of (symbol, version) pairs. That's strictly more information than src/jansson.def, so you could generate src/jansson.def from that list instead of maintaining it by hand and committing it, or the test suite could assert that every symbol in src/jansson.def also exists in your equivalent of libmount's libmount.sym.

There are lots of possible ways to implement this, and I don't know this library or your maintenance workflow or requirements well enough to say that one is inherently better than the others.

For the collision between jansson, json-c and json-glib, what matters is that each of the three libraries has some sort of symbol-versioning - either the basic form I proposed here where the symbol version resembles the SONAME (as used in libpng and libtiff), or the more elaborate form where the version that introduced each symbol is tracked (as used in for example libmount, libgcab and OpenSSL).

akheron commented 4 years ago

Let’s go with the —default-symver option. Ease of maintaining is the highest priority. There have been very few changes to the library in the last years anyway.

CMake config will still need to be updated. I’m willing to merge and release a new version after that.

smcv commented 4 years ago

CMake config will still need to be updated

Does anyone reading this happen to know whether there's a nice way to ask "is this glibc?" in a CMake build system? (Sorry, I'm not a CMake expert.)

smcv commented 4 years ago

Let’s go with the —default-symver option. Ease of maintaining is the highest priority.

OK!

If you later decide this was the wrong decision, and want to go with the full version tracking as Karel recommended, then it will still be possible to do that in a backwards-compatible way. You'd do that by using your SONAME as the version that is used for symbols older than the version where you do the switchover, and a detailed version number for newer symbols. So if we assume you switch from --default-symver to detailed versioning in version 23.0, you'd get something like:

libjansson.so.4 {
  global:
    ... all the symbols you had before version 23 ...
}

LIBJANSSON_23.0 {
  global:
    jansson_some_symbol_added_in_v23_0;
    jansson_other_symbol_added_in_v23_0;
} libjansson.so.4;

LIBJANSSON_23.1 {
  global:
    jansson_new_symbol_added_in_v23_1;
} LIBJANSSON_23.0;

and so on.

pinotree commented 4 years ago

Does anyone reading this happen to know whether there's a nice way to ask "is this glibc?" in a CMake build system?

include(CheckSymbolExists)
check_symbol_exists("__GLIBC__" "stdlib.h" LIBC_IS_GLIBC)
if (LIBC_IS_GLIBC)
  # do something for glibc
endif ()
smcv commented 4 years ago

what happens if you build with musl and that link flag? Does it error out or ignore it?

Honestly, I have no idea. If I can get it to work in CMake, I think the ideal thing right now would be to detect either GNU/Linux or GNU/anything (i.e. glibc), use versioned symbols there, and not use versioned symbols anywhere else; and then if people who use another libc or another kernel want to opt-in to having versioned symbols, it's their responsibility to test it and supply a working build-system change.

smcv commented 4 years ago

If I can get it to work in CMake, I think the ideal thing right now would be to detect either GNU/Linux or GNU/anything (i.e. glibc)

Done. I'm now using a check for the __GLIBC__ macro in both cases (thanks, @pinotree!) in the hope that this will ensure they both give the same answer.

karelzak commented 4 years ago

If you later decide this was the wrong decision, and want to go with the full version tracking as Karel recommended, then it will still be possible to do that in a backwards-compatible way. You'd do that by using your SONAME as the version that is used for symbols older than the version where you do the switchover, and a detailed version number for newer symbols. So if we assume you switch from --default-symver to detailed versioning in version 23.0, you'd get something like:


libjansson.so.4 {
  global:
    ... all the symbols you had before version 23 ...
}

Note that it's a good idea to explicitly mark also all other symbols as local, it means:


libjansson.so.4 {
   global:
     ... all the symbols you had before version 23 ...
   local:
     *;
};
``
cgwalters commented 2 years ago

So this change is now in RHEL8, and rpm-ostree has a new hard dependency on the versioned symbols. I was curious about this and found this PR.

And looking through it...I just want to say thank you :pray: ! I didn't know that all 3 json C libraries had a colliding symbol :scream:

I'm going to hide this bit in a detail but...stuff like this just so strongly reinforces my believe that we need to be draining more C code into Rust. First, correctly handling JSON parsing in C is just hard and ugly - memory safety is hugely important. And further of course Rust namespaces all symbols and handles multiple versions of a dependency in the system gracefully.
xfliu2665 commented 4 months ago

great jobs done! but for some reasons i can't update the lib-ver, that's to say i have to use jansson-2.11 which has not merged the feature yet.

my problem is that i got a func named json_object in static-lib(i can't change its name for some reasons as well), which is conflicted with jansson-lib(i use it as a dynamic-lib). i wonder is there any to make sure json_object in jansson-lib is called first under the case not invade my code.