marzer / tomlplusplus

Header-only TOML config file parser and serializer for C++17.
https://marzer.github.io/tomlplusplus/
MIT License
1.47k stars 141 forks source link

Parsing large integer value invokes undefined behaviour #188

Closed ghost closed 1 year ago

ghost commented 1 year ago

Environment

toml++ version and/or commit hash:
version 3.2.0, commit 698285d9b2f3f6756fcdab8b93f60352325764e1 on master branch

Compiler:
GCC 10.2.1

C++ standard mode:
-std=c++17

Target arch:
x86_64

Library configuration overrides:
none

Relevant compilation flags:
-std=c++17 -O2 -fsanitize=undefined -fsanitize=address

Describe the bug

When parsing the integer value -9223372036854775808 (i.e. -2^63), the parser constructs an intermediate uint64_t value 9223372036854775808. It then casts this value to int64_t, which has implementation defined behaviour and results in -9223372036854775808. It then attempts to multiply this value by -1 which has undefined behaviour because the result can not be represented as int64_t.

Steps to reproduce (or a small repro code sample)

Compile as follows: g++ -Iinclude -Wall -Wextra -Wpedantic -Werror -std=c++17 -O2 -fsanitize=undefined -fsanitize=address -o tt_decoder toml-test/tt_decoder.cpp

Then run tt_decoder on the following TOML document:

v = -9223372036854775808

The program prints the following error message:

include/toml++/impl/parser.inl:2272:44: runtime error: signed integer overflow: -1 * -9223372036854775808 cannot be represented in type 'long int'

Additional information

marzer commented 1 year ago

Nice find! Will look into this later today :)

marzer commented 1 year ago

Sorry, ran out of time. Will have a look during the week.

marzer commented 1 year ago

Looking into this now. There's already checks where I'm rejecting numbers that are unrepresentable, just that there's a gap in the sign handling logic for exactly INT64_MIN 😅

ghost commented 1 year ago

You're right :smile: I'm almost sorry for reporting it. But it's still worthwhile to eliminate an undefined codepath I guess.

marzer commented 1 year ago

Eh, I'm happy you did. It's probably not that uncommon a value in the wild since people often use INT_MIN/MAX as sentinels for various things.