marzer / tomlplusplus

Header-only TOML config file parser and serializer for C++17.
https://marzer.github.io/tomlplusplus/
MIT License

Undefined references linker error on nvc++ #220

Open Tomcat-42 opened 8 months ago

Tomcat-42 commented 8 months ago

Environment

toml++ version and/or commit hash: v3.4.0

Compiler: nvc++ 23.11:

nvc++ 23.11-0 64-bit target on x86-64 Linux -tp tigerlake
NVIDIA Compilers and Tools
Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

C++ standard mode: 17, 20 and 23

Target arch: x64

Library configuration overrides:
None.

Relevant compilation flags:
None.

Describe the bug

Every time I try to use the toml::parse_file function, I get these undefined reference errors:

/usr/bin/ld: build/.objs/simulator/linux/x86_64/release/src/simulator/util/toml.cpp.o: in function `toml::v3::impl::impl_ex::parser::parse_table_header()':                                                                                                                                 
/home/pablo/.xmake/packages/t/toml++/v3.4.0/305579b325f940cbb9bfa2d2a8effc4b/include/toml++/impl/parser.inl:3269:(.text+0x1872c): undefined reference to `toml::v3::node::is_table() const'                                                                                                 
/usr/bin/ld: /home/pablo/.xmake/packages/t/toml++/v3.4.0/305579b325f940cbb9bfa2d2a8effc4b/include/toml++/impl/parser.inl:3269:(.text+0x18738): undefined reference to `toml::v3::node::is_array_of_tables() const'              

Steps to reproduce (or a small repro code sample)

./example.toml

key = "val"

./main.cpp:

#include <iostream>

#include "toml.hpp"

int main() {
  toml::table t = toml::parse_file("example.toml");
  std::cout << t << std::endl;
}

./toml.hpp: toml v3.4.0 header

Additional information

Perhaps this is related to #198?

marzer commented 8 months ago

Huh, this is a weird one. Are you using the library in "header-only" mode (the default if the header is standalone), or precompiled?

Tomcat-42 commented 8 months ago

I'm using it in header-only mode.

marzer commented 8 months ago

Would you mind testing the "precompiled" mode for me by disabling TOML_HEADER_ONLY? See the "Speeding up compilation" section of the main page for details.

Tomcat-42 commented 8 months ago

The error continues:

/usr/bin/ld: /tmp/nvc++dx_jAYPPLLB.o: in function `toml::v3::toml_formatter::print_inline(toml::v3::table const&)':
/home/pablo/test/toml.hpp:17045:(.text+0x29aba): undefined reference to `toml::v3::node::type() const'
/usr/bin/ld: /tmp/nvc++dx_jAYPPLLB.o: in function `toml::v3::toml_formatter::print(toml::v3::table const&)':
/home/pablo/test/toml.hpp:17137:(.text+0x29faf): undefined reference to `toml::v3::node::type() const'
/usr/bin/ld: /home/pablo/test/toml.hpp:17173:(.text+0x2a1bd): undefined reference to `toml::v3::node::type() const'
/usr/bin/ld: /home/pablo/test/toml.hpp:17186:(.text+0x2a2ac): undefined reference to `toml::v3::node::type() const'
/usr/bin/ld: /tmp/nvc++dx_jAYPPLLB.o: in function `toml::v3::json_formatter::print(toml::v3::table const&)':
/home/pablo/test/toml.hpp:17345:(.text+0x2ab13): undefined reference to `toml::v3::node::type() const'
/usr/bin/ld: /tmp/nvc++dx_jAYPPLLB.o:/home/pablo/test/toml.hpp:17510: more undefined references to `toml::v3::node::type() const' follow
/usr/bin/ld: /tmp/nvc++dx_jAYPPLLB.o: in function `toml::v3::impl::impl_ex::parser::parse_table_header()':
/home/pablo/test/toml.hpp:15714:(.text+0x38520): undefined reference to `toml::v3::node::is_table() const'
/usr/bin/ld: /home/pablo/test/toml.hpp:15714:(.text+0x3853d): undefined reference to `toml::v3::node::is_array_of_tables() const'

Tomcat-42 commented 8 months ago

I have created a docker image for reproducing my exact issue on my environment:

docker run -it tomcat0x42/toml-ex bash

And then, inside the container:

cd /home/builder/toml_ex/ && ./build 

(sorry for the image size, all things NVIDIA™ are always bloated.)

marzer commented 8 months ago

Great! Very helpful, thanks. I'll try to make use of it soon.

Nvidia's toolchain has caused me nothing but misery, honestly.

Tomcat-42 commented 8 months ago

> Nvidia's toolchain has caused me nothing but misery, honestly.

Yes, last week I had the fiercest battle of my entire career with a compiler (nvc++), and it came down to just a -cuda flag on the linker.

marzer commented 8 months ago

Haha, of course. What's some compiler bullshit without a missing (and woefully under-documented) flag?

Tomcat-42 commented 8 months ago

Hello @marzer, did you get a chance to look at this issue?

marzer commented 8 months ago

@Tomcat-42 I've pulled your image and reproduced it for myself, but that's about all I've had time for. Have been travelling for work the last couple of weeks. Been hoping to make time for it some evening this week though :)

marzer commented 8 months ago

Oh, I thought I was reproducing it. Playing with it now, what I was actually seeing was that the way you've used TOML_HEADER_ONLY and TOML_IMPLEMENTATION in the image isn't correct: you've put TOML_IMPLEMENTATION after the point at which toml++ is transitively included, so from its perspective the implementation is always disabled (because TOML_HEADER_ONLY has been set to 0 beforehand).

Fixing that doesn't solve the issue, though; once it's addressed I see the same weird errors you reported above 😅 Just thought you may want to double-check that against any "concrete" implementations you have on the go.
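
The include-order pitfall described in this comment can be sketched in a single file. This is a simplified model, not toml++'s actual preprocessor logic: the `MINI_*` macros, `mini_impl_compiled`, and `implementation_was_compiled()` are hypothetical stand-ins for `TOML_HEADER_ONLY`, `TOML_IMPLEMENTATION`, and the library's function bodies, and the guarded region plays the role of the header being pulled in transitively before the implementation macro is defined.

```cpp
// Hypothetical stand-in for TOML_HEADER_ONLY being set to 0 up front.
#define MINI_HEADER_ONLY 0

// ---- contents of a hypothetical "mini.hpp", reached via another header ----
#ifndef MINI_HPP
#define MINI_HPP
#if MINI_HEADER_ONLY || defined(MINI_IMPLEMENTATION)
// Function bodies would be compiled into this translation unit here.
constexpr bool mini_impl_compiled = true;
#else
// Declarations only: bodies are expected to come from some other TU.
constexpr bool mini_impl_compiled = false;
#endif
#endif // MINI_HPP
// ---- end of "mini.hpp" ----

// Defined too late: the header above has already been processed, and its
// include guard would turn any later #include "mini.hpp" into a no-op.
#define MINI_IMPLEMENTATION

// Reports whether this translation unit ended up with the implementation.
constexpr bool implementation_was_compiled() { return mini_impl_compiled; }

// Compile-time check: the late define did not enable the implementation.
static_assert(!implementation_was_compiled(),
              "the late MINI_IMPLEMENTATION define has no effect");
```

Moving the `#define MINI_IMPLEMENTATION` line above the guarded region flips the result. With the real library, that corresponds to defining the implementation macro in exactly one translation unit before anything includes toml++, directly or indirectly.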

marzer commented 8 months ago

Curiously, every function exhibiting this issue is pure virtual in the base. Looks like some vtable shenanigans.
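
An undefined reference to a pure virtual member would be consistent with the compiler emitting a direct call to the base-class function instead of dispatching through the vtable, though nothing in this thread confirms that. A reduced test case along the following lines is one way to probe it. All names here (`base_node`, `table_node`, `integer_node`, `kind_of`, `is_table_via_base`, `exercise_dispatch`) are hypothetical and do not mirror toml++'s real class layout; whether nvc++ mishandles this particular reduction is untested.

```cpp
#include <cassert>

// Hypothetical stand-ins; this is not toml++'s real class layout.
enum class node_kind { table, integer };

struct base_node {
    virtual ~base_node() = default;
    // Pure virtual: no definition exists in the base, so a correct build
    // never needs a linkable symbol for base_node::kind() or is_table().
    virtual node_kind kind() const noexcept = 0;
    virtual bool is_table() const noexcept = 0;
};

struct table_node final : base_node {
    node_kind kind() const noexcept override { return node_kind::table; }
    bool is_table() const noexcept override { return true; }
};

struct integer_node final : base_node {
    node_kind kind() const noexcept override { return node_kind::integer; }
    bool is_table() const noexcept override { return false; }
};

// Calls made through a base reference are resolved by virtual dispatch.
node_kind kind_of(const base_node& n) noexcept { return n.kind(); }
bool is_table_via_base(const base_node& n) noexcept { return n.is_table(); }

// Exercises both overrides through the base interface.
void exercise_dispatch() {
    table_node t;
    integer_node i;
    assert(kind_of(t) == node_kind::table && is_table_via_base(t));
    assert(kind_of(i) == node_kind::integer && !is_table_via_base(i));
}
```

If a compiler links this cleanly but fails on the full library, the next reduction step would be to bring in whatever the library adds on top of this pattern (inline namespaces, attributes, separate implementation files) one piece at a time.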

marzer commented 7 months ago

@Tomcat-42 Right, after much hoop-jumping trying to get a working version of nvhpc installed on my linux laptop, I had some time to experiment with this a bit more comprehensively. Can you try the workaround I've implemented in 1f7884e59165e517462f922e7b6de131bd9844f3?

Tomcat-42 commented 7 months ago

The linker errors are gone, thanks @marzer.

But sadly I'm running into another issue:

# doesn't work in nvc++
~/toml_ex: nvc++ main.cpp -o main -I./tomlplusplus/include
~/toml_ex: ./main
Segmentation fault: oops, process './main' core dumped
Error: nu::shell::external_command

  × External command failed
   ╭─[entry #1:1:1]
 1 │ ./main
   · ───┬──
   ·    ╰── core dumped
   ╰────
  help: Segmentation fault: child process './main' core dumped

# Works in clang
~/toml_ex: clang++ main.cpp -o main -I./tomlplusplus/include
~/toml_ex: ./main
[dependencies]
cpp = 17

[library]
authors = [ 'Mark Gillard <mark.gillard@outlook.com.au>' ]
name = 'toml++'

Every use of the formatter function causes a segfault:

(gdb) start
Temporary breakpoint 1 at 0x40764f: file main.cpp, line 5.
Starting program: /home/pablo/toml_ex/main
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".

Temporary breakpoint 1, main () at main.cpp:5
5         toml::table tbl;
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x00000000004290e4 in toml::v3::impl::formatter::print_value (this=0x7fffffffdff0, val_node=0x47a9b0,
    type=toml::v3::_ZN4toml2v39node_type4noneE) at ./tomlplusplus/include/toml++/impl/formatter.inl:483
483                     switch (type)
(gdb)

Do you have any clue why this is happening?

Tomcat-42 commented 7 months ago

The full coredump info (in case it helps):

~/toml_ex: coredumpctl info
           PID: 13931 (main)
           UID: 1000 (pablo)
           GID: 1000 (pablo)
        Signal: 11 (SEGV)
     Timestamp: Tue 2024-03-19 09:53:18 -03 (7min ago)
  Command Line: ./main
    Executable: /home/pablo/toml_ex/main
 Control Group: /user.slice/user-1000.slice/session-1.scope
          Unit: session-1.scope
         Slice: user-1000.slice
       Session: 1
     Owner UID: 1000 (pablo)
       Boot ID: b0633ef9d5934402b7d93c7e86ab335d
    Machine ID: ab53e628f8e84bc78df971ff119e82df
      Hostname: inspiron
       Storage: /var/lib/systemd/coredump/core.main.1000.b0633ef9d5934402b7d93c7e86ab335d.13931.1710852798000000.zst (present)
  Size on Disk: 113.1K
       Message: Process 13931 (main) of user 1000 dumped core.

                Module /home/pablo/toml_ex/main without build-id.
                Module /home/pablo/toml_ex/main
                Module libnvcpumath.so without build-id.
                Module libnvomp.so without build-id.
                Stack trace of thread 13931:
                #0  0x0000000000426154 n/a (/home/pablo/toml_ex/main + 0x26154)
                ELF object binary architecture: AMD x86-64
~/toml_ex: coredumpctl debug match
No match found.
~/toml_ex: coredumpctl debug
           PID: 13931 (main)
           UID: 1000 (pablo)
           GID: 1000 (pablo)
        Signal: 11 (SEGV)
     Timestamp: Tue 2024-03-19 09:53:18 -03 (8min ago)
  Command Line: ./main
    Executable: /home/pablo/toml_ex/main
 Control Group: /user.slice/user-1000.slice/session-1.scope
          Unit: session-1.scope
         Slice: user-1000.slice
       Session: 1
     Owner UID: 1000 (pablo)
       Boot ID: b0633ef9d5934402b7d93c7e86ab335d
    Machine ID: ab53e628f8e84bc78df971ff119e82df
      Hostname: inspiron
       Storage: /var/lib/systemd/coredump/core.main.1000.b0633ef9d5934402b7d93c7e86ab335d.13931.1710852798000000.zst (present)
  Size on Disk: 113.1K
       Message: Process 13931 (main) of user 1000 dumped core.

                Module /home/pablo/toml_ex/main without build-id.
                Module /home/pablo/toml_ex/main
                Module libnvcpumath.so without build-id.
                Module libnvomp.so without build-id.
                Stack trace of thread 13931:
                #0  0x0000000000426154 n/a (/home/pablo/toml_ex/main + 0x26154)
                ELF object binary architecture: AMD x86-64

GNU gdb (GDB) 14.2
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/pablo/toml_ex/main...
[New LWP 13931]
Core was generated by `./main'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000426154 in toml::v3::impl::array_iterator::operator* (
    this=<error reading variable: Cannot access memory at address 0xfffffffffffffff9>) at ./tomlplusplus/include/toml++/impl/array.hpp:111
111                             return *iter_->get();
(gdb) thread apply all backtrace full

Thread 1 (LWP 13931):
#0  0x0000000000426154 in toml::v3::impl::array_iterator::operator* (this=<error reading variable: Cannot access memory at address 0xfffffffffffffff9>) at ./tomlplusplus/include/toml++/impl/array.hpp:111
No locals.
Backtrace stopped: Cannot access memory at address 0x9
(gdb)

marzer commented 7 months ago

> Do you have any clue why this is happening?

I have absolutely no idea. None of this makes any sense to me 😅

Thanks for the follow-up. I'll try to give it a look today.

marzer commented 6 months ago

OK, coming back to this. I spent quite a bit of time trying to get nvc++ to behave, and nothing I did got it any further. I have no idea why this is happening. It feels like there's a compiler bug somewhere, because clang, msvc and gcc all consume the library OK. I'm not really sure how to advance this, short of escalating it with the NVIDIA folks.