trungnt2910 / gdb-haiku

GNU General Public License v2.0

Main executable not relocated #4

Closed dalmemail closed 1 month ago

dalmemail commented 1 month ago

I built gdb from the gdb-15-haiku branch of this repository, since Debugger was giving me problems when debugging QEMU.

Log:

~> /boot/home/gdb-haiku/gdb/gdb /boot/system/bin/qemu-system-x86_64
GNU gdb (GDB) 15.1.90.20240710-git
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-unknown-haiku".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/system/bin/qemu-system-x86_64...
(gdb) set args -m 512 -bios bios.bin-1.13.0 -display haiku -accel nvmm
(gdb) break nvmm_ram_block_added
Breakpoint 1 at 0x779c57: file ../target/i386/nvmm/nvmm-all.c, line 1138.
(gdb) run
Starting program: /boot/system/bin/qemu-system-x86_64 -m 512 -bios bios.bin-1.13.0 -display haiku -accel nvmm
Note: automatically using hardware breakpoints for read-only addresses.
Warning:
Cannot insert hardware breakpoint -1.
Could not insert hardware breakpoints:
You may have requested too many hardware breakpoints/watchpoints.

(gdb)
trungnt2910 commented 1 month ago

What does info target show?

Also, before run, can you do set debug haiku-nat on and send the output? Relocation of the main executable should have been fixed on the gdb-15-haiku branch.

dalmemail commented 1 month ago

Here you go: https://dalme.net/haiku/gdb_logs.tar.gz I've executed info target after run failed.

trungnt2910 commented 1 month ago
Native process:
[haiku-nat] pid_to_str: ptid=840.0.0
        Using the running image of child team 840 (qemu-system-x86_64).
        While running this, GDB does not access memory from...
Local exec file:
        `/boot/system/bin/qemu-system-x86_64', file type elf64-x86-64.
        Entry point: 0x44bb40
        0x0000000000000190 - 0x0000000000026428 is .hash
        0x0000000000026428 - 0x00000000000ab100 is .dynsym
        0x00000000000ab100 - 0x000000000014849a is .dynstr
        0x000000000014849a - 0x00000000001535ac is .gnu.version
        0x00000000001535b0 - 0x0000000000153700 is .gnu.version_r
        0x0000000000153700 - 0x0000000000440b60 is .rela.dyn
        0x0000000000440b60 - 0x0000000000447148 is .rela.plt
        0x0000000000447148 - 0x0000000000447165 is .init
        0x0000000000447170 - 0x000000000044b570 is .plt
        0x000000000044b570 - 0x000000000044b5d8 is .plt.got
        0x000000000044b5e0 - 0x0000000000b544a1 is .text
        0x0000000000b544a1 - 0x0000000000b544b9 is .fini
        0x0000000000b544c0 - 0x0000000000d499e1 is .rodata
        0x0000000000d499e4 - 0x0000000000d9a4e8 is .eh_frame_hdr
        0x0000000000d9a4e8 - 0x0000000000ee5818 is .eh_frame
        0x0000000000ee5818 - 0x0000000000ee5f2b is .gcc_except_table
        0x0000000000ee66e8 - 0x0000000000ee6848 is .tbss
--Type <RET> for more, q to quit, c to continue without paging--
        0x0000000000ee66e8 - 0x0000000000ee7700 is .init_array
        0x0000000000ee7700 - 0x0000000000ee7710 is .ctors
        0x0000000000ee7710 - 0x0000000000ee7720 is .dtors
        0x0000000000ee7720 - 0x000000000186c770 is .data.rel.ro
        0x000000000186c770 - 0x000000000186cb00 is .dynamic
        0x000000000186cb00 - 0x000000000186eff8 is .got
        0x000000000186f000 - 0x00000000019bba94 is .data
        0x00000000019bbaa0 - 0x00000000019dd020 is .bss
        0x00007fefa728e000 - 0x00007fefa7296000 is .text in commpage

Same issue again. The main executable is not relocated.
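
For readers following along, a minimal sketch of what "relocated" means here. The section addresses above start near zero, so they are link-time offsets; the address GDB should actually patch is the offset plus the base the runtime loader chose. The load base below is a made-up value for illustration, not one taken from the logs.

```cpp
#include <cstdint>

// Link-time address GDB printed for the breakpoint ("Breakpoint 1 at 0x779c57").
constexpr std::uint64_t kBreakpointLinkAddr = 0x779c57;

// Hypothetical load base: the real one is picked by the runtime loader
// and does not appear in this thread.
constexpr std::uint64_t kHypotheticalLoadBase = 0x10000000000;

// Runtime address of a symbol in a position-independent image.
constexpr std::uint64_t runtime_address (std::uint64_t link_addr,
                                         std::uint64_t load_base)
{
  return load_base + link_addr;
}

static_assert (runtime_address (kBreakpointLinkAddr, kHypotheticalLoadBase)
               == 0x10000779c57, "breakpoint must move with the image");
```

If GDB inserts the breakpoint at the unrelocated 0x779c57 instead, it targets memory the process has not mapped as code, which is consistent with the insertion failure in the first log.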

trungnt2910 commented 1 month ago

During startup, breakpoints should have been invalidated and disabled:

https://github.com/trungnt2910/gdb-haiku/blob/ec060364e76a41fef12e5d41324d2a38bbc980b1/gdb/haiku-nat.c#L102

After the target starts up or execs into a new image, the breakpoints should be re-enabled.

https://github.com/trungnt2910/gdb-haiku/blob/ec060364e76a41fef12e5d41324d2a38bbc980b1/gdb/haiku-nat.c#L682-L694

There must have been some (more) issues with the placement of these events. Would you mind sending me some more logs, this time with two new logging categories enabled:

set debug haiku-nat on
set debug infrun on
set debug breakpoint on
dalmemail commented 1 month ago

There you go: https://dalme.net/haiku/gdb_log_01_08_2024

trungnt2910 commented 1 month ago

Your executable is a bit special since it implements the GDB JIT Interface.

When GDB detects a JIT-aware executable, the JIT subsystem will trigger breakpoints during early startup, which is undesired.

Therefore, relocation of the main executable has to be pushed earlier. Furthermore, before relocation is attempted, target memory regions should be refreshed, since objfile_relocate also calls breakpoint_re_set and writes breakpoints.
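
For context, this is roughly what "implements the GDB JIT Interface" looks like on the program side. The declarations follow the ones documented in the GDB manual; the jit_register helper is a hypothetical name added for illustration and is not taken from QEMU.

```cpp
#include <cassert>
#include <cstdint>

typedef enum { JIT_NOACTION = 0, JIT_REGISTER_FN, JIT_UNREGISTER_FN } jit_actions_t;

struct jit_code_entry
{
  jit_code_entry *next_entry;
  jit_code_entry *prev_entry;
  const char *symfile_addr;
  std::uint64_t symfile_size;
};

struct jit_descriptor
{
  std::uint32_t version;
  std::uint32_t action_flag;   // holds a jit_actions_t value
  jit_code_entry *relevant_entry;
  jit_code_entry *first_entry;
};

extern "C" {
/* GDB puts a breakpoint in this function, which is why a JIT-aware
   executable causes breakpoint writes early during startup.  */
void __attribute__ ((noinline)) __jit_debug_register_code () { }

/* The version is set statically so the debugger can read it at any time.  */
jit_descriptor __jit_debug_descriptor = { 1, 0, nullptr, nullptr };
}

// Hypothetical helper: link a new entry at the head of the list and notify GDB.
inline void jit_register (jit_code_entry *entry)
{
  assert (entry != nullptr);
  entry->prev_entry = nullptr;
  entry->next_entry = __jit_debug_descriptor.first_entry;
  if (entry->next_entry != nullptr)
    entry->next_entry->prev_entry = entry;
  __jit_debug_descriptor.first_entry = entry;
  __jit_debug_descriptor.relevant_entry = entry;
  __jit_debug_descriptor.action_flag = JIT_REGISTER_FN;
  __jit_debug_register_code ();
}
```

GDB looks up these two symbols in the executable, so their addresses also have to be relocated before the JIT breakpoint can be placed correctly.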

With the latest code, GDB should be able to get further with your example, but it then hits an obscure bug later on:

terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string::_S_construct null not valid

This behavior is not observed with my test executable and cat, so I'm conducting further investigation on qemu.

trungnt2910 commented 1 month ago

This exception was raised in source_cache::ensure:

bool
source_cache::ensure (struct symtab *s)
{
  std::string fullname = symtab_to_fullname (s);

symtab_to_fullname unexpectedly returned nullptr. This function only fills in a fallback name for the source file when open_source_file fails to open it, and it detects that failure by checking for a negative FD.
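
The "basic_string::_S_construct null not valid" message is what libstdc++ reports when a std::string is constructed from a null pointer. A minimal sketch of that failure mode and a defensive guard follows; maybe_name and name_or are hypothetical names for illustration, and this is not the fix applied in GDB.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a C-style API that may return nullptr,
// as symtab_to_fullname did in the failing run.
inline const char *maybe_name (bool available)
{
  return available ? "nvmm-all.c" : nullptr;
}

// Builds a std::string only from a valid pointer, otherwise uses the fallback.
// Writing `std::string s = maybe_name (false);` directly is what triggers
// the logic_error with libstdc++.
inline std::string name_or (const char *name, const char *fallback)
{
  assert (fallback != nullptr);
  return std::string (name != nullptr ? name : fallback);
}
```

A guard like this would only hide the symptom; the underlying problem is that the failed open was not recognised as a failure.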

Other functions in GDB expect errno values to be positive. They therefore do things like

  return scoped_fd (-ENOSYS);

to represent an invalid file descriptor.

Haiku, however, has negative errno values, so negating one yields a positive "file descriptor" such as 2147459069, which is really the negation of this error:

~/Desktop> error -2147459069
0x80006003: No such file or directory
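
The arithmetic can be checked with a small sketch. The Haiku constant is the value from the `error` output above; the positive constant is the conventional ENOENT value on platforms with positive errno and is included only for comparison. error_as_fd and looks_like_failure are hypothetical names that mimic the scoped_fd (-ERR) idiom and the `fd < 0` check.

```cpp
#include <cstdint>

// 0x80006003 read as a signed 32-bit value (Haiku's "No such file or directory").
constexpr std::int32_t kHaikuEnoent = -2147459069;

// Conventional positive ENOENT, for comparison only.
constexpr std::int32_t kPositiveEnoent = 2;

// Mimics `scoped_fd (-ERR)`: the "fd" stored is the negated error code.
constexpr std::int32_t error_as_fd (std::int32_t err) { return -err; }

// The check callers apply: a negative fd means the open failed.
constexpr bool looks_like_failure (std::int32_t fd) { return fd < 0; }

static_assert (looks_like_failure (error_as_fd (kPositiveEnoent)),
               "positive errno: failure is detected");
static_assert (!looks_like_failure (error_as_fd (kHaikuEnoent)),
               "negative errno: failure looks like a valid fd");
static_assert (error_as_fd (kHaikuEnoent) == 2147459069,
               "matches the bogus fd seen in GDB");
```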

We should somehow modify the build scripts to use the Haiku "hacks" for errno compliance with POSIX: defining the B_USE_POSITIVE_POSIX_ERRORS flag and linking to the posix_error_mapper.

waddlesplash commented 1 month ago

I'm not sure how well the posix_error_mapper will work in a situation like this, and it may cause other problems if GDB tries to detect POSIX error codes on Haiku?

If there are too many places where errno is used with a - in front then we may have to do that, I guess...

trungnt2910 commented 1 month ago

> If there are too many places where errno is used with a - in front then we may have to do that, I guess...

There are quite a few places, all related to scoped_fd.

> and it may cause other problems if GDB tries to detect POSIX error codes on Haiku?

I don't know if GDB actually works with errnos. But we don't have to wait for that for GDB with posix_error_mapper to fail:

~/Desktop/work/gdb-haiku-build> gdb/gdb
sigaction: Invalid Argument.

On strace:

[  9741] read_stat(0xffffffff, "/boot/system/lib", false, 0x7f442bcdb8e0, 0x80) = 0x0 No error (175 us)
[  9741] set_signal_mask(0x0, (nil), [0x0]) = 0x0 No error (159 us)
[  9741] sigaction(0x1, (nil), 0x15c2c276e00) = 0x0 No error (75 us)
[  9741] sigaction(0x2, (nil), 0x15c2c276e20) = 0x0 No error (126 us)
[  9741] sigaction(0x3, (nil), 0x15c2c276e40) = 0x0 No error (83 us)
[  9741] sigaction(0x4, (nil), 0x15c2c276e60) = 0x0 No error (152 us)
[  9741] sigaction(0x5, (nil), 0x15c2c276e80) = 0x0 No error (75 us)
[  9741] sigaction(0x6, (nil), 0x15c2c276ea0) = 0x0 No error (109 us)
[  9741] sigaction(0x7, (nil), 0x15c2c276ec0) = 0x0 No error (75 us)
[  9741] sigaction(0x8, (nil), 0x15c2c276ee0) = 0x0 No error (106 us)
[  9741] sigaction(0x9, (nil), 0x15c2c276f00) = 0x80000005 Invalid Argument (84 us)
[  9741] resize_area(0x5885d, 0x2f0000) = 0x0 No error (131 us)
[  9741] resize_area(0x5885d, 0x370000) = 0x0 No error (80 us)
trungnt2910 commented 1 month ago

This should be solved by adding -DB_USE_POSITIVE_POSIX_ERRORS to CFLAGS during the configure step. Simply adding flags to gdb's configure.nat would not work, since the broken code lies in gdbsupport:

  for (i = 1; i < NSIG; i++)
    {
      struct sigaction *oldact = &original_signal_actions[i];

      res = sigaction (i, NULL, oldact);
      if (res == -1 && errno == EINVAL)
        {
          /* Some signal numbers in the range are invalid.  */
          continue;
        }
      else if (res == -1)
        perror_with_name (("sigaction"));
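
A rough sketch of why this check can misfire when only part of the tree sees the flag. It assumes that a translation unit built without B_USE_POSITIVE_POSIX_ERRORS still sees the native negative EINVAL while the mapper reports the positive one; that mechanism is my reading and is not confirmed in this thread. The constants are 0x80000005 (from the strace output) and its negation; is_invalid_signal is a hypothetical name.

```cpp
// Assumed values: Haiku's native "Invalid Argument" code (0x80000005 as a
// signed 32-bit int) and its negation, which EINVAL is assumed to become
// once B_USE_POSITIVE_POSIX_ERRORS is defined.
constexpr int kNativeEinval = -2147483643;
constexpr int kPositiveEinval = 2147483643;

// Mirrors the `res == -1 && errno == EINVAL` test from the loop above.
constexpr bool is_invalid_signal (int res, int observed_errno, int einval_macro)
{
  return res == -1 && observed_errno == einval_macro;
}

// Consistent build: the invalid signal number is skipped.
static_assert (is_invalid_signal (-1, kPositiveEinval, kPositiveEinval),
               "matching conventions");
// Mixed build: the comparison fails, so perror_with_name would be reached.
static_assert (!is_invalid_signal (-1, kPositiveEinval, kNativeEinval),
               "mismatched conventions");
```

Under that assumption, passing the define through CFLAGS at configure time keeps gdb and gdbsupport on the same convention.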
trungnt2910 commented 1 month ago

Setting a breakpoint at qemu_init should now work.

trungnt2910 commented 1 month ago

Closing since the bug can no longer be reproduced and no further reports are made.