bitnami / charts

Bitnami Helm Charts
https://bitnami.com
Other
9.03k stars 9.22k forks source link

[bitnami/redis] Profiling support with redis image (frame pointers, debug symbols) #8071

Closed igorwwwwwwwwwwwwwwwwwwww closed 2 years ago

igorwwwwwwwwwwwwwwwwwwww commented 3 years ago

Which chart: bitnami/redis 15.5.2

The image is bitnami/redis:6.2.6-debian-10-r14.

FWIW, I wasn't sure whether to open this issue here or on bitnami-docker-redis. I'd be happy to move it over there if that's the correct place.

Is your feature request related to a problem? Please describe. The redis image is compiled without frame pointers, which means perf profiling is not able to walk the stack, producing broken userspace stacks in profiles.

Example:

sudo perf record -ag -F499 -- sleep 30
sudo perf script --header > perf.script.out
cat perf.script.out | stackcollapse-perf.pl --kernel | grep redis | flamegraph.pl --hash --colors=perl > perf.script.svg

Produces a flamegraph that looks like this:

Screenshot 2021-11-09 at 13 41 34

Source: perf.script.gke-redis-chart-sandbox-redis-c2-5d662c51-7305.20211109_121927.only_redis.svg.gz.

As you can see, we have no usable userspace stacks. These profiles have been invaluable in optimizing our production workloads, as well as pinpointing and contributing performance fixes to upstream redis.

Examples include:

Describe the solution you'd like

For our current (non docker) redis, we are compiling redis ourselves with custom flags.

The main thing we added to the build process is:

env['CFLAGS'] << ' -fno-omit-frame-pointer'

A similar thing should (hopefully) be relatively easy to add to your build pipeline. And is something that could be worth considering for other images as well (we add this to our builds of ruby and git, for example).

To quote @brendangregg (source):

Always compile with frame pointers. Omitting frame pointers is an evil compiler optimization that breaks debuggers, and sadly, is often the default. Without them, you may see incomplete stacks from perf_events (...).

Describe alternatives you've considered Stack walking via DWARF. It doesn't need frame pointers, but it's more expensive, and the profiles are much bigger (like 10x). And also many tools (e.g. bpftrace) do not support it.

Additional context Our downstream issue is: https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/1350.

javsalgar commented 3 years ago

Thank you so much for the information! I think this is something worth adding to the compilation recipe if it helps optimizing the whole solution. I will forward this to the engineering team. As soon as we have more news, I will update this ticket.

igorwwwwwwwwwwwwwwwwwwww commented 2 years ago

@javsalgar Is there any update here?

This is a big blocker for our (@gitlab) adoption of the bitnami redis chart. We would very much like to avoid having to build our own images if possible. I'd be happy to answer any questions about this change if needed.

carrodher commented 2 years ago

Hi, I just checked the internal task and there is no news, I just sent it to the proper team and they will review the priority. I hope to be able to provide an answer during the following week(s)

javsalgar commented 2 years ago

Hi,

Just a note to let you know that our latest redis container was compiled with the -fno-omit-frame-pointers flag. Please let us know if you find any other issue.

igorwwwwwwwwwwwwwwwwwwww commented 2 years ago

@javsalgar Hi there, thanks for the update!

I tried to get another profile, and we have partial symbols which is already quite good.

Screenshot 2022-01-06 at 17 03 13

However, some symbols were missing, which I believe happens because the binary is stripped:

$ file /proc/1127296/root/opt/bitnami/redis/bin/redis-server

/proc/1127296/root/opt/bitnami/redis/bin/redis-server: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=526e6d79d9abbde85727f7e0474462fe1d2593d1, stripped

I understand that this may be a slightly harder sell, but would it be possible to get an unstripped binary? In our experience this does increase the binary size quite a bit, but the diagnostic information we get out of it makes it well worth it.

Symbols also help diagnosing crashes, should they occur.

Alternatively, if we can get the unstripped binary or debug symbols available separately, then we can import them into perf ourselves via perf buildid-cache --add. Since we only need them at the post-processing stage.

javsalgar commented 2 years ago

Hi,

That's something we perform by default in all our compilations. I will forward this request to the engineering team for evaluation, but in principle we wouldn't want to perform substantial increases in our container size.

matt-smiley commented 2 years ago

in principle we wouldn't want to perform substantial increases in our container size.

@javsalgar agreed, and that is certainly achievable.

There are 2 useful options to consider. The first option covers the most common profiling use-cases and is extremely cheap (0.2% larger image). The 2nd option covers additional use-cases and costs just 3% larger image. I'd be delighted with the 2nd option, but if that's too expensive, the 1st option would also be fantastically useful.

Also, for reference, these numbers are specific to redis, but the approach and trade-offs are generic, so I'd ask that your build team consider doing a general policy shift to use --strip-debug instead of --strip-all as a standard practice. This helps enormously with profiling and costs almost nothing. Very much like the initial request to enable frame-pointers.

Thank you for helping to move this request along!

Use strip-debug:

$ objcopy --strip-debug redis-server

or

$ strip --strip-debug redis-server

or

$ gcc -g0 ...

This option adds very little size to the binary and satisfies the most common use cases (i.e. profiling and dynamic instrumentation of function entry/exit). This build option strips DWARF debug sections from the binary, but it preserves the static symbol names, which profilers like perf and BPF can use to resolve function names.

This option only requires adding 202 KB to your image size (before compression). That is 0.2% of your current image size (91 MB). I think this is easily worth the benefit.

Use compress-debug-sections:

$ objcopy --compress-debug-sections redis-server

or

$gcc -gz ...

This option preserves all of the DWARF debug sections but compresses them in the binary using zlib.

In addition to the above use-cases, this option supports whatever level of debug info you specified in gcc -g[level], potentially down to the line level of precision. This is the most versatile option for debugging and profiling.

This option adds 2.9 MB to the size of the redis-server binary, growing your docker image by 3% (from 91 MB to 94 MB).

Size comparison:

For reference, here is a comparison of the sizes of the redis-server binary:

$ objcopy redis-server redis-server.00_full_debug
$ objcopy --compress-debug-sections redis-server redis-server.01_compress_debug
$ objcopy --strip-debug redis-server redis-server.02_strip_debug
$ objcopy --strip-all redis-server redis-server.03_strip_all

$ ls -lh redis-server*
-rwxr-xr-x 1 msmiley msmiley  12M Sep 29 09:52 redis-server
-rwxrwxr-x 1 msmiley msmiley  12M Jan 10 13:06 redis-server.00_full_debug
-rwxrwxr-x 1 msmiley msmiley 4.8M Jan 10 13:06 redis-server.01_compress_debug
-rwxrwxr-x 1 msmiley msmiley 2.1M Jan 10 13:06 redis-server.02_strip_debug
-rwxrwxr-x 1 msmiley msmiley 1.9M Jan 10 13:07 redis-server.03_strip_all

$ file redis-server*
redis-server:                   ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=a28864b3f4b3eedecb95009775f164308783a913, for GNU/Linux 3.2.0, with debug_info, not stripped
redis-server.00_full_debug:     ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=a28864b3f4b3eedecb95009775f164308783a913, for GNU/Linux 3.2.0, with debug_info, not stripped
redis-server.01_compress_debug: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=a28864b3f4b3eedecb95009775f164308783a913, for GNU/Linux 3.2.0, with debug_info, not stripped
redis-server.02_strip_debug:    ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=a28864b3f4b3eedecb95009775f164308783a913, for GNU/Linux 3.2.0, not stripped
redis-server.03_strip_all:      ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=a28864b3f4b3eedecb95009775f164308783a913, for GNU/Linux 3.2.0, stripped
javsalgar commented 2 years ago

Thank you so much for the detailed report! I will forward this for the engineering team. I cannot guarantee an ETA, but as soon as we have news we will update the ticket.

igorwwwwwwwwwwwwwwwwwwww commented 2 years ago

@javsalgar Gentle nudge :) Is there any update on this front? We'd be happy to provide more information if needed. Thanks!

javsalgar commented 2 years ago

Hi! @miguelaeh is working on this. As soon as he has more information he will update this ticket.

miguelaeh commented 2 years ago

Hi guys, I did a test on this. Not stripping the binaries will increase by around 21 MB in the Redis bin folder. Which will increase our image size by 22%. We will discuss this with the team and let you know if that's assumable or not.

igorwwwwwwwwwwwwwwwwwwww commented 2 years ago

@miguelaeh Thanks for the update! What is the breakdown of sizes per binary? Does that include redis-cli as well?

The main one we're concerned with is redis-server, stripping the others would probably be fine.

The other question is: Did you end up using one of the compression options suggested above by @matt-smiley? The tests we ran with symbol compression suggested much smaller increase than the 21MB you got.

It should be possible to get significantly lower size overhead by compiling with gcc -gz (docs).

matt-smiley commented 2 years ago

@miguelaeh like @igorwwwwwwwwwwwwwwwwwwww said, we'd expect to see a much smaller size increase if you either compress or strip the debug-info sections in the binaries. You can do that either:

If that doesn't give much smaller sizes, could you share the binaries so we can see where the space is spent?

miguelaeh commented 2 years ago

Hi guys, I built it without stripping any binaries. If just leaving out of the strip redis-server is enough the size increase should be smaller. I will come back with more information. In any case, looks like it is assumable in our side since it would be less than 21MB

miguelaeh commented 2 years ago

Hi guys, I did some more tests.

  1. Skipping the strip only for redis-server. This produces an increase of 9.8 MB.
  2. Skipping the strip only for redis-server plus adding -gz. Produced the same size than 1.
  3. Skipping the strip only for redis-server plus adding -g0. Produced just an increase of 0.5 MB.
  4. Skipping the strip only for redis-server plus adding both -gz and -g0. This curiously produced a higher size than point 1. It increased by 14.8 MB.

So taking that into account, looks like point 3 would be the preferred one. WDYT?

igorwwwwwwwwwwwwwwwwwwww commented 2 years ago

Oh, that result for 2. is really surprising! I saw in the gcc docs for -gz:

Produce compressed debug sections in DWARF format, if that is supported. If type is not given, the default type depends on the capabilities of the assembler and linker used. type may be one of ‘none’ (don’t compress debug sections), ‘zlib’ (use zlib compression in ELF gABI format), or ‘zlib-gnu’ (use zlib compression in traditional GNU format). If the linker doesn’t support writing compressed debug sections, the option is rejected. Otherwise, if the assembler does not support them, -gz is silently ignored when producing object files.

So I wonder if that is what may be happening here.

Point 3 would be absolutely acceptable for us, but getting compression working would be even better.

As an alternate way of performing option 2, could you try compiling without -gz and then running objcopy --compress-debug-sections redis-server after the binary is built?

Many thanks for your work on this!

miguelaeh commented 2 years ago

Removing -gz and using objcopy I obtain an increase of 9.9 MB. So, still, the best option is option 2 commented above.

igorwwwwwwwwwwwwwwwwwwww commented 2 years ago

@miguelaeh Thanks for the testing, in this case let's go with option 3 (I believe this is the one you were referring to). This should cover our needs for CPU profiling, and comes at a very low cost.

matt-smiley commented 2 years ago

@miguelaeh would you mind another experiment? I think the compression option is silently failing, but I'd like to verify that.

Here's what I get when I locally build redis on a Linux box. If you don't mind, I'd like you to repeat this and share the readelf output before versus after trying to compress the debug sections in the binary.

# Make a fresh build of redis-server.
# Using default build options produces a binary with both symbols and debug sections.

msmiley@saoirse:~/src/git/public/redis/src [6.2.6|✔] $ make clean
...

msmiley@saoirse:~/src/git/public/redis/src [6.2.6|✔] $ make redis-server
...
    LINK redis-server

msmiley@saoirse:~/src/git/public/redis/src [6.2.6|✔] $ file redis-server 
redis-server: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=968fd0cd6e00bebc597f538c8e2ee11a318d4c71, for GNU/Linux 3.2.0, with debug_info, not stripped

# Show the size of the unstripped and uncompressed binary.

msmiley@saoirse:~/src/git/public/redis/src [6.2.6|✔] $ ls -lh redis-server 
-rwxrwxr-x 1 msmiley msmiley 13M Feb 16 23:25 redis-server

# Show the debug sections.
# We can see they are not compressed because they lack the "C" flag and their section names do not start with ".zdebug".
# For reference, show the symbol table and its associated strings table too, since that is the most important piece to not strip.

msmiley@saoirse:~/src/git/public/redis/src [6.2.6|✔] $ readelf --sections --wide redis-server | grep -e 'Size' -e 'debug' -e 'SYMTAB' -e 'STRTAB'
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 7] .dynstr           STRTAB          000000000001b228 01b228 00f491 00   A  0   0  1
  [30] .debug_aranges    PROGBITS        0000000000000000 2101db 0018f0 00      0   0  1
  [31] .debug_info       PROGBITS        0000000000000000 211acb 34d5ae 00      0   0  1
  [32] .debug_abbrev     PROGBITS        0000000000000000 55f079 02530d 00      0   0  1
  [33] .debug_line       PROGBITS        0000000000000000 584386 145465 00      0   0  1
  [34] .debug_str        PROGBITS        0000000000000000 6c97eb 0577bc 01  MS  0   0  1
  [35] .debug_loc        PROGBITS        0000000000000000 720fa7 3e792b 00      0   0  1
  [36] .debug_ranges     PROGBITS        0000000000000000 b088d2 0d4e30 00      0   0  1
  [37] .debug_macro      PROGBITS        0000000000000000 bdd702 01d3fd 00      0   0  1
  [38] .symtab           SYMTAB          0000000000000000 bfab00 01ed08 18     39 1733  8
  [39] .strtab           STRTAB          0000000000000000 c19808 017e44 00      0   0  1
  [40] .shstrtab         STRTAB          0000000000000000 c3164c 000194 00      0   0  1

# Now rewrite the binary with compressed debug sections.

msmiley@saoirse:~/src/git/public/redis/src [6.2.6|✔] $ objcopy --compress-debug-sections redis-server 

# Show the binary is smaller.

msmiley@saoirse:~/src/git/public/redis/src [6.2.6|✔] $ ls -lh redis-server 
-rwxrwxr-x 1 msmiley msmiley 5.3M Feb 16 23:29 redis-server

# Show the debug sections are smaller and now have flag "C" (compressed using zlib).

msmiley@saoirse:~/src/git/public/redis/src [6.2.6|✔] $ readelf --sections --wide redis-server | grep -e 'Size' -e 'debug' -e 'SYMTAB' -e 'STRTAB'
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 7] .dynstr           STRTAB          000000000001b228 01b228 00f491 00   A  0   0  1
  [30] .debug_aranges    PROGBITS        0000000000000000 2101e0 000690 00   C  0   0  8
  [31] .debug_info       PROGBITS        0000000000000000 210870 1a1748 00   C  0   0  8
  [32] .debug_abbrev     PROGBITS        0000000000000000 3b1fb8 005f24 00   C  0   0  8
  [33] .debug_line       PROGBITS        0000000000000000 3b7ee0 06ba4f 00   C  0   0  8
  [34] .debug_str        PROGBITS        0000000000000000 423930 01cb67 01 MSC  0   0  8
  [35] .debug_loc        PROGBITS        0000000000000000 440498 0adb1b 00   C  0   0  8
  [36] .debug_ranges     PROGBITS        0000000000000000 4edfb8 01b778 00   C  0   0  8
  [37] .debug_macro      PROGBITS        0000000000000000 509730 00ae85 00   C  0   0  8
  [38] .symtab           SYMTAB          0000000000000000 5145b8 01ed08 18     39 1733  8
  [39] .strtab           STRTAB          0000000000000000 5332c0 017e44 00      0   0  1
  [40] .shstrtab         STRTAB          0000000000000000 54b104 000194 00      0   0  1
miguelaeh commented 2 years ago

Hi @matt-smiley , How much would that reduce the size? The mentioned option 2 was already increasing just 0.5MB, so which kind of extra advantage brings building it on that way?

matt-smiley commented 2 years ago

@miguelaeh great questions! Notes for both are below.

How much would that reduce the size?

Compressing the debug sections would reduce the unstripped redis-server binary from 12.2 MB to 5.3 MB.

More details are at the end, in case you want to play with this some more.

The mentioned option 2 was already increasing just 0.5MB, so which kind of extra advantage brings building it on that way?

There are 2 complimentary sets of benefits we can get, summarized below.

The smaller option (preserving symbol tables) provides names for most of the functions, but nothing else. This is incredibly useful for performance profiling, and as you know, it costs very little extra size on disk (and no extra size in memory at runtime). This is the main thing that @igorwwwwwwwwwwwwwwwwwwww and I want, because it preserves the ability to observe where Redis is spending its time while under specific production workloads, which has let us make efficiency improvements to both Redis itself and our client application behavior.

The larger option (preserving debug sections) includes the above benefits and also adds much more precise information about the binary's relationship to the original source code. Some examples of what it supports:

Essentially the symbol table just provides the starting address and name of most functions, whereas the debug sections provide a much more detailed and accurate mapping of machine instructions to source code. With some kinds of analysis, we don't need the higher level of precision, but with other kinds of analysis, it can save a huge amount of time and guess-work. It really depends on what the engineer is trying to do. Just preserving the symbol tables is a great improvement. If we can also provide the debug sections without costing too much extra size (i.e. by compressing them), I think that's worth including too, even though in my own work I usually only need the symbol tables.

That's why I lean towards compressing (rather than stripping) the debug sections. But I would be very happy with either.


Supplemental details on the size comparison.

Since the Bitnami image already saves space by using symlinks for redis-sentinel and friends, there are only 3 Redis binaries in the image. The most important one is redis-server.

For redis-server the size options are:

The attached script clones and builds Redis from source and then uses objcopy, so we can see all 4 of the above options.

build_redis_and_compare_binary_sizes.sh.gz

Results:

Size comparison for redis-server:
-rwxrwxr-x 1 msmiley msmiley 12788184 Feb 17 10:45 redis-server.full
-rwxrwxr-x 1 msmiley msmiley  5553048 Feb 17 10:45 redis-server.compress-debug
-rwxrwxr-x 1 msmiley msmiley  2384592 Feb 17 10:45 redis-server.strip-debug
-rwxrwxr-x 1 msmiley msmiley  2165440 Feb 17 10:45 redis-server.strip-all

Size comparison for redis-cli:
-rwxrwxr-x 1 msmiley msmiley 7183672 Feb 17 10:45 redis-cli.full
-rwxrwxr-x 1 msmiley msmiley 2594464 Feb 17 10:45 redis-cli.compress-debug
-rwxrwxr-x 1 msmiley msmiley  976168 Feb 17 10:45 redis-cli.strip-debug
-rwxrwxr-x 1 msmiley msmiley  900352 Feb 17 10:45 redis-cli.strip-all

Size comparison for redis-benchmark:
-rwxrwxr-x 1 msmiley msmiley 7265776 Feb 17 10:45 redis-benchmark.full
-rwxrwxr-x 1 msmiley msmiley 2892544 Feb 17 10:45 redis-benchmark.compress-debug
-rwxrwxr-x 1 msmiley msmiley 1385872 Feb 17 10:45 redis-benchmark.strip-debug
-rwxrwxr-x 1 msmiley msmiley 1317888 Feb 17 10:45 redis-benchmark.strip-all
matt-smiley commented 2 years ago

Now that I posted it, I'm worried my last comment was maybe too long and detailed...

To sum up:

@miguelaeh any of the options you listed in https://github.com/bitnami/charts/issues/8071#issuecomment-1040237712 would be fine from my perspective. Anything that does not completely strip the symbol tables will support profiling redis, which is our primary goal.

Optionally, if we also preserve the debug sections (whether or not they are compressed), that helps some additional use-cases. (The details in my previous comment https://github.com/bitnami/charts/issues/8071#issuecomment-1043336123 are mostly just sharing background on that piece, since it's a bit more obscure -- but it's not our main goal here.)

Compressing the redis-server binary's debug sections should save 6.9 MB (shrinking it from 12.2 MB to 5.3 MB). From your notes in https://github.com/bitnami/charts/issues/8071#issuecomment-1040237712, I thought you were seeing no effect from compression, which surprised me. I'm curious about that, but it doesn't have to be a blocker. I agree with you and @igorwwwwwwwwwwwwwwwwwwww that either option 2 or 3 would be fantastic -- option 2 (-gz) gives the best observability features and option 3 (-g0) gives the best size, and either is a huge step forward by preserving the symbol tables. Your container image is quite pleasantly small either way, from my perspective.

Also, thank you again @miguelaeh and @javsalgar for working with us on this! Fixing the profiling support is going to help a lot of folks, and I think this is the last piece!

miguelaeh commented 2 years ago

Hi @matt-smiley , Thanks for the detailed info.

I did a re-test of compressing the debug info, just for curiosity as to check if I did something wrong the first time. I did not exactly obtain the size you mentioned but a similar one.

In short, I built it with '-g0 -fno-omit-frame-pointer' and after the build executed objcopy --compress-debug-sections redis-server as per your recommendation.

The following are the results, I think this is exactly what we were looking for right?

 file redis/bin/redis-server
redis/bin/redis-server: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=f1d0c371fc67ef2697f9af625596ddd2eec96992, with debug_info, not stripped
 ls -lh redis/bin/redis-server
-rwxr-xr-x 1 bitnami bitnami 2.6M Feb 18 10:12 redis/bin/redis-server
matt-smiley commented 2 years ago

@miguelaeh thanks a bunch for sharing this output!

That was a surprising result, but I think I've figured it out.

Unfortunately that 2.6 MB build of redis-server is missing most of its debug sections after all. I think you're running into a subtle inconsistency in Redis's Makefiles (specifically the hiredis Makefile). When you set the -g0 option, this tells gcc to not create debug sections. But that flag is being overridden in the hiredis Makefile. Consequently, the redis-server binary only has debug info for that one small part of redis, not for the whole thing.

Let me know if more details would be helpful. I'm trying to keep this short, but happy to share more if it helps.

Now that we know about this Makefile defect, I think we should probably do this:

miguelaeh commented 2 years ago

Hi @matt-smiley ,

Unfortunately that 2.6 MB build of redis-server is missing most of its debug sections after all

Could you clarify how do you know that with the information I shared? Just curiosity.

Apart from that, with the suggested options the redis-server size is 12MB:

 ls -lh redis/bin/redis-server
-rwxr-xr-x 1 bitnami bitnami 12M Feb 21 10:56 redis/bin/redis-server

file redis/bin/redis-server
redis/bin/redis-server: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=09498ef6bf654ebead65c141da937f44a1a910f0, with debug_info, not stripped
matt-smiley commented 2 years ago

Apart from that, with the suggested options the redis-server size is 12MB

@miguelaeh wow, that's surprising. Would you mind uploading that 12 MB redis-server file, so I can look at the binary itself?

I'm not sure why the compression is not working, and I might be able to find a clue if I can look at the unstripped binary. (It would also help to see more details about the build script, but since that's not open source, I'm not asking to see that.) It might be an aspect of the build environment like the silent suppression of -gz that @igorwwwwwwwwwwwwwwwwwwww described in https://github.com/bitnami/charts/issues/8071#issuecomment-1040279529, but I figured it's worth taking a peek at the unstripped binary since the debug info it contains may give some insight.

Again, not a blocker, I'm just curious.

Could you clarify how do you know that with the information I shared? Just curiosity.

Absolutely! Here's the short version, let me know if you'd like more details.

After a little trial and error, I found that I can build a redis-server binary that matches your description:

root@52f6d8c2f546:/build/redis/src# make CFLAGS='-g0' all
...

root@52f6d8c2f546:/build/redis/src# ls -lh redis-server 
-rwxr-xr-x 1 root root 2.6M Feb 19 08:30 redis-server

root@52f6d8c2f546:/build/redis/src# readelf --sections --wide redis-server | grep debug
  [30] .debug_aranges    PROGBITS        0000000000000000 2101db 000170 00      0   0  1
  [31] .debug_info       PROGBITS        0000000000000000 21034b 014a24 00      0   0  1
  [32] .debug_abbrev     PROGBITS        0000000000000000 224d6f 0020e8 00      0   0  1
  [33] .debug_line       PROGBITS        0000000000000000 226e57 00aa72 00      0   0  1
  [34] .debug_str        PROGBITS        0000000000000000 2318c9 001e5d 01  MS  0   0  1
  [35] .debug_loc        PROGBITS        0000000000000000 233726 0204c8 00      0   0  1
  [36] .debug_ranges     PROGBITS        0000000000000000 253bee 005850 00      0   0  1

That seemed distinctive enough to probably match what you were getting from your build environment, so I did some analysis on that binary.

That 2.6 MB binary's debug sections only covered 6 source code files (rather than roughly 115), and all of them were under the hiredis component. There was no debug info for the rest of the redis source code.

Here's a short demo:

List the compilation units that have entries in the ".debug_line" section:

root@52f6d8c2f546:/build/redis/src# objdump --dwarf=decodedline redis-server | grep '^CU:' 
CU: ./dict.c:
CU: ./alloc.c:
CU: ./net.c:
CU: ./hiredis.c:
CU: ./sds.c:
CU: ./read.c:

Show the full path for those source files in the build environment. This tells us they were all under the "hiredis" dependency, which has its own Makefile.

root@52f6d8c2f546:/build/redis/src# objdump --disassemble -M intel --line-numbers redis-server | grep -o -e '.*/dict\.c:' -e '.*/alloc\.c:' -e '.*/net\.c:' -e '.*/hiredis\.c:' -e '.*/sds\.c:' -e '.*/read\.c:' | sort -u
/build/redis/deps/hiredis/alloc.c:
/build/redis/deps/hiredis/dict.c:
/build/redis/deps/hiredis/hiredis.c:
/build/redis/deps/hiredis/net.c:
/build/redis/deps/hiredis/read.c:
/build/redis/deps/hiredis/sds.c:

Looking at the Makefile for hiredis (deps/hiredis/Makefile), I found that it may be composing its gcc arguments in a different order from the other redis makefiles, and that turned out to be true:

DEBUG_FLAGS?= -g -ggdb
REAL_CFLAGS=$(OPTIMIZATION) -fPIC $(CPPFLAGS) $(CFLAGS) $(WARNINGS) $(DEBUG_FLAGS)

When I ran make, I was setting the CFLAGS variable to -g0, but the hiredis Makefile sets its DEBUG_FLAGS variable to the conflicting -g (which is equivalent to -g2). When conflicting settings are passed to gcc, the last one wins. So because DEBUG_FLAGS is later in the args list than CFLAGS, the -g argument wins -- which is why the hiredis object file has debug sections, and the other object files linked into redis-server do not.

So that gives us a compelling explanation for why the redis-server binary was smaller than expected even though it included some uncompressed debug sections -- those sections only covered a tiny portion of the redis source code.

miguelaeh commented 2 years ago

Sorry guys, I missed the notification of the response.

Thank you @matt-smiley for the details.

Since there are weird behaviors as you commented with the hiredis. To avoid extending this too much in time, I think we could proceed with option 3:

Skipping the strip only for redis-server plus adding -g0. Produced just an increase of 0.5 MB.

It will just add 0.5MB to the image and from what we talked about, looks to be enough for the case.

matt-smiley commented 2 years ago

@miguelaeh that sounds great, let's do that, thank you!

miguelaeh commented 2 years ago

Hi! I launched the internal jobs to release the new version, sorry for the time it took, I forgot to remove the objcopy the first time and I needed to re-run them. The new image should appear in some hours

miguelaeh commented 2 years ago

The images have already been released, please check them and let's know.

matt-smiley commented 2 years ago

@miguelaeh thank you! This looks great! The redis-server binary does have both frame pointers and symbol tables, so we can get high quality profiles like the following flamegraph.

# Verify symbols are preserved.

$ nm /tmp/bitnami-redis-server | wc -l
4915

# Verify frame pointers are used.

$ objdump --disassemble /tmp/bitnami-redis-server | grep 'mov *%rsp,%rbp' | wc -l
3192

# Verify profiling works as intended.

$ docker pull bitnami/redis:6.2

$ docker run --rm -e ALLOW_EMPTY_PASSWORD=yes bitnami/redis:6.2

$ sudo perf record -g --freq 497 --pid $( pgrep redis-server ) -- sleep 30
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.035 MB perf.data (39 samples) ]

$ sudo perf script | stackcollapse-perf.pl --kernel | flamegraph.pl --hash --colors=perl > /tmp/flamegraph.bitnami-redis-server.svg

flamegraph bitnami-redis-server

miguelaeh commented 2 years ago

Great!!

Can we close the issue then?

matt-smiley commented 2 years ago

I think probably so, but there's one more check I want to run to make sure. I didn't have time to get to it today, but I will do so on Monday and report back here.

matt-smiley commented 2 years ago

I finished an end-to-end validation run using the helm chart and got the same good results from profiling a toy workload. This looks fantastic! Thank you very much!

igorwwwwwwwwwwwwwwwwwwww commented 2 years ago

Huge thanks @miguelaeh and @javsalgar, I'll go ahead and close this out!