google / autofdo

AutoFDO

create_gcov can not find binary with buildid and profile_creator can not read profile #184

Open bingxinliu opened 9 months ago

bingxinliu commented 9 months ago

Hi,

I compiled GROMACS with the "-Wl,--build-id" linker option and recorded a profile with perf, but create_gcov cannot seem to find the binary. The error message is shown below:

# liubx @ liubx-nuc9-debian in ~/Development/codes/gromacs/perf-data [18:58:08]
$ ../autofdo-tools/build/create_gcov --binary=/home/liubx/Development/codes/gromacs/gromacs-2023.2-original-src/build-original/bin/gmx --profile=./loop1.perf --gcov=loop1.gcov -gcov_version=2
[WARNING:/home/liubx/Development/codes/gromacs/autofdo-tools/third_party/perf_data_converter/src/quipper/perf_reader.cc:1322] Skipping 136 bytes of metadata: HEADER_CPU_TOPOLOGY
[WARNING:/home/liubx/Development/codes/gromacs/autofdo-tools/third_party/perf_data_converter/src/quipper/perf_reader.cc:1069] Skipping unsupported event PERF_RECORD_ID_INDEX
[WARNING:/home/liubx/Development/codes/gromacs/autofdo-tools/third_party/perf_data_converter/src/quipper/perf_reader.cc:1069] Skipping unsupported event PERF_RECORD_CPU_MAP
[WARNING:/home/liubx/Development/codes/gromacs/autofdo-tools/third_party/perf_data_converter/src/quipper/perf_reader.cc:1069] Skipping unsupported event UNKNOWN_EVENT_82
[INFO:/home/liubx/Development/codes/gromacs/autofdo-tools/third_party/perf_data_converter/src/quipper/perf_reader.cc:1060] Number of events stored: 361069
[INFO:/home/liubx/Development/codes/gromacs/autofdo-tools/third_party/perf_data_converter/src/quipper/perf_parser.cc:272] Parser processed: 174 MMAP/MMAP2 events, 2 COMM events, 15 FORK events, 16 EXIT events, 359444 SAMPLE events, 359413 of these were mapped, 0 SAMPLE events with a data address, 0 of these were mapped
WARNING: Logging before InitGoogleLogging() is written to STDERR
E20240129 18:58:35.717491 250620 sample_reader.cc:95] Cannot find binary with buildid 9e166341c96797559485bb5c2e656da4c9b4e078
E20240129 18:58:35.956583 250620 profile_creator.cc:182] Error reading profile.

However, readelf prints the expected build ID for the binary, as shown below:

# liubx @ liubx-nuc9-debian in ~/Development/codes/gromacs/perf-data [18:58:36] C:255
$ readelf -n ../gromacs-2023.2-original-src/build-original/bin/gmx

Displaying notes found in: .note.gnu.property
  Owner                Data size        Description
  GNU                  0x00000010       NT_GNU_PROPERTY_TYPE_0
      Properties: x86 ISA needed: x86-64-baseline

Displaying notes found in: .note.gnu.build-id
  Owner                Data size        Description
  GNU                  0x00000014       NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 9e166341c96797559485bb5c2e656da4c9b4e078

Displaying notes found in: .note.ABI-tag
  Owner                Data size        Description
  GNU                  0x00000010       NT_GNU_ABI_TAG (ABI version tag)
    OS: Linux, ABI: 3.2.0
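
For comparing this value against other tool output, the build ID can be pulled out of the readelf notes with a small filter. This is only a sketch: extract_build_id is a hypothetical helper name, not part of binutils or AutoFDO, and it assumes the "Build ID:" line format shown above.

```shell
#!/bin/bash
# Hypothetical helper: read `readelf -n` output on stdin and print the
# first GNU build ID it contains (assumes a line of the form
# "    Build ID: <hex>", as in the output above).
extract_build_id() {
    awk '/Build ID:/ { print $3; exit }'
}

# Usage (the path is illustrative):
#   readelf -n ./build-original/bin/gmx | extract_build_id
```

The filter matches the literal text "Build ID:" case-sensitively, so the descriptive "unique build ID bitstring" line is not picked up by mistake.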

Here is how I built Gromacs:

#!/bin/bash

cd gromacs-2023.2-original-src

mkdir -p ./build-original
# Stop if the directory change fails, so the cleanup below cannot run elsewhere.
cd build-original || exit 1
rm -rf ./*

CXX_OPTIONS="-Wl,--build-id"
C_OPTIONS="-Wl,--build-id"
LINK_OPTIONS="-Wl,--build-id"
cmake .. \
    -DGMX_BUILD_OWN_FFTW=ON -DREGRESSIONTEST_DOWNLOAD=ON \
    -DCMAKE_BUILD_TYPE=RelWithDebInfo -DBUILD_SHARED_LIBS=ON \
    -DCMAKE_C_FLAGS="${C_OPTIONS}" -DCMAKE_CXX_FLAGS="${CXX_OPTIONS}" \
    -DCMAKE_EXE_LINKER_FLAGS="${LINK_OPTIONS}" \
    > cmake.output

make -j $1 > make.output

Here is how I collected the perf data:

#!/bin/bash

DIR_NAME=""
# Check if the first argument exists
if [ -z "$1" ]; then
    # Code to execute if the first argument does not exist
    echo "No argument provided. Use default test directory name: result"
    DIR_NAME="result"
else
    # Code to execute if the first argument exists
    echo "Argument provided: $1, use this name as test directory name"
    DIR_NAME=$1
fi

mkdir -p "$DIR_NAME"
cd "$DIR_NAME" || exit 1

TEST_PATH="/home/liubx/Development/codes/gromacs/benchmark/water-cut1.0_GMX50_bare/1536"
GMX_PATH="/home/liubx/Development/codes/gromacs/gromacs-2023.2-original-src/build-original"
PERF_PATH="/home/liubx/Development/codes/gromacs/perf-data"

cp -r ${TEST_PATH}/* ./
${GMX_PATH}/bin/gmx grompp -f pme.mdp -o bench.tpr > output.result

run_test() {
    perf record -o ${PERF_PATH}/loop$1.perf \
        -b -e br_inst_retired.near_taken:pp -- \
        ${GMX_PATH}/bin/gmx mdrun \
        -resethway -npme 0 -notunepme -noconfout -nsteps 1000 -v -s  bench.tpr \
        > output.result$1
}

for i in {1..16}
do
    run_test $i
done

Any ideas about this problem? I would appreciate any help. Thanks a lot!

erozenfeld commented 9 months ago

@bingxinliu If you can share the gmx binary and the loop1.perf file that you are passing to create_gcov, I can investigate the failure.

bingxinliu commented 9 months ago

@erozenfeld Hi, sorry for the late reply. The log file is too large (~300MB) to share via GitHub, so I uploaded it to OneDrive and shared it. The URL is here, and the password is autofdo if necessary. I really appreciate your help. Thanks!

erozenfeld commented 9 months ago

@bingxinliu It looks like your profile doesn't have events for binary with buildid 9e166341c96797559485bb5c2e656da4c9b4e078:

/home/erozen/bin/perf buildid-list -i ./loop1.perf
a6243ce6e9e5f490e7107d9e428b097150e55bb2 [kernel.kallsyms]
fb6b031703ec138b7bd76acdb1c3372dafe45472 [vdso]
901c89cff01d23aa026a99b31bcacf240b96e332 /home/liubx/Development/codes/gromacs/gromacs-2023.2-original-src/build-original/lib/libgromacs.so.8.0.0
51657f818beb1ae70372216a99b7412b8a100a20 /usr/lib/x86_64-linux-gnu/libc.so.6
e845af0c38643152c9cc58dab8540e6f57677eb4 /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0
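
A check like the one above can be scripted. The following is a minimal sketch, assuming the `perf buildid-list` output format shown here (build ID in the first column); profile_has_buildid is a hypothetical helper name, not part of perf or AutoFDO.

```shell
#!/bin/bash
# Hypothetical helper: read `perf buildid-list` output on stdin and
# return 0 if the build ID given as $1 appears in the first column,
# 1 otherwise.
profile_has_buildid() {
    local wanted="$1"
    awk -v id="$wanted" '$1 == id { found = 1 } END { exit !found }'
}

# Usage (file name and ID taken from this thread):
#   perf buildid-list -i ./loop1.perf \
#       | profile_has_buildid 9e166341c96797559485bb5c2e656da4c9b4e078 \
#       || echo "no events recorded for this binary"
```

For the list above, the check would fail for the gmx build ID and succeed for the libgromacs.so.8.0.0 build ID.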

erozenfeld commented 9 months ago

@bingxinliu Looks like most of the samples are in libgromacs.so.8.0.0. Do you want to optimize gmx or libgromacs.so.8.0.0 with AutoFDO?

bingxinliu commented 9 months ago

@erozenfeld Thank you for your reply!

I think gmx depends on libgromacs, which is built as a shared library because of the cmake option -DBUILD_SHARED_LIBS=ON, so in my opinion they are basically the same program. Does this mean that AutoFDO does not support profiling shared libraries?

Another issue is that I did run perf on gmx, as shown below, so the profile should contain the binary with build ID 9e166341c96797559485bb5c2e656da4c9b4e078:

perf record -o ${PERF_PATH}/loop$1.perf \
        -b -e br_inst_retired.near_taken:pp -- \
        ${GMX_PATH}/bin/gmx mdrun \
        -resethway -npme 0 -notunepme -noconfout -nsteps 1000 -v -s  bench.tpr \
        > output.result$1

And gmx does have that build ID. Maybe most of gmx's code just calls into libgromacs, so no events are attributed to gmx. Could the problem be caused by how I ran perf on the program?

erozenfeld commented 9 months ago

@bingxinliu Yes, it doesn't look like any events are associated with gmx. All the work is done in libgromacs.so.8.0.0. create_gcov works on a per-binary basis. You probably want to optimize libgromacs.so.8.0.0 with AutoFDO since the work is done there. In that case you'll need to pass --binary=<path_to_libgromacs.so.8.0.0> and then make sure that the resulting .gcov file is used during libgromacs.so.8.0.0 build.
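
As a rough sketch of that suggestion, the invocation from the original report can be reused with only the --binary argument changed. print_create_gcov_cmd is a hypothetical dry-run helper that only prints the command. The flags are the ones used earlier in this thread, and the output file name is an assumption.

```shell
#!/bin/bash
# Hypothetical dry-run helper: print (do not execute) a create_gcov
# command line for the given binary, perf profile, and output file.
# The flags mirror the invocation in the original report.
print_create_gcov_cmd() {
    local binary="$1" profile="$2" gcov="$3"
    printf 'create_gcov --binary=%s --profile=%s --gcov=%s -gcov_version=2\n' \
        "$binary" "$profile" "$gcov"
}

# Usage (output file name is illustrative):
#   print_create_gcov_cmd \
#       "$GMX_PATH/lib/libgromacs.so.8.0.0" ./loop1.perf libgromacs.gcov
```

Assuming the library is built with GCC, the resulting file would then be passed to the libgromacs build through GCC's -fauto-profile=<path> option, for example through CMAKE_C_FLAGS and CMAKE_CXX_FLAGS.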