ROCm / amdsmi

AMD SMI
https://rocm.docs.amd.com/projects/amdsmi/en/latest
MIT License
39 stars 23 forks source link

Explicitly specify data_type in capture #30

Closed danzimm closed 2 weeks ago

danzimm commented 4 months ago

When compiling with clang-15 we see errors like the following:

third-party/rocm/6.1/amdsmi/rocm_smi/src/rocm_smi_gpu_metrics.cc:460:58: error: reference to local binding 'data_type' declared in enclosing function 'amd::smi::format_metric_row'
      amdgpu_dynamic_metric_value_init.m_original_type = data_type;
                                                         ^
third-party/rocm/6.1/amdsmi/rocm_smi/src/rocm_smi_gpu_metrics.cc:446:15: note: 'data_type' declared here
  const auto [data_type, num_values] = get_data_type_info();

It looks like this is a weird gotcha with the standard: https://stackoverflow.com/questions/46114214/lambda-implicit-capture-fails-with-variable-declared-from-structured-binding

In order to support toolchains that mark this as an error I've explicitly captured a copy of data_type to eliminate the error.

N.B. The copy was happening anyway, eventually, so I think there shouldn't be any practical difference.

oliveiradan commented 3 months ago

@danzimm, thank you for the heads up here. The interesting fact here is that C++20 will allow the structure binding capture. That works since C++17 with gcc; however, it will fail to build with clang 14 and 15 in both C++17' and' C++20 standards. Even though the proposed change works, it makes an additional copy of the captured variable (which already happened before, but since we will be work on it). We will try to address that with something like and recheck it with clang and gcc:

auto amdgpu_dynamic_metric_value = [&, data_type=&data_type]()
danzimm commented 2 months ago

Can you clarify why the copy is a problem here? Since data_type is smaller than a word I think this copy should be at least as fast as the reference when creating the closure, and within the closure should be faster since there won't be a deref-- we can double check with godbolt if need be.

I'm hoping to get this merged so that we (at Meta) don't have to maintain this extra patch going forward.

oliveiradan commented 2 months ago

@danzimm, the copy is not a problem. We just wanted to validate the change with a few different versions of the compiler and take advantage of that to determine whether/how to improve the copy in question. I tested with clang 12-14 and gcc 11-12 and the build didn't show any errors.

danzimm commented 2 months ago

Apologies on two fronts: firstly for the delay (PTO + sick) and secondly on a typo-- we're using clang 15 not 12, not sure why I put 12 in the initial summary.

Here's a repro with copy/pasted code from the repo with clang 14 (also occurs with clang 15): https://godbolt.org/z/8rPMPs368

oliveiradan commented 2 months ago

Apologies on two fronts: firstly for the delay (PTO + sick) and secondly on a typo-- we're using clang 15 not 12, not sure why I put 12 in the initial summary.

Here's a repro with copy/pasted code from the repo with clang 14 (also occurs with clang 15): https://godbolt.org/z/8rPMPs368

oliveiradan commented 2 months ago

We will recheck it on Clang15 and the newer version of GCC, just in case.

dmitrii-galantsev commented 2 weeks ago

@oliveiradan @danzimm Indeed, shows a warning with newer clang versions (16 and up). And fails to compile below 16. GCC doesn't care either way.

https://godbolt.org/z/c7PP4TzPW

dmitrii-galantsev commented 2 weeks ago

please ignore the "2 commits in this PR" right now. Trying to rebase all PRs.

dmitrii-galantsev commented 2 weeks ago

ok i guess i had to do 50 rebases to get it closed on github automagically...

Merged in https://github.com/ROCm/amdsmi/commit/91199279b0d23b967e993e85ab7fd2ab82ce314a