google / XNNPACK

High-efficiency floating-point neural network inference operators for mobile, server, and Web

xnn_table_exp2minus_k_over_16 has inconsistent types #6806

Open aiusepsi opened 3 months ago

aiusepsi commented 3 months ago

The type of xnn_table_exp2minus_k_over_16 varies between declarations.

It's defined as uint32_t[64] here, but other declarations use different types.

For example, here it's declared as float[64].

This is illegal in C: uint32_t and float are not compatible types (merely being the same size is not sufficient for compatibility), so strictly speaking this is undefined behaviour. I hit an error from this while compiling with LTO, so the workaround is probably "don't do that", but the code is still incorrect even when it happens to work.
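For illustration, here is a minimal two-file sketch (hypothetical file and variable names, not XNNPACK code) with the same defect: the declaration and the definition disagree on the element type, and an LTO build is entitled to diagnose the mismatch.

/* table.c: the definition uses uint32_t. */
#include <stdint.h>
const uint32_t table[2] = { 0x3F800000, 0x3F7F64D2 };

/* user.c: this declaration claims float, which is not a compatible type
   with uint32_t, so the program has undefined behaviour even though
   both types are 32 bits wide. */
extern const float table[2];

float first_entry(void) {
  return table[0];  /* may appear to work, but GCC with -flto can reject it */
}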

One possible solution is to declare the array as float[64], and use hexadecimal float literals, e.g. 0x1p+0 instead of 0x3F800000, 0x1.fec9a4p-1 instead of 0x3F7F64D2, etc., which should still be bit-for-bit identical.
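As a quick check (a standalone sketch, not XNNPACK code): a hexadecimal float literal spells out the exact significand and exponent, so it round-trips to the same bit pattern as the corresponding integer constant.

#include <assert.h>
#include <stdint.h>
#include <string.h>

int main(void) {
  const float f = 0x1.FEC9A4p-1f;        /* the proposed hexadecimal float literal */
  uint32_t bits;
  memcpy(&bits, &f, sizeof bits);        /* well-defined way to read the representation */
  assert(bits == UINT32_C(0x3F7F64D2));  /* matches the original integer constant */
  return 0;
}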

fbarchard commented 3 months ago

Thanks for the report. Looking at the original version, it used to be floats but was recently changed to int. I found the change that did it, but it's not documented why. I'm inclined to go ahead and change it back. If you like, you can try it on your end to ensure it works, and check whether any other changes have the same issue.

// Copyright 2019 Google LLC
//
// This source code is licensed under the BSD-style license found in the
// LICENSE file in the root directory of this source tree.

#include <xnnpack/common.h>

// Table of exp2(k / 64) values, k = 0..63
XNN_INTERNAL const float xnn_table_exp2_k_over_64[64] = {
  0x1.000000p+0f, 0x1.02C9A4p+0f, 0x1.059B0Ep+0f, 0x1.087452p+0f,
  0x1.0B5586p+0f, 0x1.0E3EC4p+0f, 0x1.11301Ep+0f, 0x1.1429AAp+0f,
  0x1.172B84p+0f, 0x1.1A35BEp+0f, 0x1.1D4874p+0f, 0x1.2063B8p+0f,
  0x1.2387A6p+0f, 0x1.26B456p+0f, 0x1.29E9E0p+0f, 0x1.2D285Ap+0f,
  0x1.306FE0p+0f, 0x1.33C08Cp+0f, 0x1.371A74p+0f, 0x1.3A7DB4p+0f,
  0x1.3DEA64p+0f, 0x1.4160A2p+0f, 0x1.44E086p+0f, 0x1.486A2Cp+0f,
  0x1.4BFDAEp+0f, 0x1.4F9B28p+0f, 0x1.5342B6p+0f, 0x1.56F474p+0f,
  0x1.5AB07Ep+0f, 0x1.5E76F2p+0f, 0x1.6247ECp+0f, 0x1.662388p+0f,
  0x1.6A09E6p+0f, 0x1.6DFB24p+0f, 0x1.71F75Ep+0f, 0x1.75FEB6p+0f,
  0x1.7A1148p+0f, 0x1.7E2F34p+0f, 0x1.82589Ap+0f, 0x1.868D9Ap+0f,
  0x1.8ACE54p+0f, 0x1.8F1AEAp+0f, 0x1.93737Cp+0f, 0x1.97D82Ap+0f,
  0x1.9C4918p+0f, 0x1.A0C668p+0f, 0x1.A5503Cp+0f, 0x1.A9E6B6p+0f,
  0x1.AE89FAp+0f, 0x1.B33A2Cp+0f, 0x1.B7F770p+0f, 0x1.BCC1EAp+0f,
  0x1.C199BEp+0f, 0x1.C67F12p+0f, 0x1.CB720Ep+0f, 0x1.D072D4p+0f,
  0x1.D5818Ep+0f, 0x1.DA9E60p+0f, 0x1.DFC974p+0f, 0x1.E502EEp+0f,
  0x1.EA4AFAp+0f, 0x1.EFA1BEp+0f, 0x1.F50766p+0f, 0x1.FA7C18p+0f,
};

fbarchard commented 2 months ago

I found int and int32_t variations of this and fixed those. Now it's just the float extern that is sometimes used.

grep 'extern.*xnn_table_exp2minus_k_over' . -r -h | sort | uniq

These 3 cause a type mismatch:

extern XNN_INTERNAL const float xnn_table_exp2minus_k_over_16[16];
extern XNN_INTERNAL const float xnn_table_exp2minus_k_over_2048[2048];
extern XNN_INTERNAL const float xnn_table_exp2minus_k_over_64[64];

These are extern'ed consistently:

extern XNN_INTERNAL const uint32_t xnn_table_exp2minus_k_over_${LUT}[${LUT}];
extern XNN_INTERNAL const uint32_t xnn_table_exp2minus_k_over_16[16];
extern XNN_INTERNAL const uint32_t xnn_table_exp2minus_k_over_2048[2048];
extern XNN_INTERNAL const uint32_t xnn_table_exp2minus_k_over_32[32];
extern XNN_INTERNAL const uint32_t xnn_table_exp2minus_k_over_4[4];
extern XNN_INTERNAL const uint32_t xnn_table_exp2minus_k_over_64[64];
extern XNN_INTERNAL const uint32_t xnn_table_exp2minus_k_over_8[8];
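The usual guard against this kind of drift is to declare each table exactly once in a shared header that the defining file also includes, so any disagreement becomes a compile-time error instead of silent undefined behaviour. A sketch of the pattern (tables.h is a hypothetical header name, not XNNPACK's actual layout):

/* tables.h: single authoritative declaration. */
#include <stdint.h>
#include <xnnpack/common.h>
extern XNN_INTERNAL const uint32_t xnn_table_exp2minus_k_over_64[64];

/* tables.c: the definition includes the header, so the compiler now
   rejects any declaration/definition type mismatch. */
#include "tables.h"
XNN_INTERNAL const uint32_t xnn_table_exp2minus_k_over_64[64] = {
  /* ... 64 entries ... */
};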

eli-schwartz commented 2 months ago

> If you like you can try it on your end to ensure it works and if any other changes have the same issue

@fbarchard did you ever commit those changes? I don't see, e.g., any links to commits that fixed it.

I tried to test a recent snapshot. The previous one where I saw the inconsistent types was from 2024.02.29. With the latest version I actually cannot even get far enough to hit the LTO errors, because I first get tons of -Werror=strict-aliasing failures in src/xnnpack/simd/f32-scalar.h.

e.g.

In file included from /var/tmp/portage/sci-libs/XNNPACK-2024.09.04/work/XNNPACK-35af48543c1666feb840f74560eb5b9bfdf9b348/src/f32-vunary/gen/f32-vneg-scalar.c:14:
/var/tmp/portage/sci-libs/XNNPACK-2024.09.04/work/XNNPACK-35af48543c1666feb840f74560eb5b9bfdf9b348/src/xnnpack/simd/f32-generic-functions.h: In function ‘xnn_generic_getexp_f32’:
/var/tmp/portage/sci-libs/XNNPACK-2024.09.04/work/XNNPACK-35af48543c1666feb840f74560eb5b9bfdf9b348/src/xnnpack/simd/f32-scalar.h:28:31: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing]
   28 |   const xnn_simd_f32_t var = *(const float *)&_##var##_int_value;
      |                               ^~~~~~~~~~~~~~~~~
/var/tmp/portage/sci-libs/XNNPACK-2024.09.04/work/XNNPACK-35af48543c1666feb840f74560eb5b9bfdf9b348/src/xnnpack/simd/f32-generic-functions.h:34:3: note: in expansion of macro ‘XNN_SIMD_CONST_U32’
   34 |   XNN_SIMD_CONST_U32(exp_mask, 0x7f800000);
      |   ^~~~~~~~~~~~~~~~~~
/var/tmp/portage/sci-libs/XNNPACK-2024.09.04/work/XNNPACK-35af48543c1666feb840f74560eb5b9bfdf9b348/src/xnnpack/simd/f32-scalar.h: In function ‘xnn_and_f32’:
/var/tmp/portage/sci-libs/XNNPACK-2024.09.04/work/XNNPACK-35af48543c1666feb840f74560eb5b9bfdf9b348/src/xnnpack/simd/f32-scalar.h:87:25: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing]
   87 |   const uint32_t res = *(const uint32_t *)&a & *(const uint32_t *)&b;
      |                         ^~~~~~~~~~~~~~~~~~~~

Attached is a full log of the entire failure: build.log
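Both diagnostics come from reading an object of one type through a pointer to another. A strict-aliasing-safe alternative (a sketch, not the project's actual macros) replaces the cast-and-dereference with memcpy, which compilers lower to an ordinary load:

#include <stdint.h>
#include <string.h>

static inline float f32_from_bits(uint32_t bits) {
  float value;
  memcpy(&value, &bits, sizeof value);  /* well-defined type pun */
  return value;
}

static inline uint32_t bits_from_f32(float value) {
  uint32_t bits;
  memcpy(&bits, &value, sizeof bits);
  return bits;
}

/* With helpers like these, XNN_SIMD_CONST_U32 could expand to
     const float exp_mask = f32_from_bits(0x7f800000);
   and xnn_and_f32 could compute
     f32_from_bits(bits_from_f32(a) & bits_from_f32(b))
   instead of dereferencing *(const uint32_t *)&a and &b. */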

Skylion007 commented 3 weeks ago

https://github.com/pytorch/pytorch/pull/137866 has some runs with full stack traces and repros if you need more reproducers. Here is the PyTorch HUD: https://hud.pytorch.org/pr/pytorch/pytorch/137866#31470457008

Skylion007 commented 3 weeks ago

@eli-schwartz Those errors still exist on master, so they were not committed.

Skylion007 commented 3 weeks ago

@fbarchard I posted the full build logs in my comment above and am happy to try master once things are fixed. A simple search suggests the problem still appears on master.

Skylion007 commented 3 weeks ago

@fbarchard I found the PRs you made, i.e. https://github.com/google/XNNPACK/pull/6832; unfortunately, converting the tables to uint32_t still causes issues with LTO under GCC, even though the types are the same width.