rustyrussell / ccan

The C Code Archive Network
http://ccodearchive.net/
1.11k stars 206 forks source link

#define 'HAVE_UNALIGNED_ACCESS' and [-Wcast-align] warnings on 32bit armv7l #85

Open Chirimen-Jako opened 5 years ago

Chirimen-Jako commented 5 years ago

Now I'm chasing kind of the [-Wcast-align] warnings which happens when I build c-Lightning on 32bit armv7l machine.

ccan/ccan/crypto/sha256/sha256.c:213:22: warning: cast increases required alignment of target type [-Wcast-align]
    Transform(ctx->s, (const uint32_t *)data, blocks);
                      ^

Brief system info is:

$ uname -a  # (all)
Linux odroid 4.14.111-158 #1 SMP PREEMPT Tue Apr 16 12:26:31 -03 2019 armv7l armv7l armv7l GNU/Linux

$ cat /proc/cpuinfo
processor       : 0
model name      : ARMv7 Processor rev 3 (v7l)
BogoMIPS        : 90.00
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae

Full information is the following. warning.txt system-info.txt

I noticed the 'HAVE_UNALIGNED_ACCESS' macro affects some many-bits calculation especially for crypto.

$ find . -name '*.h' | xargs grep "UNALIGNED"
./ccan/config.h:#define HAVE_UNALIGNED_ACCESS 1
./external/libwally-core/src/config.h:#define HAVE_UNALIGNED_ACCESS 1
./external/libwally-core/src/ccan_config.h:#if HAVE_UNALIGNED_ACCESS

It means the following in my understanding. Right? HAVE_UNALIGNED_ACCESS=0 --> pointers are aligned appropriately in advance. HAVE_UNALIGNED_ACCESS=1 --> pointers may not be aligned. need check and re-align.

If my understanding is correct, the following logic is upside-down? CURRENT: lightning/ccan/ccan/crypto/sha256/sha256.c

170 static bool alignment_ok(const void *p UNUSED, size_t n UNUSED)
171 {
172 #if HAVE_UNALIGNED_ACCESS
173         return true;
174 #else
175         return ((size_t)p % n == 0);
176 #endif
177 }

should be:

170 static bool alignment_ok(const void *p UNUSED, size_t n UNUSED)
171 {
172 #if HAVE_UNALIGNED_ACCESS
173         return ((size_t)p % n == 0);
174 #else
175         return true;
176 #endif
177 }

the above code affects the following process.

195                 /* Process full chunks directly from the source. */
196                 if (alignment_ok(data, sizeof(uint32_t)))
197                         Transform(ctx->s, (const uint32_t *)data);
198                 else {
199                         memcpy(ctx->buf.u8, data, sizeof(ctx->buf));
200                         Transform(ctx->s, ctx->buf.u32);
201                 }

Could you take a look at the codes and give me a comment? Thanks in advance.

Chirimen-Jako commented 5 years ago

Additional Info: lightning/external/libwally-core/src/ccan_config.h has also same logic.

 31 #if HAVE_UNALIGNED_ACCESS
 32 #define alignment_ok(p, n) 1
 33 #else
 34 #define alignment_ok(p, n) ((size_t)(p) % (n) == 0)
 35 #endif

Sould be the following?

 31 #if HAVE_UNALIGNED_ACCESS
 32 #define alignment_ok(p, n) ((size_t)(p) % (n) == 0)
 33 #else
 34 #define alignment_ok(p, n) 1
 35 #endif
ccan-maintainers commented 5 years ago

Jako Chirimen notifications@github.com writes:

Now I'm chasing kind of the [-Wcast-align] warnings which happens when I build c-Lightning on 32bit armv7l machine.


ccan/ccan/crypto/sha256/sha256.c:213:22: warning: cast increases required alignment of target type [-Wcast-align]
    Transform(ctx->s, (const uint32_t *)data, blocks);
                      ^

Interesting!

I noticed the 'HAVE_UNALIGNED_ACCESS' macro affects some many-bits calculation especially for crypto.

$ find . -name '*.h' | xargs grep "UNALIGNED"
./ccan/config.h:#define HAVE_UNALIGNED_ACCESS 1
./external/libwally-core/src/config.h:#define HAVE_UNALIGNED_ACCESS 1
./external/libwally-core/src/ccan_config.h:#if HAVE_UNALIGNED_ACCESS

It means the following in my understanding. Right? HAVE_UNALIGNED_ACCESS=0 --> pointers are aligned appropriately in advance. HAVE_UNALIGNED_ACCESS=1 --> pointers may not be aligned. need check and re-align.

No. HAVE_UNALIGNED_ACCESS=1 means you can access pointers with "unnatural" alignment. x86, for example, allows you to do this.

Interestingly, my rPi thinks it can do it too, but perhaps unaligned accesses actually trap, and it would perform better with this unset?

Cheers, Rusty.

Chirimen-Jako commented 5 years ago

Thank you very much for your comment. Sorry for late response.

Please forgive me what I can is leave some unorganized comments for now because this problem includes some composite and difficult issues.

HAVE_UNALIGNED_ACCESS=0 --> pointers are aligned appropriately in advance. HAVE_UNALIGNED_ACCESS=1 --> pointers may not be aligned. need check and re-align.

No. HAVE_UNALIGNED_ACCESS=1 means you can access pointers with "unnatural" alignment. x86, for example, allows you to do this.

  1. Your answer is very useful information for me. Thanks again. Where is the detailed document about that kind of #define directives meaning and explanation?
  2. As you said, I know almost all x86 instructions allow “unnatural” alignment access except SIMD instructions. Some SIMD instructions, for instance, movdqa cannot unaligned access, if do it, raise exception. On the other hand, movdqu can unaligned access. But it may slow down because it reads memory twice over the each aligned boundary.
  3. In my opinion, just intuition, the existence of HAVE_UNALIGNED_ACCESS is “unnatural” coding because it expects and relies on a processor exact behavior too much.
  4. I wonder why you allow “unnatural” alignment access even though your code is optimized very well. “unnatural” alignment access might slow down your code.
  5. I know some x86 series itself is optimized enough fast for unnatural alignment access. But too many series to support by software.
  6. I found ARM document about memory alignment and instructions. I hope it’s useful for you unless you have read it already. How does the ARM Compiler support unaligned accesses?
  7. What is the oldest compiler to support CCAN repository? I’d like to insert actively some kind of __atribute__ ((aligned(x)) directives to the important pointers. Of caurse, CCAN can wrap that dicrectives by macro for compatibility.
  8. Interestingly, my rPi thinks it can do it too, but perhaps unaligned accesses actually trap, and it would perform better with this unset?

This is also my guess, it relates virtual memory allocates granularity, or, ARM processor’s failure. If the former, x86s allocates at least 4KB and didn’t touch non-commitment area by lucky. Anyway, As I told you at (3), to believe processor manufacturer’s document completely is sometimes risky in my opinion. I know Prof. Tanenbaum and David N. Cutler were consumed time by that kind of issues.

Current conclusion:

If I were you, when ARM processor detect by configure or make, I allocates alignment sensitive memory area by only aligned memory allocate functions. And, add __attribute__ ((aligned(x)) attributes to the all sensitive pointers. If target processor is x86(AMD), I don't change the result of processed code at all. I'll remove HAVE_UNALIGNED_ACCESS macro (at least, statically). I'm not sure for the other processors. (What's supported?)

Best Regards,