Closed n-soda closed 7 years ago
@n-soda Note the Macro: SHA1DC_FORCE_LITTLEENDIAN -- I would prefer if we move to use this, as opposed to add cases to these checks, which we should probably simplify greatly (or remove.) Is there anyway that you can simply modify the Makefile that you use, or add to the Makefile we use, to define this macro instead?
I just committed (a24eef5) a different fix: If any of _BYTE_ORDER, BYTE_ORDER or __BYTE_ORDER is defined then it will check its value against to corresponding _BIG_ENDIAN, otherwise it checks for the existance of any known macro that indicates a big endian platform.
I hope this makes endianness selection more correct on the platforms with issues. Can you please confirm if this fixes your issues on *BSD ?
Cygwin has exactly the same problem (which I discovered via Git including this library), and I can confirm that a24eef5 fixes things there.
@n-soda, @me-and : Could you please test my PR #36. It hopefully fixes this issue & more by mainly relying on GCC/Clang detection rather than more fragile macros that may differ between platforms like Solaris & NetBSD.
My PR #36 has been merged, so this shouldn't be an issue anymore, we don't check _BIG_ENDIAN
at all now. I don't have permission to close this PR though, @cr-marcstevens ?
I'm sorry for the delay of the repsonse. unfortunately, latest master (3f14d1bb) doesn't work correctly on some big endian architectures with modern BSDs (e.g. m68k, some of power/powerpc machines, Super-H, HP PA). So, could you re-add the "defined(_BYTE_ORDER) && defined(_BIG_ENDIAN)" test again? modern BSDs need this.
historically, 4BSD originally defined BYTE_ORDER/BIG_ENDIAN/LITTLE_ENDIAN macros (these symbols are still defined in some cases on modern BSDs for backward compatibility, but they are deprecated), and then modern BSDs defined the _BYTE_ORDER/_BIG_ENDIAN/_LITTLE_ENDIAN macros (one underscore) to avoid namespace pollution, Linux followed similar convention, but used two underscores (i.e. BYTE_ORDER/__BIG_ENDIAN/LITTLE_ENDIAN) instead. So, both the one underscore version and the two underscores version are needed at least. It seems Cygwin uses same convention with modern *BSDs, accourding to https://cygwin.com/git/gitweb.cgi?p=newlib-cygwin.git;a=blob;f=newlib/libc/include/machine/endian.h;h=34a5726e6871a6c06b420790e9305c3236ee1cb8;hb=HEAD
@n-soda: Makes sense, but just for my own sanity, what compiler/version are you using on those BSDs? I had someone test it on Cygwin and according to them it worked, which I think is because they're using a GCC newer than 4.6.0.
@n-soda: Also, can you tell me if 56ab30c works for you?
@avar 56ab30c doesn't work, because it treats all *BSD and newlib as big endian even if the architecutre is little endian. About gcc version, latest NetBSD (NetBSD-7 and later) is OK because it is using gcc-4.8.5, but NetBSD-6 is NOT OK, because NetBSD-6 is using gcc-4.5.3 and it's still supported, so we cannot reply on the gcc specific macros. Also, I guess some embeded systems which are using newlib (newlib is used by Cygwin, but also by some Linux-based embeded systems) may be using older gcc too.
FWIW, ade6b5d8 in my repo works for me. Does this work on Solaris? (I can test it on Solaris 9/sparc with very old gcc, but not on latest Solaris)
@avar According to http://man.openbsd.org/gcc-local.1 , OpenBSD is using gcc-3.3.6 or 4.2.1, because they won't accept GPL version 3. So, OpenBSD is NOT OK either about the gcc specific macros.
@n-soda: Makes sense, b.t.w. that commit is pretty unreadable due to various merging etc. and being originally written against the old version. I took the liberty of squashing a version of that and pushing it to my repo: https://github.com/avar/sha1collisiondetection/commit/fd166aacc1fd4c6af2fa745e10f2cf877d0992b8
That makes it easy to see what the net change is.
This change as it stands will break Solaris. The reason for this is that on bendian systems they just have:
#define _BIG_ENDIAN
See http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/sys/isa_defs.h#397
They also define _BYTE_ORDER, thus this will error out in the same way this would:
#define _BYTE_ORDER 123
#define _BIG_ENDIAN
#if defined(_BYTE_ORDER) && defined(_BIG_ENDIAN) && _BYTE_ORDER == _BIG_ENDIAN
#endif
But obviously the current thing sucks since it breaks various BSDs. Ways out of this that I can see, in no particular order:
To support older GCCs we could just add more hardcoded processors, e.g. you note a breakage on m68k/ppc, we could add some macros to pick up the processor type, see https://wiki.debian.org/ArchitectureSpecificsMemo
We could essentially do what your patch is doing, but just guard the Solaris-specific code with "is this Solaris". Your c09e03e does this, but from my IRC chat with a guy at Oracle there is no sun Macro or anything like that in his /usr/include, I think sun is purely a GNU-land thing to check if something is Solaris.
But surely there is some way to check this, but I haven't found it,
the docs I've found on the matter lie (there's no sun
either on
older Solaris):
http://www.oracle.com/technetwork/server-storage/solaris/portingtosolaris-138514.html
We could guard the stuff you're adding with "is this <bsd, cygwin etc>". This probably sucks more than the above, I have a feeling Solaris is the odd one out.
We could just check with some macro magic whether _BIG_ENDIAN et al are empty. I'm actually liking this the most at the moment, see https://public-inbox.org/git/20170627162808.p7sj3jpwum7n3hq2@oracle.com/
@avar
I took the liberty of squashing a version of that and pushing it to my repo: avar/sha1collisiondetection@fd166aa
Thanks!
This change as it stands will break Solaris.
Hmm, how about 0b0e12e8 then?
@avar BTW, the following symbols were available on Solaris at the age of SUN Works Pro SC-3.0, some of them may be still available with newer Solaris toolchain:
@n-soda : Looks good, I just pushed an alternate approach of doing this, not saying it's better, just something I was hacking on, what do you think about that way of doing it?
WRT the sun macros: I just have testing others did on SunOS # *define *_*sun
didn't yield anything interesting, but maybe we just missed something.
@avar
what do you think about that way of doing it?
re: 94af1a61 , I'm not sure whether the _BIG_ENDIAN macro on Solaris/sparc will be defined as null string in the distant future as well or not. I guess checking defined(_BIG_ENDIAN) && defined(_LITTLE_ENDIAN) is more safer.
about the "sun" macro, grepping under /usr/include cannot find the symbol, because the symbol is defined by the cc command itself instead of the headers.
Yeah I think your change is simpler & better. I was just hacking on mine concurrently and thought I'd note it.
I think it makes sense though to move the logic you added to the "We do nothing more here for now" condition. I.e. now we do:
gcc? -> glibc? -> becpu -> !lecpu -> nothing
Your change makes it:
gcc? -> glibc? -> bsd/cygwin _BIG_ENDIAN -> becpu -> !lecpu -> nothing
I think it makes more sense to make it:
gcc? -> glibc? -> becpu -> !lecpu -> bsd/cygwin _BIG_ENDIAN -> nothing
Any version of Solaris ends up in the "becpu" test since it checks for __sparc, so by putting your _BIG_ENDIAN logic after that you avoid any issues with them doing something weird in the future (or on an older version we've missed).
I also think it makes sense to put this after the CPU whitelist/blacklist, that's probably more reliable than what are likely to be more platform-specific macros.
@avar the gcc?, glibc? and bsd/newlib? tests are essentially similar tests, so I think it's better to put them together. So, how about this order?
becpu -> !lecpu -> gcc? -> glibc? -> bsd/newlib -> nothing?
@n-soda: I think it makes more sense to check gcc & glibc first. E.g. SPARC v9 apparently (I'm told by a guy at Oracle) can be configured to run in LE mode, but sparc will still be true, now that's not a practical issue because with a newer GCC we'll return the right thing.
In general it seems like a better idea to trust the compiler/libc first and then start trying to make sense of CPU values. As the Debian page I linked to shows for ppc64 it'll also be easy to introduce subtle regressions on some archs that are mixed BE/LE with future patches if we use this pattern.
I'm very glad that we have so many endian cases covered now. After this has settled and merged, it is probably worthwhile to put it in its own header, so it can also easily be used by other projects. Or if you guys want to maintain such an preprocessor endianness detection header somewhere then we're also happy to include that.
Also, I noticed that the following detects big endian for a situation, but still falls through to the else-case when it could detect little endian:
#if defined(_BYTE_ORDER) && defined(_BIG_ENDIAN) && defined(_LITTLE_ENDIAN) && \
_BYTE_ORDER == _BIG_ENDIAN
#define SHA1DC_BIGENDIAN
#else
Should the final check be inside the if-block, or is the above the intended logic?:
#if defined(_BYTE_ORDER) && defined(_BIG_ENDIAN) && defined(_LITTLE_ENDIAN)
#if _BYTE_ORDER == _BIG_ENDIAN
#define SHA1DC_BIGENDIAN
#endif
#else
@cr-marcstevens
Should the final check be inside the if-block, or is the above the intended logic?:
Yeah, that's better.
@avar
think it makes more sense to check gcc & glibc first. E.g. SPARC v9 apparently (I'm told by a guy at Oracle) can be configured to run in LE mode, but sparc will still be true, now that's not a practical issue because with a newer GCC we'll return the right thing.
If this is the issue, the *BSD/newlib check has to be placed before the becpu check and the !lecpu check as well. Otherwise a theoretical little-endian sparc port of OpenBSD won't work.
If the defined(_BIG_ENDIAN) && defined(_LITTLE_ENDIAN) check doesn't work with SPARC/solaris, we have no choice, but if this works, we can safely place the *BSD/newlib check before the becpu check and the !lecpu check.
On Wed, Jun 28 2017, n-soda jotted:
@avar
think it makes more sense to check gcc & glibc first. E.g. SPARC v9 apparently (I'm told by a guy at Oracle) can be configured to run in LE mode, but sparc will still be true, now that's not a practical issue because with a newer GCC we'll return the right thing.
If this is the issue, the *BSD/newlib check has to be placed before the becpu check and the !lecpu check as well. Otherwise a theoretical little-endian sparc port of OpenBSD won't work.
Indeed, I think it makes sense to do that, i.e.:
gcc? -> glibc? -> bsd/cygwin _BIG_ENDIAN -> becpu -> !lecpu -> nothing
Provided that the bsd/cygwin check won't falsely fire on Solaris due to checking _LITTLE_ENDIAN.
If the defined(_BIG_ENDIAN) && defined(_LITTLE_ENDIAN) check doesn't work with SPARC/solaris, we have no choice, but if this works, we can safely place the *BSD/newlib check before the becpu check and the !lecpu check.
From what I've seen of Solaris the check you suggested for _LITTLE_ENDIAN should definitely work, and seems liket he simplest thing to do.
@n-soda : Just so we don't duplicate work / nobody does anything, I understand from the discussion above that you'll work on some patch for this, if you're not willing to do that / don't have time I can try to put something together per the discussion above.
Thank you guys for your attention to this matter. Marc and I are heads down on a camera-ready version of a paper (about this algorithm actually (-: ). When you guys settle on a final fix @avar pls test as you did for the last patch and @n-soda please confirm this fixes the platforms/compilers you were looking at, I'll happily merge.
@cr-marcstevens I applied your comment in cfd84f8 . BTW, I prefer 37f78d0 to cfd84f8 because the former's nesting level is shallower. How do you think about this?
@avar Is this OK for you?
Thanks! BTW I also think that using #elif instead of nested #if really improves readability.
@n-soda Looks good to me.
I didn't use #elif
because I misremembered that there were portability issues with it, but I was just mixing it up with make
. That's indeed a lot more readable!
@cr-marcstevens thanks. I merged the #elif version in 37f78d0 .
@avar your memory is correct. #elif didn't exist before ANSI-C (C89), but that's more than 20 years ago, and we can safely ignore such old compilers.
@n-soda @cr-marcstevens @shumow : As noted above, I think the end result of this merge request is good, but the history was a mess. Let's merge my rebased https://github.com/cr-marcstevens/sha1collisiondetection/pull/37 instead, and close this one.
@cr-marcstevens @shumow
Let's merge my rebased #37 instead, and close this one.
Yeah, please do so.
@avar Thanks for cleaning up the history.
Closing as this was rebased and cleaned up in the now merged: https://github.com/cr-marcstevens/sha1collisiondetection/pull/37
Little endian platforms on all *BSD variants are broken since commit 33a694a9ee1b, because _BIG_ENDIAN is always defined even on little endian platforms.
problem found by nonaka at NetBSD.org