aklomp / base64

Fast Base64 stream encoder/decoder in C99, with SIMD acceleration
BSD 2-Clause "Simplified" License

Create release 0.5.0 #90

Closed aklomp closed 2 years ago

aklomp commented 2 years ago

This is a meta-issue to track the work needed before releasing v0.5.0. Closing this issue will trigger the release of v0.5.0.

BurningEnlightenment commented 2 years ago

I think that the CMake stuff has had enough bake time now, and it might be wise to get that release out the door before we start tinkering with the runtime detection.

aklomp commented 2 years ago

I just checked off the last two remaining issues from the list. As far as I'm concerned, it's time for a release.

Given my unfamiliarity with CMake, I'm unsure exactly how solid the CMake stuff is now. We recently had some pull requests to fix some minor issues that were found in production. Should we wait a week or two to see if anything else shakes out?

I guess we could just release 0.5.0, and release a 0.5.1 if any bugs are found.

BurningEnlightenment commented 2 years ago

I think that the major things have been ironed out, since we have been able to integrate it with three package managers and @htot even cross compiled with the build system.

> I guess we could just release 0.5.0, and release a 0.5.1 if any bugs are found.

Sounds reasonable :shipit:

fabianbuettner commented 2 years ago

Hi, I tried cross compiling with the CMakeLists.txt for an i.MX6ULL, which is an Armv7-A processor.

According to this document, "Armv7-A and AArch32 have the same general purpose Arm registers – 16 x 32-bit general purpose Arm registers (R0-R15)." But at the same time, it seems to support NEON: "Armv7-A and AArch32 have 32 x 64-bit NEON registers (D0-D31). These registers can also be viewed as 16x128-bit registers (Q0-Q15). Each of the Q0-Q15 registers maps to a pair of D registers, as shown in the following figure."

As you can see from the output below, somehow the CMakeLists.txt does not detect that I am cross compiling for ARM:

-- The following features have been enabled:

 * CLI, enables the CLI executable for encoding and decoding

-- The following features have been disabled:

 * OpenMP codec, spreads codec work accross multiple threads
 * SSSE3, add SSSE 3 codepath
 * SSE4.1, add SSE 4.1 codepath
 * SSE4.2, add SSE 4.2 codepath
 * AVX, add AVX codepath
 * AVX2, add AVX2 codepath
 * NEON32, add NEON32 codepath
 * NEON64, add NEON64 codepath

If I read the documentation correctly, my target ARCH should be ARM32 but with NEON enabled, right?

fabianbuettner commented 2 years ago

ah I think this line is the problem:

cmake_dependent_option(BASE64_WITH_NEON32 "add NEON32 codepath" OFF _TARGET_ARCH_arm OFF)
add_feature_info(NEON32 BASE64_WITH_NEON32 "add NEON32 codepath")

should probably be:

cmake_dependent_option(BASE64_WITH_NEON32 "add NEON32 codepath" ON _TARGET_ARCH_arm OFF)
add_feature_info(NEON32 BASE64_WITH_NEON32 "add NEON32 codepath")

BurningEnlightenment commented 2 years ago

IIRC we decided to disable NEON32 support by default even if the compiler supports it, because it cannot be detected at runtime whether or not the CPU supports the NEON extensions. Therefore you need to explicitly enable NEON32 support via the GUI or with the CLI like so:

cmake -G [...] -DBASE64_WITH_NEON32=ON ..

HTH

(and please open a new issue or discussion next time)
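
For anyone wondering why the option stays off on ARM but can still be switched on, here is a minimal standalone sketch of how `cmake_dependent_option` behaves. It is not the project's actual CMakeLists.txt: `DEMO_TARGET_IS_ARM` and `DEMO_WITH_NEON32` are hypothetical stand-ins for `_TARGET_ARCH_arm` and `BASE64_WITH_NEON32`, and the architecture flag is hard-coded rather than detected.

```cmake
cmake_minimum_required(VERSION 3.10)
project(dependent_option_demo LANGUAGES NONE)

include(CMakeDependentOption)

# Hypothetical stand-in for the result of the project's architecture
# detection; hard-coded here so the sketch is self-contained.
set(DEMO_TARGET_IS_ARM ON)

# cmake_dependent_option(<option> "<help>" <default> <depends> <force>)
# - If <depends> is true, the option is user-settable and starts at <default>.
# - If <depends> is false, the option is hidden and pinned to <force>.
# With <default> = OFF, an ARM target gets the option, but it is opt-in.
cmake_dependent_option(DEMO_WITH_NEON32 "add NEON32 codepath"
                       OFF "DEMO_TARGET_IS_ARM" OFF)

message(STATUS "DEMO_WITH_NEON32 = ${DEMO_WITH_NEON32}")
```

Configuring this sketch with no arguments should report the option as `OFF`, while passing `-DDEMO_WITH_NEON32=ON` on the command line should report `ON`. Changing the third argument to `ON`, as suggested above, would flip only the default for ARM targets.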

BurningEnlightenment commented 2 years ago

@aklomp I think we can conclude the stabilization phase, as there haven't been any new issues for a month now. So I don't think there is any point in delaying the release any further.

aklomp commented 2 years ago

I agree, let's get it done. It's on my to-do list for tomorrow.