Add compile time cpu feature detection when `no_std`

MarcusGrass commented 1 year ago

Using AVX instructions makes a big difference, so being able to use them in no_std builds would be cool. This PR attempts to add that by doing the feature detection at compile time when building for no_std.

The feature selection is a bit convoluted but it's supposed to be exclusive if I've got this right:

We only import avx on std, or if we've got avx2 instructions.
We select avx on no_std if we have avx2 instructions.
We select sse2 on no_std if we have sse2 instructions but not avx2 instructions.
We use the fallback on no_std if we have neither avx2 nor sse2 instructions.
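
A minimal sketch of the exclusive selection described above, assuming it is expressed purely through `target_feature` cfg predicates. The `Backend` enum and `BACKEND` constant are hypothetical names used for illustration and are not memchr's actual internals; the std path with runtime detection is left out.

```rust
/// Hypothetical marker for the implementation chosen at compile time.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Avx2,
    Sse2,
    Fallback,
}

// Exactly one of the three definitions below survives cfg evaluation,
// so the selection is exclusive and always defined.

// avx2 is enabled at compile time (e.g. via `-C target-cpu=native`).
#[cfg(target_feature = "avx2")]
pub const BACKEND: Backend = Backend::Avx2;

// sse2 is enabled, but avx2 is not.
#[cfg(all(target_feature = "sse2", not(target_feature = "avx2")))]
pub const BACKEND: Backend = Backend::Sse2;

// Neither feature is enabled (this also covers non-x86 targets).
#[cfg(not(any(target_feature = "sse2", target_feature = "avx2")))]
pub const BACKEND: Backend = Backend::Fallback;

/// Returns the backend that was selected when this code was compiled.
pub fn selected_backend() -> Backend {
    BACKEND
}
```

If two of the predicates could hold at once, the duplicate `BACKEND` definitions would fail to compile, which is one way to catch a bad cfg directive early.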

It's fairly difficult to tell if it's working. What I did was benchmark with std, then with memchr's no-default-features, then with RUSTFLAGS='-C target-cpu=native' and no-default-features on a CPU that has avx2. The first and third benchmarks were about on par, whereas the second had about half the throughput.

Edit: Whoops, just saw https://github.com/BurntSushi/memchr/pull/106, but from reading the comments, this might be a good compromise.
The use case is essentially micro optimization with dubious motivation.

BurntSushi commented 1 year ago

Thanks! This PR is pretty simple so I think I'm likely to take it, but could you say what you're using this for? In particular, the use case for no_std on x86_64.

MarcusGrass commented 1 year ago

Sure, I'm prototyping a minimal x86/aarch64 std library for Linux. It makes most simple CLI applications extremely small in comparison to using std and, in my still minimal testing, a bit faster. I'd like to use memchr because the Linux APIs are C-centric, so searching for null terminators in user-provided input is often required to make sure the input is valid.
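
A rough sketch of the kind of check meant here, assuming "valid" means the only NUL byte is the final one. `find_nul` is a hypothetical std-free stand-in so the snippet has no dependencies; with the memchr crate, the same lookup would be `memchr::memchr(0, bytes)`.

```rust
/// Hypothetical stand-in for `memchr::memchr(0, bytes)`:
/// returns the index of the first NUL byte, if any.
pub fn find_nul(bytes: &[u8]) -> Option<usize> {
    bytes.iter().position(|&b| b == 0)
}

/// True when `bytes` can be handed to a C-style API as-is:
/// the first NUL byte is also the last byte of the slice.
pub fn is_nul_terminated(bytes: &[u8]) -> bool {
    matches!(find_nul(bytes), Some(i) if i + 1 == bytes.len())
}
```

Input with no terminator, or with an interior NUL such as `b"a\0b\0"`, is rejected before it reaches the syscall layer.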

It's a circle where the reason for a tiny-std is to make tiny apps, and my tiny apps on their own would also benefit from memchr since they're often byte-centric, and those are no-std because of using tiny-std.

Edit: The point that I'm creating problems for myself with vague benefits is not lost on me. In my opinion, rejecting the PR because of vague motivations is completely reasonable. The change is so small that I could easily maintain a fork with it, saving you the problem of a version change and possibly obscure compilation error reports caused by a bad cfg directive.

osiewicz commented 1 year ago

To add to the list of potential use cases, I have a Windows kernel driver with std disabled which utilizes memchr. Nice PR, @MarcusGrass!

BurntSushi / memchr

Add compile time cpu feature detection when `no_std` #120