mitschabaude / fast-base64

Fastest base64 on the web, with Wasm + SIMD
MIT License
50 stars 1 forks source link

no domain error checks #4

Open jerch opened 1 year ago

jerch commented 1 year ago

It seems the SIMD version does no domain error checks while decoding?

Imho you should at least mention that in the readme, so ppl are aware of this limitation and dont run into follow-up bugs by decoded bytes from broken base64 data.

On a sidenote - proper base64 decoding with SIMD is a remarkably tough issue - see the works of Lemire and Mula. Furthermore none of their speedy solutions works with wasm-simd, as they rely on 256 or 512 bit width.

mitschabaude commented 1 year ago

When you say "domain error checks", are you referring to something like erroring on invalid characters in the input?

jerch commented 1 year ago

Yepp, anything beyond the base64 alphabet chars.

mitschabaude commented 1 year ago

Interesting, yeah I have to check out Lemire. You're right, I guess my implementation just applies some piece wise linear mapping to the input characters after they've been interpreted as bytes by the built-in JS text encoder, to map them to the 0-63 range

mitschabaude commented 1 year ago

Which means characters out of the expected range will produce some random garbage

mitschabaude commented 1 year ago

However, I think I could just do a few more lanewise gt / lt instructions to check the bounds, and then error if any of them are true. Shouldn't add much overhead

jerch commented 1 year ago

Yeah, you prolly can simply aggregate error conditions in another v128 variable, which only needs to be eval'ed once at the end. Would be enough for a good vs. bad data distinction (unless you want to return the position of the first faulty byte).

I dont recommend to introduce branching in the data loop (means early error exit), this is really a perf killer in SIMD realms.

jerch commented 1 year ago

Hmm I get almost no speed benefit with SIMD + error check compared to a uint32 LUT variant (its ~1.5 GB/s SIMD vs ~1.4 GB/s LUT). Still investigating whether SIMD can be further optimized (and I cannot go w'o error check, as the data comes untested from outside).

jerch commented 1 year ago

Update: With Mulas pshufb variant I get it to >2.5 GB/s (correct error detection still untested). All other variants are much worse. The higher bit variants by Lemire are not applicable due to a lack of 256/512 SIMD.

jerch commented 1 year ago

Here is my adoption of Mulas ideas with error checking: https://github.com/jerch/xterm-addon-image/blob/de66af18e1b06e0d244ae3632c4d308950c69b15/src/base64.wasm.ts#L102 (Its commented out, since I wont use it until Safari gets wasm-simd sorted out.)