Open lemire opened 7 months ago
I was thinking a little about the API. My general proposal is to provide a convenient wrapper that works incrementally: the user provides partial data (such as an input buffer filled while reading from a file) and an output buffer of fixed size. Decoding would then look something like this:
Base64Decoder decoder;  // proposed wrapper; this class does not exist yet
std::string input;
input.resize(32 * 1024);
std::string output;
output.resize(16 * 1024);
while (/* more input */) {
    // read a few kilobytes of data into `input`; bytes_read is the number of bytes actually read
    const size_t bytes_stored = decoder.decode(input.data(), bytes_read, output.data(), output.size());
    // bytes_stored will never be greater than output.size()
    write(output.data(), bytes_stored);
    if (/* input file reached EOF */) {
        // drain whatever decoded data the decoder still holds
        while (decoder.pending_output()) {
            const size_t bytes_flushed = decoder.flush(output.data(), output.size());
            write(output.data(), bytes_flushed);
        }
    }
}
Of course this flexibility comes at the cost of performance, but my gut feeling is that if somebody wants to process data in chunks, then the problem is likely I/O bound.
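To make the proposal concrete, here is a minimal scalar sketch of how such a decoder could behave. Everything in it is hypothetical: the class `Base64Decoder`, its methods, and the helper `decode_in_chunks` are illustrations of the idea, not simdutf code, and the sketch assumes the standard alphabet and silently skips characters it does not recognize. Decoded bytes that do not fit in the caller's buffer are queued internally, which is what lets `decode` honour the fixed output size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical incremental decoder (illustration only, not part of simdutf).
class Base64Decoder {
public:
    // Consumes all of `in`, stores at most `out_cap` decoded bytes in `out`,
    // and returns the number stored. Excess output stays queued internally.
    std::size_t decode(const char* in, std::size_t in_len,
                       char* out, std::size_t out_cap) {
        for (std::size_t i = 0; i < in_len; i++) {
            if (in[i] == '=') { bits_ = 0; continue; }  // padding: drop leftover bits
            const int v = value_of(in[i]);
            if (v < 0) continue;  // sketch: skip whitespace and invalid characters
            acc_ = (acc_ << 6) | static_cast<std::uint32_t>(v);
            bits_ += 6;
            if (bits_ >= 8) {  // a full byte is available
                bits_ -= 8;
                pending_.push_back(static_cast<char>((acc_ >> bits_) & 0xFF));
            }
        }
        return flush(out, out_cap);
    }

    // True while decoded bytes are still waiting to be handed out.
    bool pending_output() const { return !pending_.empty(); }

    // Moves up to `out_cap` queued bytes into `out` and returns the count.
    std::size_t flush(char* out, std::size_t out_cap) {
        const std::size_t n = std::min(out_cap, pending_.size());
        if (n != 0) std::memcpy(out, pending_.data(), n);
        pending_.erase(0, n);
        return n;
    }

private:
    static int value_of(char c) {
        if (c >= 'A' && c <= 'Z') return c - 'A';
        if (c >= 'a' && c <= 'z') return c - 'a' + 26;
        if (c >= '0' && c <= '9') return c - '0' + 52;
        if (c == '+') return 62;
        if (c == '/') return 63;
        return -1;
    }

    std::uint32_t acc_ = 0;  // bit accumulator; only the low `bits_` bits are meaningful
    int bits_ = 0;           // number of not-yet-emitted bits in `acc_`
    std::string pending_;    // decoded bytes that did not fit in the caller's buffer
};

// Hypothetical driver mirroring the loop from the proposal.
// Requires chunk > 0 and out_cap > 0.
std::string decode_in_chunks(const std::string& src, std::size_t chunk,
                             std::size_t out_cap) {
    Base64Decoder decoder;
    std::string output(out_cap, '\0');
    std::string result;
    for (std::size_t pos = 0; pos < src.size(); pos += chunk) {
        const std::size_t len = std::min(chunk, src.size() - pos);
        const std::size_t n =
            decoder.decode(src.data() + pos, len, &output[0], output.size());
        assert(n <= output.size());
        result.append(output, 0, n);
    }
    while (decoder.pending_output()) {  // input exhausted: drain the queue
        const std::size_t n = decoder.flush(&output[0], output.size());
        result.append(output, 0, n);
    }
    return result;
}
```

The internal queue is the simplest way to satisfy the "never more than output.size()" guarantee; a real implementation would more likely stop consuming input once the output is full and report how much input was used, which avoids unbounded buffering.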
Another thing for base64 encoding: it would be practical if we allowed wrapping the output, for instance:
const size_t max_line_length = 72;
const char* separator = "\n";
encode(input, output, max_line_length, separator);
Again, nobody would expect that this variant will be as fast as the plain encoding.
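For illustration, the wrapping variant could be as simple as the sketch below. The names `encode_base64` and `encode_wrapped` are hypothetical and the encoder is a plain scalar one with the standard alphabet, standing in for whatever the library would use. It assumes the separator goes between lines only (none after the last line) and that a `max_line_length` of 0 means "do not wrap".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Plain scalar base64 encoder with '=' padding (stand-in for the fast path).
std::string encode_base64(const std::string& in) {
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    auto byte_at = [&in](std::size_t i) {
        return static_cast<std::uint32_t>(static_cast<unsigned char>(in[i]));
    };
    std::string out;
    out.reserve((in.size() + 2) / 3 * 4);
    std::size_t i = 0;
    for (; i + 3 <= in.size(); i += 3) {  // full 3-byte groups
        const std::uint32_t v =
            (byte_at(i) << 16) | (byte_at(i + 1) << 8) | byte_at(i + 2);
        out += table[v >> 18];
        out += table[(v >> 12) & 63];
        out += table[(v >> 6) & 63];
        out += table[v & 63];
    }
    const std::size_t rest = in.size() - i;
    if (rest == 1) {  // one trailing byte: two characters plus "=="
        const std::uint32_t v = byte_at(i) << 16;
        out += table[v >> 18];
        out += table[(v >> 12) & 63];
        out += "==";
    } else if (rest == 2) {  // two trailing bytes: three characters plus "="
        const std::uint32_t v = (byte_at(i) << 16) | (byte_at(i + 1) << 8);
        out += table[v >> 18];
        out += table[(v >> 12) & 63];
        out += table[(v >> 6) & 63];
        out += '=';
    }
    return out;
}

// Hypothetical wrapping variant: inserts `separator` between lines of at most
// `max_line_length` characters. A length of 0 disables wrapping.
std::string encode_wrapped(const std::string& in, std::size_t max_line_length,
                           const char* separator) {
    assert(separator != nullptr);
    const std::string plain = encode_base64(in);
    if (max_line_length == 0) return plain;
    std::string out;
    for (std::size_t pos = 0; pos < plain.size(); pos += max_line_length) {
        if (pos != 0) out += separator;
        out.append(plain, pos, max_line_length);  // length is clamped at the end
    }
    return out;
}
```

Encoding first and wrapping afterwards costs an extra copy, which fits the expectation that this variant is not the fast one; writing the separators while encoding would avoid the copy at the price of a more complicated inner loop.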
@WojciechMula I'm pinging you later today as I have a major upgrade to the base64 support, with a slightly improved API.
Please see https://github.com/simdutf/simdutf/pull/382 where the base64 API was slightly extended (i.e., we have _safe functions).
We should also provide base64_to_binary(const char input) -> std::vector, which calculates the safe size and allocates memory internally. Or maybe something like base_to_binary(const char input, cont: &Container) with a static_assert that the Container has a resize method. (credit: @WojciechMula)
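A rough sketch of the container-based variant, to show what the static_assert could look like. All names here (`has_resize`, `base64_value`, `base64_to_container`) are hypothetical, and the scalar decoding loop is only a stand-in for the library's real decoder. The sketch assumes the standard alphabet, treats the first '=' as the end of the payload, and rejects any other unknown character.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Detects whether C has a resize(std::size_t) member (C++11 detection idiom).
template <typename C, typename = void>
struct has_resize : std::false_type {};
template <typename C>
struct has_resize<C, decltype(void(std::declval<C&>().resize(std::size_t{})))>
    : std::true_type {};

inline int base64_value(char c) {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

// Hypothetical convenience wrapper: sizes `out` to a safe upper bound, decodes
// into it, then shrinks it to the number of bytes actually written.
// Returns false (and leaves `out` empty) on an invalid character.
template <typename Container>
bool base64_to_container(const char* input, std::size_t length, Container& out) {
    static_assert(has_resize<Container>::value,
                  "Container must provide resize(size_t)");
    static_assert(sizeof(typename Container::value_type) == 1,
                  "Container must hold byte-sized elements");
    using value_type = typename Container::value_type;

    out.resize((length + 3) / 4 * 3);  // every 4 characters yield at most 3 bytes
    std::size_t written = 0;
    std::uint32_t acc = 0;  // bit accumulator
    int bits = 0;           // number of pending bits in `acc`
    for (std::size_t i = 0; i < length; i++) {
        if (input[i] == '=') break;  // padding ends the payload
        const int v = base64_value(input[i]);
        if (v < 0) {
            out.resize(0);
            return false;
        }
        acc = (acc << 6) | static_cast<std::uint32_t>(v);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out[written++] = static_cast<value_type>((acc >> bits) & 0xFF);
        }
    }
    assert(written <= out.size());
    out.resize(written);
    return true;
}
```

With this shape the same function works for std::vector<char>, std::vector<uint8_t> and std::string, and passing something without resize (a raw array, say) fails at compile time with a readable message instead of a template error deep inside the decoder.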