Remove dependence on 64-bit integers (for ANSI-C Compilers without a 64 bit integer type)

alexriegler12 commented 1 year ago

The current implementation is dependent on 64-bit integers. This does not work on many (old) ANSI-C compilers that don't have a 64-bit integer type such as long long. I have seen that the format itself does not need 64-bit integers at all. Is it possible to make the implementation independent of 64-bit ints and therefore much more compatible with a wide range of systems?

mattdesl commented 1 year ago

This would also be nice for the JS encoder. My current implementation is probably going to have the same bug as #13 for some audio inputs, since it doesn't use native BigInt. I could probably change this but it will likely slow it down and introduce different JS engine/browser requirements.

phoboslab commented 1 year ago

Is it possible to make the implementation independent of 64-bit ints

It's certainly possible, but I have no interest to do so here. One of the main goals for this implementation is to remain readable.

This would also be nice for the JS encoder.

How is your implementation dependent on this one?

Regarding https://github.com/phoboslab/qoa/issues/13 - no, your JS implementation does not have this issue. JS' Number type will automatically use a double precision float internally when it would overflow a 32 bit int.

alexriegler12 commented 1 year ago

Is it possible to make the implementation independent of 64-bit ints

It's certainly possible, but I have no interest to do so here. One of the main goals for this implementation is to remain readable.

This would also be nice for the JS encoder.

How is your implementation dependent on this one?

Regarding #13 - no, your JS implementation does not have this issue. JS' Number type will automatically use a double precision float internally when it would overflow a 32 bit int.

It would be possible to use a preprocesor define like QOA_USE_32 (for example, Lua does it that way) to be able to switch to a second Implementation that does Not Need 64bit ints. This implementation would need to cobble the 3-bit value that is Split between the two 32-bit values together by bit-shifting. The original int64 dependent Implementation would remain readable, as you said. but the library would be universally compatible without needing C99 int64 types or implementation-defined 64-Bit types (e.g. Win32 __int64) , in the true ANSI C manner. It would be so much more elegant.

BrettRToomey commented 1 year ago

@alexriegler12 why not just fork the project and have your 32bit implementation?

shuckster commented 1 year ago

The original int64 dependent Implementation would remain readable, as you said.

This implementation would need to cobble the 3-bit value that is Split between the two 32-bit values together by bit-shifting

It would be so much more elegant.

Just wondering how you're reconciling these points? From what I understand an ANSI C version would be more compatible, but also more difficult to read.

I don't think there's any argument that a more portable version of the project would be, well, more portable.

But if a goal of this particular project is also readability, I'm not sure how this is being honoured when preprocessors are being introduced?

alexriegler12 commented 1 year ago

The original int64 dependent Implementation would remain readable, as you said.

This implementation would need to cobble the 3-bit value that is Split between the two 32-bit values together by bit-shifting

It would be so much more elegant.

Just wondering how you're reconciling these points? From what I understand an ANSI C version would be more compatible, but also more difficult to read.

I don't think there's any argument that a more portable version of the project would be, well, more portable.

But if a goal of this particular project is also readability, I'm not sure how this is being honoured when preprocessors are being introduced?

This implementation already uses many preprocessor defines. I just thought it would be good if this implementation was as portable as the QOI library, which i have tested on consoles and MCUs. If there really is no one who wants to implement a 32-bit-only version, i could eventually do this myself, but i am not very good at editing code of other people.

kleinesfilmroellchen commented 1 year ago

If there really is no one who wants to implement a 32-bit-only version, i could eventually do this myself, but i am not very good at editing code of other people.

I don't see the problem here. While it is convenient to do 64-bit reads to load QOA, it is not at all required. My own implementation (SerenityOS's QOALoader) uses a mixture of 64-bit reads where manual unpacking is practical (such as for slices), and smaller-sized reads for anything where that's not a concern.

From experience, I can say that loading QOA with your own implementation is rather simple once you get some of the fine math details right. I suggest writing your own implementation that serves your needs, and for microcontrollers and old C compilers you will have an easy time forking the reference implementation.

alexriegler12 commented 1 year ago

How can I implement a 64bit-less version that is still fast? I would have to decode the slices dfferently (one residual value is split between two 32-bit values), and i think that would be slow.

chocolate42 commented 1 year ago

It should barely be slower, at least not because it's split. r9, the 10th residual, would be split and can be combined with ((a&1)<<2)|(b>>30), where a and b are the first and second half of a slice respectively.

One way to implement it would be to modify qoa_read_u64 into something like this

typedef unsigned long qoa_uint32_t;

static inline void qoa_read_u64(qoa_uint32_t *ret, const unsigned char *bytes, unsigned int *p)

Where ret can hold two values and is where the return value is stored. You'd then have to modify every line that extracts data from values returned by read_u64 to use the 32 bit returns. Then do something similar for write_u64. There's probably a cleaner way to do it, this is a naive 1:1 mapping (edit: and the details may be wrong, untested).

alexriegler12 commented 1 year ago

I think i am able to implement a 32-bit only frame Decoder (it's less than 100 lines in the reference implementation), but I fear it would be slow due to the needed branch statements in the slice decoding process (if the residual is in the first 32 bits, if it's split or if it is in the tail end 32 bits). Is there a faster way ?

chocolate42 commented 1 year ago

Where the reference does it in a single loop:

for (int si = slice_start; si < slice_end; si += channels) {
    int predicted = qoa_lms_predict(&qoa->lms[c]);
    int quantized = (slice >> 57) & 0x7;
    int dequantized = qoa_dequant_tab[scalefactor][quantized];
    int reconstructed = qoa_clamp(predicted + dequantized, -32768, 32767);

    sample_data[si] = reconstructed;
    slice <<= 3;

    qoa_lms_update(&qoa->lms[c], reconstructed, dequantized);
}

You'll instead need to loop over the first 9 samples, then handle the 10th manually, then loop over the last 10 samples. No branching.

pfusik commented 1 year ago

My alternative implementation of the decoder uses no 64-bit integers and no floating-point.

It would be much harder to do in the encoder because it performs 64-bit arithmetic.

phoboslab commented 1 year ago

It would be much harder to do in the encoder because it performs 64-bit arithmetic.

You mean the current_error += error * error; computation and comparison with best_error? This could be done with floats or lower precision integers. It does not have to be bit exact with what this encoder here produces, as long as it results in a valid QOA file.

pfusik commented 1 year ago

I'm happy with simple code that works well on all modern hardware.

I don't quite understand what's the goal of this ticket? What exact pre-C99 compiler is to be supported? Does it have floats? Is it C89 or K&R C?

phoboslab / qoa

Remove dependence on 64-bit integers (for ANSI-C Compilers without a 64 bit integer type) #15