API and Error - Githubissues

localthomas commented 3 years ago

This is sort of a question/discussion issue: The current function declarations for compressing and decompressing are missing Error return types or do not specify when they panic in the documentation.

Take the function definition of decompress as an example:

pub fn decompress(self, data: &[u8], width: usize, height: usize, output: &mut [u8])

It is possible to call this function with invalid parameter combinations, e.g. width and height do not match the data length. With the current definition, only panics are supported without any further detail about what's wrong.

I think the ergonomics could be improved by either adding a return value or at least specifying the situations under which a panic occurs in the documentation. I would prefer the first idea, but I am not sure how exactly the return type should look like. Maybe taking inspiration from image could help.

Context: I am currently trying to fuzz the decompress function and get panics everywhere.

jansol commented 3 years ago

I haven't been entirely happy with the API myself either, but haven't had the time to properly think about what I would prefer it to look like. One major issue is that it requires expanding everything to be RGBA and will always decompress to RGBA, which adds unnecessary overhead as noted in the integration discussion over at the image crate. I was hoping to get some feedback from them about what kind of API they would prefer.

localthomas commented 3 years ago

I looked into the current implementation of the DXT module in the image crate and propose to add the following function for decoding:

#[non_exhaustive]
pub enum TodoError {
    /// The input for decoding or encoding did not meet the requirements for the variant.
    InvalidInputSize(usize),
    /// The output buffer for encoding or decoding did not meet the requirements for the variant.
    InvalidOutputSize(usize),
    // ...
}

impl Format {
    fn decompress(
        self,
        data: &[u8],
        width: usize,
        height: usize,
        output: &mut [u8],
    ) -> Result<(), TodoError>;
}

I am still unsure about the Error type, but this is probably the easiest solution.

For integration into the image crate, the current read_scanline could be modified to use the aforementioned function instead (with a height of 4). And maybe squish::Format could be used as a replacement for image::DxtVariant. I would also suggest adding #[non_exhaustive] to squish::Format if a future version adds more formats.

I did not look into the encoding details yet, but it should be very similar.

One major issue is that it requires expanding everything to be RGBA and will always decompress to RGBA

I think this should just be changed to properly output with the correct byte stride for the format. The function decompressed_bytes_per_pixel(self) -> usize (or something similar; e.g. 3 for BC1) should be added to squish::Format. This is the requirement to make the method of integration of above possible. This is possibly the most work, as it requires changing decompress_block and compress_block_masked. It would propose to make these two functions private for any new versions. External users can always use the decompress and compress functions with sizes of 4×4.

With your blessing, my next steps would be to first

Implement the Error type and add it to the decompress function
Add checks for the most obvious invalid parameter combinations
Create a simple fuzzing setup with cargo fuzz for decompression

Then I would try to switch from RGBA only modes to their proper ones (RGB, RGBA, RG, R).

jansol commented 3 years ago

Sure, go ahead. It is probably easiest to evaluate the options by seeing what they would look like in practice anyway.

I also plan to add a Bc1Alpha variant since OpenGL has two separate constants for DXT1 with and without transparency and are probably some gains to be had from being able to use the implicit black value in the palette. I just haven't got to actually doing that optimization in the encoder yet. If you are reworking the channel modes from RGBA-only it might make sense to add that format variant and just stub it out with unimplemented!() for now.

jansol / texpresso

API and Error #3