Kl4rry / simp

🖼️ Simp is a fast and simple GPU-accelerated image manipulation program.
Apache License 2.0
299 stars 11 forks source link

These 2 pictures do no load in simp (but they load fine is MS Paint) #15

Closed olivier-fs closed 2 years ago

olivier-fs commented 2 years ago

Simp is great, and can load almost all my mp3 album covers ; but there are still some rough corners : These 2 pictures (cd covers, JPEG, 500x500, ~=150-200Kb) do not load in simp.

1st one : Error MessageBox https://ibb.co/V39PWPC 2nd one : Crashes the program https://ibb.co/GPdKc8N

If I find other examples of covers that trigger failures/crashes I'll report those too, so you can polish this nice viewer.

Kl4rry commented 2 years ago

Error message of front2.jpeg

PanicInfo { payload: Any { .. }, message: Some(byte index 20 is not a char boundary; it is inside '�' (bytes 18..21) of `��������� �������� ��������� ����������� �������� ACD Systems`), location: Location { file: "library\\core\\src\\str\\mod.rs", line: 127, col: 5 }, can_unwind: true }
Kl4rry commented 2 years ago

The front2.jpeg error is related to exif metadata parsing.

Kl4rry commented 2 years ago

front1.jpg might not be a valid jpg but I am not sure. I think jpgs should end with hex FF D9 but this one ends with FA 9D. The error message is IoError(Error { kind: UnexpectedEof, message: "failed to fill whole buffer" }). I think that means that the decoder cannot find the end of the image.

olivier-fs commented 2 years ago

Yeah invalid input data... or worse : malicious input data (there were some terrible CVEs on Windows regarding malicious jpeg/pngs/fonts). Sorry I know nothing about (the many) JPEG format(s), but I understand your reply : If one of the utility crates (rexif, but the same would apply to say the image crate) returns an Err("invalid format/file"), and it is actually not a bug in this crate, then what ?

At least the error message for the first image proves you dutifully handled the Result<> ;-)

But for the 2nd image simp crashes... unwrap() here ?

Other apps :

I'm still very happy with simp, but my point is : oxydizing the desktop with rust apps requires topnotch usability, i.e. new apps being at least as nice to end users than the apps they are supposed to replace. Ok that easier to say it than to code it, and again: I know nothing about graphics/images, But I'm facing the same kind of problems when I replace old ASP/ASP.NET or apache/nginx+PHP code with axum/tonic + sqlx : nobody cares about the nice technology stack if I make "errors" "appear" where it used to be "valid" content, even if "valid" was not really what it is supposed to mean. Anyway feel free to close the issue, or handle it as low priority, as it is not a stopper.

olivier-fs commented 2 years ago

I invastigated a bit on the crash with the 2nd picture.

load_image.rs / load_uncached()

            if ExifTag::UnknownToMe != entry.tag && !HIDDEN_TAGS.contains(&entry.tag) {
                let text = entry.value_more_readable;

                //--------------------------------
                println!("----");
                use rexif::TagValue::*;
                use utf16string::{LE, WStr};
                let text: String = match &entry.value {
                    U8(v) => {
                        println!("Vec<U8>: {v:?}");
                        match std::str::from_utf8(text.as_bytes()) {
                            Ok(t) => t.to_string(),
                            Err(e) => {
                                println!("Invalid UTF8 : {e}");
                                "Invalid UTF8".to_string()
                            }
                        }
                    },
                    U16(v) => {
                        println!("Vec<U16>: {v:?}");
                        // TODO **** BE (Big Endian) (crates byte_order, encoding) ****
                        match WStr::<LE>::from_utf16(text.as_bytes()) {
                            Ok(w) => w.to_string(),
                            Err(e) => {
                                println!("Invalid UTF16 : {e}");
                                "Invalid UTF16".to_string()
                            }
                        }
                    },
                    U32(v) => {
                        println!("U32: {v:?}");
                        // TODO **** U32 validation ****
                        entry.value.to_string()
                    },
                    Ascii(v) => {
                        println!("Ascii: {v:?}");
                        if v.is_ascii() { v.to_string() } else { "Invalid ASCII".to_string() }
                    },
                    // TODO **** Other TagValue validation ****
                    _ => { println!("Other"); "Other".to_string() },
                };
                println!("{text}");
                //--------------------------------

                let short = if text.len() > 20 { &text[..20] } else { &text };
                metadata.push((entry.tag.to_string(), short.to_string()));
            }

The output is :

----
Vec<U16>: [1]
þæôµà▓µØ®þæ¿
----
Ascii: "´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢ ´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢ ´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢ ´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢ ´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢´┐¢ ACD Systems"
Invalid ASCII
----
Ascii: "2011:03:08 15:22:05"
2011:03:08 15:22:05
----
U32: [156]
156

So the data purporting to be ASCII is not ASCII, and taking a slice (&text[..20]) panics.

Output shows there are U16 (&Vec) / U32 (&Vec)...

Quick and dirty 1st version, using String. Could be optimized to use &str instead.

With thes checks the image loads and is displayed fine i.e as in Paint/GIMP/etc.

Will improve the code for checks later when I have time.

olivier-fs commented 2 years ago

Ok I think I nailed it. The problem is not with invalid ASCII, because it's still valid UTF8. It's related to slicing at hardcoded offset :

let short = if text.len() > e { &text[..e] } else { &text };

This works for all 1 byte charset strings (like ASCII), and many UTF8 strings from western languages because they only contain 1 byte chars, so 20 bytes means 20 chars. But for strings containing variable length 1/2/3/4 characters :

With front.2.jpg the character goes from offset 19 to 21, so slicing at 20 will make rust panic. 20 chars in this EXIF tag value is 56 bytes.

Proposed fix :

let offset_c20 = text.to_string().byte_index_from_char_index(20);
let short = if text.len() > offset_c20 { &text[..offset_c20] } else { &text };
Kl4rry commented 2 years ago

Oh the exif one was just me being lazy. It should use the unicode-segmentation crate to grab the first 20 glyphs instead of doing a string slice.

Kl4rry commented 2 years ago

The front1.jpg is harder to solve. I think it may be a "issue" in the image-jpeg crate. I think the decoder might be a little bit more strict with what it considers a valid jpg.

Kl4rry commented 2 years ago

Upstream issue related to front1.jpg deoding issue image-rs/jpeg-decoder#262. Exif error fixed in 3e9d8b4ef44bdd71b3d2f71f00afe7f4750eb4a6.