felixc / rexiv2

Rust library for read/write access to media-file metadata (Exif, XMP, and IPTC)
GNU General Public License v3.0
79 stars 17 forks source link

Segmentation fault from get_tag_multiple_strings #27

Closed eigan closed 6 years ago

eigan commented 6 years ago

I am learning Rust, so I might be doing something wrong.

libgexiv2: 0.10.7
exiv2: 0.26-2
rust: 1.23.0
let file2 = "image.JPG";
let tag = "Exif.Image.DateTime";
let meta = rexiv2::Metadata::new_from_path(&file2).unwrap();
println!("{:?}", meta.get_tag_multiple_strings(tag));  // Line 16
Fresh libc v0.2.36
Fresh num-traits v0.1.42
Fresh regex-syntax v0.4.2
Fresh utf8-ranges v1.0.0
Fresh lazy_static v1.0.0
Fresh pkg-config v0.3.9
Fresh void v1.0.2
Fresh memchr v2.0.1
Fresh num-integer v0.1.35
Fresh unreachable v1.0.0
Fresh aho-corasick v0.6.4
Fresh num-rational v0.1.40
Fresh thread_local v0.3.5
Fresh gexiv2-sys v0.7.0
Fresh regex v0.2.5
Fresh rexiv2 v0.5.0
Fresh mediasort-rs v0.1.0 (file:///home/einar/projects/eigan/mediasort-rs)

Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
Running `target/debug/mediasort-rs`
[1]    27236 segmentation fault (core dumped)  cargo run -v
Starting program: /home/einar/projects/eigan/mediasort-rs/target/debug/mediasort-rs 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00005555555826b1 in rexiv2::Metadata::get_tag_multiple_strings (self=0x7fffffffd618, tag="Exif.Image.DateTime")
    at /home/einar/.cargo/registry/src/github.com-1ecc6299db9ec823/rexiv2-0.5.0/src/lib.rs:551
551             while !(*c_vals.offset(cur_offset)).is_null() {
(gdb) backtrace
#0  0x00005555555826b1 in rexiv2::Metadata::get_tag_multiple_strings (self=0x7fffffffd618, tag="Exif.Image.DateTime")
    at /home/einar/.cargo/registry/src/github.com-1ecc6299db9ec823/rexiv2-0.5.0/src/lib.rs:551
#1  0x000055555556fb09 in mediasort_rs::main () at src/main.rs:16

image

felixc commented 6 years ago

Good catch! Thank you for reporting this. It's definitely a bug in the library and not something you're doing wrong. (And thank you for the perfect bug report with a stack trace and repro example!)

It looks like the code handles empty result arrays fine, but doesn't handle null results where there is no array at all. The specific code where the problem lives is:

https://github.com/felixc/rexiv2/blob/d3988f7b6a2e10ddd5ac99d32855e54a5d17425d/src/lib.rs#L548-L563

The problem is that c_tags is not confirmed to not be null before being dereferenced. Adding something like this should take care of it:

if c_tags.is_null() {
    return Ok(tags);
}

Or possibly you might find it cleaner to update the loop conditional to

while !c_tags.is_null() && !(*c_tags.offset(cur_offset)).is_null() {

...but I don't know if the compiler is smart enough to not actually check that condition every loop, so maybe it's not optimal.

The same pattern appears in a few other places, e.g. lines 416, 449 and 480.

If you'd like to submit a patch, that's probably the fastest way to get you a fix immediately, and I'll cut a new release as soon as I can — otherwise I'll try to get a fixed version out this weekend.

eigan commented 6 years ago

Looks like what I wanted was to use the method get_tag_string, which gives the expected result.

I applied your patch and it no longer crashes, but the value is Ok([]), not sure if this is correct as the value from get_tag_string is Ok("2016:01:10 13:27:20")

Edit: probably correct

felixc commented 6 years ago

Yup; I think the root cause of the confusion is that get_tag_multiple_strings is meant to be used for metadata that represents a "collection" (e.g. Xmp.dc.creator, which is of type "XmpSeq").

The "Exif.Image.DateTime" used in this example isn't a sequence, but a single value — so it is correct that get_tag_multiple_strings returns nothing but get_tag_string does return a value. Or at least, it's using the gexiv2 API correctly, but maybe the API should be rethought to be less confusing.