rwcarlsen / goexif

Decode embedded EXIF meta data from image files.
BSD 2-Clause "Simplified" License

Implement a lazy tiff loader #60

Closed tjamet closed 5 years ago

tjamet commented 6 years ago

Implement a lazy tiff loader

When working with TIFF-based files such as raw files, loading the whole file takes a significant amount of time. We do not need to load the whole picture frame into memory to access EXIF values, and we do not even need to load all EXIF values when we only need a single EXIF field.

To improve TIFF loading speed, read from the source file opportunistically, only when the data is needed.

This is a significant change to the interface, as it requires passing a ReaderAt instead of a Reader.

To avoid breaking backward compatibility, a separate LazyDecode function has been added, so that Decode keeps the same interface, including filling every Tag.

There are 2 places where the backward compatibility is broken:

Benchmark results (on SSD):

EXIF parsing

goos: darwin
goarch: amd64
pkg: github.com/rwcarlsen/goexif/exif
BenchmarkDecode-8                 200000             89622 ns/op          133531 B/op        236 allocs/op
BenchmarkDecodeRaw-8               30000            486907 ns/op          341765 B/op       8553 allocs/op
BenchmarkLazyDecode-8             200000             83383 ns/op           31659 B/op        151 allocs/op
BenchmarkLazyDecodeRaw-8          200000             81325 ns/op           25051 B/op        164 allocs/op
PASS
ok      github.com/rwcarlsen/goexif/exif        72.727s

Before update, the benchmark results for exif parsing were:

goos: darwin
goarch: amd64
pkg: github.com/rwcarlsen/goexif/exif
BenchmarkDecode-8         100000            170564 ns/op          342413 B/op        961 allocs/op
BenchmarkDecodeRaw-8         100         112034369 ns/op        750784604 B/op      9244 allocs/op
PASS
ok      github.com/rwcarlsen/goexif/exif        34.500s

TIFF decoding

goos: darwin
goarch: amd64
pkg: github.com/rwcarlsen/goexif/tiff
BenchmarkDecode-8                    300          46751365 ns/op        268452505 B/op       113 allocs/op
BenchmarkLazyDecode-8             300000             43886 ns/op           10645 B/op         82 allocs/op
PASS
ok      github.com/rwcarlsen/goexif/tiff        32.337s

tjamet commented 6 years ago

I have changed the implementation so that the interface is kept, and added a test to ensure that, after Decode, the Raw field has a valid TIFF structure.

rwcarlsen commented 6 years ago

Thanks! I'll try and take a look sometime this week.