golang / go

x/image/tiff: sony .arw files decode as a 0x0 image.Gray #33708

Open paolobarbolini opened 5 years ago

paolobarbolini commented 5 years ago

What version of Go are you using (go version)?

$ go version
go version go1.13beta1 windows/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
set GO111MODULE=
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\Paolo\AppData\Local\go-build
set GOENV=C:\Users\Paolo\AppData\Roaming\go\env
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Users\Paolo\go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=c:\go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=c:\go\pkg\tool\windows_amd64
set GCCGO=gccgo
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=C:\Users\Paolo\Desktop\test\go.mod
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\Paolo\AppData\Local\Temp\go-build477435338=/tmp/go-build -gno-record-gcc-switches

What did you do?

I installed the latest golang.org/x/image and tried decoding multiple .ARW files from different camera models; the issue occurs with all of them.

Sample image: _DSC4438.zip

package main

import (
        "fmt"
        "image"
        "os"

        _ "golang.org/x/image/tiff"
)

func main() {
        f, err := os.Open("_DSC4438.ARW")
        if err != nil {
                panic(err)
        }

        img, format, err := image.Decode(f)
        f.Close()
        if err != nil {
                panic(err)
        }

        fmt.Printf("format: %s image: %#v width: %d height: %d", format, img, img.Bounds().Dx(), img.Bounds().Dy())
}

Result of running the code:

format: tiff image: &image.Gray{Pix:[]uint8{}, Stride:0, Rect:image.Rectangle{Min:image.Point{X:0, Y:0}, Max:image.Point{X:0, Y:0}}} width: 0 height: 0

What did you expect to see?

I expected to get an error when trying to decode the Sony .ARW raw image file, since Go doesn't know how to decode it.

What did you see instead?

The image decode function returned no errors, and the image is incorrectly decoded as a 0 pixel image:

&image.Gray{Pix:[]uint8{}, Stride:0, Rect:image.Rectangle{Min:image.Point{X:0, Y:0}, Max:image.Point{X:0, Y:0}}}

The issue seems to have been introduced by https://github.com/golang/image/commit/7e034cad644213bc79b336b52fce73624259aeca since https://github.com/golang/image/commit/92942e4437e2b065806587df0f5d8afa565a8567 instead returns the following error when trying to decode the raw image:

tiff: invalid format: BitsPerSample tag missing

bcmills commented 5 years ago

CC @nigeltao, @bsiegert, @hhrutter

bcmills commented 5 years ago

Possibly related: #11391, #11386.

bsiegert commented 5 years ago

It turns out that .ARW files are TIFF files, in the sense that they use a TIFF container. This is what libtiff says about your file:

$ tiffdump Downloads/_DSC4438.ARW 
Downloads/_DSC4438.ARW:
Magic: 0x4949 <little-endian> Version: 0x2a <ClassicTIFF>
Directory 0: offset 8 (0x8) next 38402 (0x9602)
SubFileType (254) LONG (4) 1<1>
Compression (259) SHORT (3) 1<6>
ImageDescription (270) ASCII (2) 32<                         ...>
Make (271) ASCII (2) 5<SONY\0>
Model (272) ASCII (2) 10<ILCE-6000\0>
Orientation (274) SHORT (3) 1<1>
XResolution (282) RATIONAL (5) 1<350>
YResolution (283) RATIONAL (5) 1<350>
ResolutionUnit (296) SHORT (3) 1<2>
Software (305) ASCII (2) 16<ILCE-6000 v3.20\0>
DateTime (306) ASCII (2) 20<2017:08:14 17:56:03\0>
Whitepoint (318) RATIONAL (5) 2<0.313 0.329>
PrimaryChromaticities (319) RATIONAL (5) 6<0.64 0.33 0.21 0.71 0.15 0.06>
SubIFD (330) LONG (4) 1<141174>
JPEGInterchangeFormat (513) LONG (4) 1<141986>
JPEGInterchangeFormatLength (514) LONG (4) 1<419743>
YCbCrCoefficients (529) RATIONAL (5) 3<0.299 0.587 0.114>
YCbCrPositioning (531) SHORT (3) 1<2>
34665 (0x8769) LONG (4) 1<560>
50341 (0xc4a5) UNDEFINED (7) 106<0x50 0x72 0x69 0x6e 0x74 0x49 0x4d 00 0x30 0x33 0x30 0x30 00 00 0x3 00 0x2 00 0x1 00 00 00 0x3 00 ...>
50740 (0xc634) BYTE (1) 4<0x80 0xb8 00 00>

Directory 1: offset 38402 (0x9602) next 0 (0)
SubFileType (254) LONG (4) 1<1>
Compression (259) SHORT (3) 1<6>
ImageDescription (270) ASCII (2) 32<                         ...>
Make (271) ASCII (2) 5<SONY\0>
Model (272) ASCII (2) 10<ILCE-6000\0>
Orientation (274) SHORT (3) 1<1>
XResolution (282) RATIONAL (5) 1<72>
YResolution (283) RATIONAL (5) 1<72>
ResolutionUnit (296) SHORT (3) 1<2>
Software (305) ASCII (2) 16<ILCE-6000 v3.20\0>
DateTime (306) ASCII (2) 20<2017:08:14 17:56:03\0>
JPEGInterchangeFormat (513) LONG (4) 1<38676>
JPEGInterchangeFormatLength (514) LONG (4) 1<7124>
YCbCrPositioning (531) SHORT (3) 1<2>

Decoding as a zero-pixel grayscale image is probably a mistake. The file contains two images, and for some reason both are JPEG. I thought this was a RAW format? These are probably previews (350dpi and 72dpi) or something similar; the 0xc634 tag (DNGPrivateData) is perhaps a pointer to the actual raw data.

AIUI, the directories have no width and height tags, since the images are supposed to contain a JPEG header, for which they give the offset and size. Unfortunately, they use the JPEG tags in a deprecated way.

I suppose we could try decoding the inner JPEG with the image/jpeg decoder and returning the first image. Would that make sense?

paolobarbolini commented 5 years ago

I think there are a lot of ways this could go, but on reflection the best and simplest option would be to have it error out instead of decoding the file into an empty image.

The only reason I encountered this problem is that I have a Go service which takes images as input and tries to make a thumbnail out of them. It first passes the image to image.Decode; if that fails, it assumes the file might be a RAW file and tries converting it with dcraw -e filename. For .ARW files image.Decode doesn't return an error, so my program never got a chance to try dcraw, and that's how this problem started.

It would be cool if Go supported extracting thumbnails for those TIFF based raw formats, but trying to decode my file in other TIFF decoders resulted in the following error:

TIFF directory is missing required "ImageLength" field.

I'm not an expert in image formats, but this error, together with the fact that, as you said, the JPEG tags are used in a deprecated way, makes me think it would not be worth supporting those edge cases: we would be able to decode some raw formats but would still have to send the rest of them to a raw decoder.

bcmills commented 5 years ago

I would argue that, ideally, either the image should decode into the actual raw-encoded pixels, or attempting to open it should fail with some sort of explicit error.

Unpacking the inner JPEG seems dangerous: it would appear to work, but would (arguably unexpectedly) produce a lossy image when the user is expecting a lossless one.