barasher / go-exiftool

Golang wrapper for Exiftool : extract as much metadata as possible (EXIF, ...) from files (pictures, pdf, office documents, ...)
GNU General Public License v3.0

Error when scanning files with special chars in filename #51

Closed Blesmol closed 2 years ago

Blesmol commented 2 years ago

Hi,

I'm currently using go-exiftool on Windows, and everything works fine so far, except when calling ExtractMetadata() on files whose path includes special characters, e.g. foö.jpg.

error during unmarshaling (Error: File not found - test/foö.jpg
): invalid character 'E' looking for beginning of value)

The same file works fine when using exiftool.pl directly on the command line. I'm not sure whether the problem is limited to Windows or would also appear on Mac or Linux...

Best regards, Philipp
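For readers puzzled by the second half of that message: "invalid character 'E' looking for beginning of value" is what Go's encoding/json reports when it is handed plain text instead of JSON. The sketch below reproduces that with the standard library only. It assumes the wrapper feeds exiftool's output to json.Unmarshal; parseOutput is a hypothetical helper for illustration, not go-exiftool's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseOutput mimics a wrapper that expects a JSON array on exiftool's
// output. Hypothetical helper, not part of go-exiftool.
func parseOutput(raw []byte) ([]map[string]interface{}, error) {
	var out []map[string]interface{}
	if err := json.Unmarshal(raw, &out); err != nil {
		// Wrap the decoder error together with the raw text that was received.
		return nil, fmt.Errorf("error during unmarshaling (%s): %w", raw, err)
	}
	return out, nil
}

func main() {
	// Plain-text error as quoted in the report: not JSON, so decoding fails
	// on the very first byte ('E').
	_, err := parseOutput([]byte("Error: File not found - test/foö.jpg\n"))
	fmt.Println(err)

	// A well-formed JSON payload decodes without error.
	ok, err := parseOutput([]byte(`[{"SourceFile":"test/foo.jpg"}]`))
	fmt.Println(ok[0]["SourceFile"], err)
}
```

So the JSON error is only a symptom. The real failure is the "File not found" text, which means exiftool did not resolve the path it was given.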

barasher commented 2 years ago

Hi,

I've just tried on Linux and everything seems OK:

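package exiftool

import (
    "testing"

    "github.com/stretchr/testify/assert"
)

func TestDebug(t *testing.T) {
    e, err := NewExiftool()
    assert.Nil(t, err)
    defer e.Close()
    fm := e.ExtractMetadata("testdata/foö.jpg") // path contains a non-ASCII character
    assert.Equal(t, 1, len(fm))
    assert.Nil(t, fm[0].Err)
}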
barasher@linux:~/go/src/github.com/barasher/go-exiftool$ go test -run TestDebug -v ./...
=== RUN   TestDebug
--- PASS: TestDebug (0.18s)
PASS
ok      github.com/barasher/go-exiftool 0.186s

Did you try the charset option?

e, err := NewExiftool(Charset("filename=utf8"))

Keep me posted :)

Blesmol commented 2 years ago

Hi,

Ah, that charset option did it! I was not aware of that, thank you so much!

Best regards, Philipp

tauinger-de commented 2 years ago

That helped me a lot, too. Thanks.

Is there any downside to using UTF-8? Is there a reason it's not the default?

barasher commented 1 year ago

Hi, sorry for the delay! I did not make UTF-8 the default charset for backward-compatibility reasons. The first go-exiftool releases used exiftool's default charset value. When I introduced the Charset option, I did not want to risk regressions for the people already using go-exiftool, so I kept the same behaviour: no charset is specified by default.
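To illustrate the backward-compatibility point, here is a minimal sketch of an opt-in option that leaves the default untouched. It uses only the standard library. The names tool, option, charset and newTool are hypothetical stand-ins, and the assumption that the option translates to exiftool's "-charset" argument is mine, not a copy of go-exiftool's internals.

```go
package main

import "fmt"

// tool is a hypothetical stand-in for the wrapper. It only records the
// extra command-line arguments that would be passed to exiftool.
type tool struct {
	extraArgs []string
}

// option mutates a tool during construction.
type option func(*tool)

// charset adds "-charset <value>" only when the caller asks for it
// (assumed mapping, for illustration).
func charset(value string) option {
	return func(t *tool) {
		t.extraArgs = append(t.extraArgs, "-charset", value)
	}
}

// newTool applies the options in order. With no options, no charset
// argument is added, so exiftool keeps using its own default.
func newTool(opts ...option) *tool {
	t := &tool{}
	for _, o := range opts {
		o(t)
	}
	return t
}

func main() {
	// Existing callers: nothing changes, no extra arguments.
	fmt.Println(len(newTool().extraArgs))

	// Callers who opt in get the charset argument.
	fmt.Println(newTool(charset("filename=utf8")).extraArgs)
}
```

With this shape, code written against the early releases behaves exactly as before, and only callers who pass the option see different behaviour.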