Closed vogelfr closed 9 months ago
The TIFF spec (EXIF data is actually a TIFF blob) describes the ASCII field type as:
8-bit byte that contains a 7-bit ASCII code; the last byte must be NUL (binary zero)
So it actually seems "illegal" to use the 8th bit, but exifr is very
lenient (lazy) and simply allows the full 8 bits, so you can use
#force_encoding('UTF-8')
to fix up the string values.
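To illustrate what that call does, here is a minimal sketch in plain Ruby (no exifr involved). It assumes the value arrives as a binary-tagged string, as the error message quoted later in this thread suggests, and it reuses the "Tromsø" bytes from the original report. The variable names are made up for the example.

```ruby
# Simulate a value whose bytes are UTF-8 but whose encoding tag is binary.
raw = "Troms\xC3\xB8".b
puts raw.encoding == Encoding::BINARY   # the string is tagged as binary

# force_encoding only changes the tag; the bytes themselves are untouched.
fixed = raw.dup.force_encoding('UTF-8')
puts fixed.bytes == raw.bytes
puts fixed.valid_encoding?
puts fixed
```

Note that force_encoding does not convert anything. It only helps when the bytes really are UTF-8 already.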
Hi, I also ran into this problem while writing EXIF data into a Jekyll Liquid tag. An EXIF comment containing characters such as [ɛ ơ ʉ ə ɲ] throws the error:
Liquid Exception: incompatible character encodings: ASCII-8BIT and UTF-8
Could you please elaborate a little on where I could add or set this option
#force_encoding('UTF-8')
in exifr?
Thanks
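For reference, the error above can be reproduced without Jekyll or exifr. The sketch below assumes a UTF-8 source file and uses invented variable names. It joins a UTF-8 string with a binary-tagged one, which is presumably what happens when Liquid inserts the EXIF value into the page.

```ruby
template = "Légende: "   # UTF-8 text containing a non-ASCII character
comment  = "ɛ ơ".b       # same kind of bytes as the EXIF comment, tagged binary

error = nil
begin
  template + comment     # mixing the two encodings raises
rescue Encoding::CompatibilityError => e
  error = e
end
puts error.class

# Re-tagging the binary string as UTF-8 makes the concatenation work.
joined = template + comment.dup.force_encoding('UTF-8')
puts joined
```

If both strings contained only ASCII characters, Ruby would join them without complaint, which is why the problem only shows up with characters like the ones listed above.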
I assume you have found the exiftag tool that uses exifr in the background. In the jekyll-exiftag.rb I added the following:
begin
  exif = EXIFR::JPEG::new(file_name)
  ret = tag.split('.').inject(exif){|o,m| o.send(m)}
  if ret.is_a? String # <----- FROM HERE
    ret.force_encoding('UTF-8')
  end # <----- TO HERE
  return ret
This will ensure that for any kind of String returned by exifr the result will be read as UTF-8. I did still have some weird issue with a specific character sequence (\xC3\xB8) but otherwise it works nicely.
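If some files carry bytes that are not valid UTF-8 (for example Latin-1 text), re-tagging alone leaves an invalid string. The following is a sketch of a slightly more defensive variant, not a diagnosis of the issue mentioned above. The helper name to_utf8 is hypothetical and is not part of exifr or jekyll-exiftag.

```ruby
# Hypothetical helper: re-tag strings as UTF-8 and replace any bytes
# that do not form valid UTF-8 with "?". Non-strings pass through.
def to_utf8(value)
  return value unless value.is_a?(String)

  text = value.dup.force_encoding('UTF-8')
  text.valid_encoding? ? text : text.scrub('?')
end

puts to_utf8("Troms\xC3\xB8".b)  # valid UTF-8 bytes are kept as they are
puts to_utf8("Troms\xF8".b)      # a lone Latin-1 byte gets replaced
puts to_utf8(42)                 # non-string values are returned unchanged
```

Replacing invalid bytes loses information, so converting from the real source encoding would be preferable when it is known.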
Yes, I installed it with gem install... The file looks like this:
  begin
    exif = EXIFR::JPEG::new(file_name)
    return tag.split('.').inject(exif){|o,m| o.send(m)}
  rescue
    ""
  end
end
/var/lib/gems/3.1.0/gems/jekyll-exiftag-0.1.0/lib/jekyll-exiftag.rb
Making the changes throws errors. Perhaps you could attach yours?
Sorry, the entire block looks like this:
# try it and return empty string on failure
begin
  exif = EXIFR::JPEG::new(file_name)
  ret = tag.split('.').inject(exif){|o,m| o.send(m)}
  if ret.is_a? String
    ret.force_encoding('UTF-8')
  end
  return ret
rescue StandardError => e
  puts e.message
  "" # puts returns nil, so return the empty string explicitly
end
File is here.
Got it! No errors thrown with those characters. Good one!
Hi,
When the EXIF data contains non-ASCII characters (e.g. 'é' or 'ø'), they are output incorrectly:
image_description = "Troms\xC3\xB8"
instead of
image_description = "Tromsø"
Is there any way to make it output non-ASCII characters correctly?
Many thanks :)