lachaize / jbrout

Automatically exported from code.google.com/p/jbrout

Corrupted UTF-8 confuses jbrout to no end (aka fixing corrupted UTF-8 ain't be easy) #161

Closed by GoogleCodeExporter 9 years ago

GoogleCodeExporter commented 9 years ago
While trying to fix images with corrupted UTF-8 tags, jbrout with pyexiv2 0.2.2 
got really confused.

I suspect the main damage was done by exiftool. The attached error dialogs from 
exiftool appeared after I ran Delete Tag and tried to remove nonsensical tags 
from the images.

What is the expected output? What do you see instead?
The selected tag should be deleted, whatever its encoding is.

What version of the product are you using? On what operating system?
jakoubek:~$ rpm -q jbrout pyexiv2 exiv2
jbrout-0.3.284-1.1.svn305.fc14.noarch
pyexiv2-0.2.2-1.fc15.x86_64
exiv2-0.20-1.fc14.x86_64
jakoubek:~$ rpm -qf /usr/bin/exift
exiftool  exiftran  
jakoubek:~$ rpm -qf /usr/bin/exiftool 
perl-Image-ExifTool-8.25-1.fc14.noarch
jakoubek:~$ uname -a
Linux jakoubek.ceplovi.cz 2.6.35.2-9.fc14.x86_64 #1 SMP Tue Aug 17 22:36:15 UTC 
2010 x86_64 x86_64 x86_64 GNU/Linux
jakoubek:~$ 

Please provide any additional information below.
Actually, the issue may lie lower in the stack than jbrout/pyexiv2: gthumb 
(using libexiv2.so.9 directly) seems to be confused by these tags as well.

Original issue reported on code.google.com by matej.c...@gmail.com on 27 Aug 2010 at 8:44

Attachments:

GoogleCodeExporter commented 9 years ago
Wouawwwwwwwwwwwwwwwwwwww !!!!

I downloaded your pictures and tested them myself ;-)

You should delete the XMP data on the command line, like this:

exiv2 -d x p20090319_061423.jpg

Then refresh the picture in jbrout.

Add a normal tag in jbrout ... and remove it (this rebuilds the XMP from the 
IPTC data).

It works for me (but I lost the EXIF thumbnail ?!?)

Original comment by manat...@gmail.com on 27 Aug 2010 at 4:42

GoogleCodeExporter commented 9 years ago
In your case ...

You should delete the XMP data from all your pictures (see above ^), provided 
you don't use any XMP information other than keywords/tags.
Then refresh your whole collection.
Clean up the tag tree.

To make jbrout rewrite the XMP tags (based on the IPTC tags), you'll need to 
add a "fake tag" and then remove it.

If many people are in the same situation, we should consider writing a plugin 
to repair this.

Original comment by manat...@gmail.com on 27 Aug 2010 at 4:55

GoogleCodeExporter commented 9 years ago
If I just ran
find Pictures/ -name \*.jpg -exec exiv2 -d x '{}' \;
rm ~/.jbrout/db.xml
and reimported the whole Pictures/ directory, would that recreate correct XMP tags?
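
For reference, the batch step above could also be scripted. This is only a minimal Python sketch, assuming the exiv2 command line tool is on PATH. The function names and the dry_run parameter are made up for illustration and are not part of jbrout.

```python
import os
import subprocess


def find_jpegs(root):
    """Yield paths of files ending in .jpg below root (case-sensitive,
    like find -name '*.jpg')."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.endswith(".jpg"):
                yield os.path.join(dirpath, name)


def strip_xmp(root, dry_run=True):
    """Build one 'exiv2 -d x FILE' command per JPEG and return the list.

    Hypothetical helper: with dry_run=True (the default) nothing is
    executed, so the commands can be inspected first.
    """
    commands = [["exiv2", "-d", "x", path] for path in find_jpegs(root)]
    if not dry_run:
        for argv in commands:
            # An argument list (no shell) keeps odd file names intact.
            subprocess.call(argv)
    return commands
```

Calling strip_xmp("Pictures/") only lists the commands. Passing dry_run=False runs them, which modifies the files in place, so a backup beforehand seems wise.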

Original comment by matej.c...@gmail.com on 27 Aug 2010 at 4:59

GoogleCodeExporter commented 9 years ago
I tried exactly that and it didn't help ... see attached picture

Original comment by matej.c...@gmail.com on 27 Aug 2010 at 5:12

Attachments:

GoogleCodeExporter commented 9 years ago
After a brief test with one picture, it seems that exiv2 -d x is not able to 
delete this kind of trash.

Original comment by matej.c...@gmail.com on 27 Aug 2010 at 5:13

GoogleCodeExporter commented 9 years ago
exiv2 -d a (delete all metadata) works, but that's too drastic for me.

Original comment by matej.c...@gmail.com on 27 Aug 2010 at 5:15

GoogleCodeExporter commented 9 years ago
In the end I found that the corrupted tags were not the XMP ones but the 
Iptc.Application2.Keywords ones. The attached script helped me fix all my 
pictures (knownKeywords.txt was created by editing ~/.jbrout/tags.xml down to 
a simple list of known tags).
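
The attached script is not reproduced in this export. As an illustration only, here is a minimal Python sketch of the idea of matching damaged keywords against a list of known ones. It assumes the damage is double encoding (UTF-8 bytes misread as Latin-1 or cp1252 and then encoded again), which this thread does not confirm. repair_keyword and its parameters are hypothetical names, and reading or writing the IPTC data itself is left out.

```python
def repair_keyword(raw, known):
    """Return the known keyword that `raw` most plausibly stands for.

    `raw` is the keyword as read from the image, `known` is a set of
    trusted keywords (e.g. the lines of knownKeywords.txt).
    Returns None when no known keyword matches.
    """
    if raw in known:
        return raw  # already fine
    # Assumption: one round of double encoding. Undo it by turning the
    # characters back into bytes and decoding those bytes as UTF-8.
    for codec in ("latin-1", "cp1252"):
        try:
            candidate = raw.encode(codec).decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            continue  # this codec cannot explain the damage
        if candidate in known:
            return candidate
    return None
```

Checking every candidate against the known list keeps the repair conservative: a keyword is only rewritten when the result is one that was already in use.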

Original comment by matej.c...@gmail.com on 25 Oct 2010 at 10:26

GoogleCodeExporter commented 9 years ago
Actually, it is even more complicated. I think the root of all sins is exiftool 
(or the way jbrout uses it).

See the attached scenario. After playing with tags in jbrout (SVN checkout 316 
with pyexiv2 0.2.2 and exiv2 0.20; adding, removing, etc., during which I got 
errors like the one shown in the attached "Snímek obrazovky-Jbrout Error.png" 
screenshot), I ended up with an image whose XMP tags have corrupted encoding 
(see the beforeJbrout.txt log). So I removed the XMP tags completely via exiv2 
(also shown in the log), and now the image has no XMP tags but is otherwise OK. 
Actually, it isn't ... why are all tags duplicated? Anyway, good enough for me.

I then opened the directory containing only this image in jbrout and added and 
removed some other tag (getting the removeTag.png error while doing so). After 
I refreshed the directory, three new tags with corrupted UTF-8 encoding had 
been created as Imported Tags.

Now look at the afterJbrout.txt log. The encoding of the XMP tags is corrupted 
again.

I think the answer is one or more of the following:

a) Either we don't use exiftool properly (although the manpage says that UTF8 
is the default value of the -charset parameter), or exiftool is broken (at 
least my version, perl-Image-ExifTool-8.25-1, but that's the latest on CPAN: 
http://search.cpan.org/~exiftool/Image-ExifTool-8.25/exiftool).
b) Why do we use exiftool for adding/removing tags at all, when pyexiv2 0.2 can 
do it as well?
c) Something is broken in the way libexiv2 deals with Olympus pictures (that 
should probably go upstream). Still, the exiv2 command line tool is apparently 
able to deal with tags and UTF-8 encoding just fine, so hopefully pyexiv2 can 
as well.

Original comment by matej.c...@gmail.com on 30 Oct 2010 at 12:53

Attachments:

GoogleCodeExporter commented 9 years ago
Is there any possible link with what I described in issue 150 
(https://code.google.com/p/jbrout/issues/detail?id=150)? In some tests, I 
noticed that JBrout could not remove all XMP tags (the last removal failed).

Original comment by chartier...@gmail.com on 3 Nov 2010 at 8:55

GoogleCodeExporter commented 9 years ago
I wonder whether we are calling exiftool correctly. Do we properly encode 
non-ASCII characters?
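
To make the question concrete, here is a minimal Python sketch of what "properly encoded" could mean, next to the suspected failure mode. How jbrout actually builds its exiftool command line is not shown in this thread, so both helper names and the example argument list are assumptions. -Keywords+=VALUE is exiftool's syntax for adding a value to a list tag.

```python
def encode_argv(args):
    """Encode every command line argument exactly once, as UTF-8."""
    return [arg.encode("utf-8") for arg in args]


def double_encode(text):
    """Reproduce the suspected bug (an assumption, not a confirmed
    diagnosis): UTF-8 bytes are misread as Latin-1 text and then
    encoded as UTF-8 a second time."""
    return text.encode("utf-8").decode("latin-1").encode("utf-8")


# Hypothetical call: the file name and keyword are examples only.
argv = encode_argv(["exiftool", "-Keywords+=Žofie", "picture.jpg"])
```

With a single encoding step, "Ž" becomes the two bytes C5 BD. After the double encoding it becomes the four bytes C3 85 C2 BD, which a UTF-8 aware reader displays as two unrelated characters.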

Original comment by matej.c...@gmail.com on 3 Dec 2010 at 9:40

GoogleCodeExporter commented 9 years ago
http://owl.phy.queensu.ca/~phil/exiftool/faq.html#Q10 looks relevant

Original comment by matej.c...@gmail.com on 3 Dec 2010 at 9:41

GoogleCodeExporter commented 9 years ago
I think the answer to this (and other) issues is in bug #129 ... just don't use 
exiftool at all. A patch is provided there.

Original comment by matej.c...@gmail.com on 4 Jul 2011 at 7:03

GoogleCodeExporter commented 9 years ago
Is this issue now fixed by revision r335, which closed bug #129?

Original comment by r...@wallace.gen.nz on 7 Jul 2011 at 5:44

GoogleCodeExporter commented 9 years ago

Original comment by r...@wallace.gen.nz on 7 Jul 2011 at 5:49

GoogleCodeExporter commented 9 years ago
Yes, it is ... exiftool is the root of all evil (or the way we used it).

Original comment by matej.c...@gmail.com on 7 Jul 2011 at 8:41

GoogleCodeExporter commented 9 years ago
Thanks for confirming that this is now fixed.

Original comment by r...@wallace.gen.nz on 7 Jul 2011 at 10:56