janfri / mini_exiftool

This library is a wrapper for the Exiftool command-line application (https://exiftool.org) written by Phil Harvey. It provides the full power of Exiftool to Ruby: reading and writing of EXIF data, IPTC data and XMP data. Branch master is for current development, and branch compatibility-version is for compatibility with Ruby 1.8 and exiftool versions prior to 7.65.
GNU Lesser General Public License v2.1

Weird error with unicode characters in path "Wildcards don't work in the directory specification" #22

Open ccoenen opened 9 years ago

ccoenen commented 9 years ago

This may not be a bug in mini_exiftool, but right now I don't really know what to make of it. If you could help me pin it down, that would be amazing.

I'm on Windows, and I have path names that contain Unicode characters. One path looks like this:

C:\tmp\2015-03-23 Test with german umlaut äöü\IMG_1000.JPG

Now, if I fire up irb, I can open that file with Ruby, but not with mini_exiftool:

f = File.open("C:/tmp/2015-03-23 Test with german umlaut äöü/IMG_1000.JPG")
# => #<File:C:/tmp/2015-03-23 Test with german umlaut äöü/IMG_1000.JPG>
f.size
# => 18713

# so far, so good! Let's try mini_exiftool, now.

require 'mini_exiftool'
# => true
m = MiniExiftool.new("C:/tmp/2015-03-23 Test with german umlaut äöü/IMG_1000.JPG")
# MiniExiftool::Error: Wildcards don't work in the directory specification
# No matching files

#        from C:/Tools/Ruby21/lib/ruby/gems/2.1.0/gems/mini_exiftool-2.5.0/lib/mini_exiftool.rb:137:in `load'
#        from C:/Tools/Ruby21/lib/ruby/gems/2.1.0/gems/mini_exiftool-2.5.0/lib/mini_exiftool.rb:101:in `initialize'
#        from (irb):11:in `new'
#        from (irb):11
#        from C:/Tools/Ruby21/bin/irb:11:in `<main>'

Note that this is not a "file not found" error, because I can easily provoke that:

m = MiniExiftool.new("C:/tmp/2015-03-23 Test with german umlaut äöü/IMG_1337.JPG")
# MiniExiftool::Error: File 'C:/tmp/2015-03-23 Test with german umlaut äöü/IMG_1337.JPG' does not exist.
#        from C:/Tools/Ruby21/lib/ruby/gems/2.1.0/gems/mini_exiftool-2.5.0/lib/mini_exiftool.rb:121:in `load'
#        ...

The same example works fine if I change the directory name to omit the äöü part:

m = MiniExiftool.new("C:/tmp/2015-03-23 Test without german umlaut/IMG_1000.JPG")
# => #<MiniExiftool:0x3031f50 @opts={:numerical=>false, :composite=>true, ...

The error message (Wildcards don't work in the directory specification) does not come from anywhere within mini_exiftool, at least not as far as I can find with GitHub's code search.

Exiftool itself is also not at fault (at least not alone), because I can do this without a problem:

> exiftool.exe "C:\tmp\2015-03-23 Test with german umlaut äöü\IMG_1000.JPG"
ExifTool Version Number         : 9.90
File Name                       : IMG_1000.JPG
...

I'm really somewhat stuck.

ccoenen commented 9 years ago

This does not change if I use backslashes instead of forward slashes.

janfri commented 9 years ago

I'm doing a lot to handle encoding and escaping, particularly for filenames, in mini_exiftool. What is the result of

Encoding.find('filesystem')

on your Windows system?

ccoenen commented 9 years ago

Encoding.find('filesystem')
# => #<Encoding:Windows-1252>

This is a Windows 7 (x64) machine with this environment:

C:\Users\user>bundler env
Bundler 1.7.12
Ruby 2.1.5 (2014-11-13 patchlevel 273) [i386-mingw32]
Rubygems 2.4.6

janfri commented 9 years ago

This seems to be correct. I have no idea. Maybe a look at the executed command line will be helpful:

$DEBUG = true
m = MiniExiftool.new("C:/tmp/2015-03-23 Test with german umlaut äöü/IMG_1337.JPG")

ccoenen commented 9 years ago

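The executed command line is: exiftool -j "C:/tmp/2009-03-07 test Path ???/IMG_4224.JPG". It seems to replace the umlauts with question marks, and a question mark is a wildcard on Windows (it matches a single character). That would explain the "Wildcards don't work in the directory specification" message.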
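One way such question marks can arise, offered as a guess rather than a confirmed diagnosis: when Ruby transcodes a string into an encoding that has no mapping for some characters and is told to replace them, each of those characters becomes "?". The sketch below is plain Ruby and is not taken from mini_exiftool. The helper name lossy_transcode is hypothetical, and US-ASCII merely stands in for whatever target encoding is actually involved.

```ruby
# encoding: UTF-8

# Hypothetical helper: transcode a path into a target encoding and replace
# every character the target cannot represent. For non-Unicode targets,
# Ruby's default replacement string is "?".
def lossy_transcode(path, target)
  path.encode(target, invalid: :replace, undef: :replace)
end

path = "C:/tmp/test äöü/IMG_1000.JPG"

# US-ASCII has no umlauts, so all three are replaced.
ascii = lossy_transcode(path, "US-ASCII")
puts ascii

# Windows-1252 does have them, so nothing is lost and the round trip
# back to UTF-8 yields the original string.
cp1252 = lossy_transcode(path, "Windows-1252")
puts cp1252.encode("UTF-8") == path
```

If something like this happens before the command is built, the resulting "?" characters would then be interpreted as wildcards on Windows.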

janfri commented 9 years ago

In which encoding is your source file written? Do you use the correct magic comment? http://en.wikibooks.org/wiki/Ruby_Programming/Encoding#Using_Encodings

ccoenen commented 9 years ago

The earlier examples were from irb, with no encoding set explicitly.

Here are all of the encoding outputs for reference:

Encoding.find('external')
# => #<Encoding:CP850>
Encoding.find('internal')
# => nil
Encoding.find('filesystem')
# => #<Encoding:Windows-1252>
Encoding.find('locale')
# => #<Encoding:CP850>

From within irb, I ran the following commands:

Encoding.default_external = 'utf-8'
# "utf-8"
Encoding.default_internal = 'utf-8'
# "utf-8"
require 'mini_exiftool'
# true
m = MiniExiftool.new("C:/tmp/2009-03-07 test Path äöü/IMG_1000.JPG")
# MiniExiftool::Error: Wildcards don't work in the directory specification
# No matching files
# ...

I also put this into a Ruby file (and I double-checked that it was actually saved as UTF-8):

#encoding: UTF-8
Encoding.default_external = 'utf-8'
Encoding.default_internal = 'utf-8'
require 'mini_exiftool'
m = MiniExiftool.new("C:/tmp/2009-03-07 test Path äöü/IMG_4224.JPG")
puts m

It fails with the same wildcards error message.

janfri commented 9 years ago

Could you try (UTF-8 encoded)?

#encoding: UTF-8
puts `exiftool.exe "C:/tmp/2009-03-07 test Path äöü/IMG_1000.JPG"`

ccoenen commented 9 years ago

It can't find the file, but what I find more interesting is that the encoding is wonky, so maybe it's already broken before it hits exiftool? I ran these lines:

# encoding: UTF-8
# äöü

require 'open3'
Encoding.default_external = 'UTF-8'

paths = [
  "\"C:/tmp/test äöü/201412050001hq.jpg\"",
  "\"C:\\tmp\\test äöü\\201412050001hq.jpg\""
]

paths.each do |path|
  puts "## Run with path: #{path}"

  puts "*backticks*\n"
  out = `exiftool.exe #{path} 2>&1`
  puts '    ' + out
  puts '    ' + out.force_encoding(Encoding.find('filesystem')).encode('UTF-8')

  puts "*popen3*\n"
  stdin, stdout, _ = Open3.popen3("exiftool.exe #{path} 2>&1")
  stdin.close
  out = stdout.read
  puts '    ' + out
  puts '    ' + out.force_encoding(Encoding.find('filesystem')).encode('UTF-8')
end

which produces this output:

## Run with path: "C:/tmp/test äöü/201412050001hq.jpg"
*backticks*
    File not found: C:/tmp/test 巼/201412050001hq.jpg
    File not found: C:/tmp/test äöü/201412050001hq.jpg
*popen3*
    File not found: C:/tmp/test 巼/201412050001hq.jpg
    File not found: C:/tmp/test äöü/201412050001hq.jpg
## Run with path: "C:\tmp\test äöü\201412050001hq.jpg"
*backticks*
    File not found: C:/tmp/test 巼/201412050001hq.jpg
    File not found: C:/tmp/test äöü/201412050001hq.jpg
*popen3*
    File not found: C:/tmp/test 巼/201412050001hq.jpg
    File not found: C:/tmp/test äöü/201412050001hq.jpg

The broken characters may not come through correctly here, so I also made a screenshot from Notepad++, where broken characters are displayed as hex:

[screenshot: output]
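The second line of each pair above is readable because the raw bytes were relabelled as the filesystem encoding and then transcoded to UTF-8. A minimal sketch of that step in isolation, assuming the child process emitted Windows-1252 bytes (the byte values below are the Windows-1252 codes for ä, ö and ü, chosen for illustration):

```ruby
# encoding: UTF-8

# The three bytes a Windows-1252 program would emit for "äöü",
# held as a binary (ASCII-8BIT) string.
raw = "\xE4\xF6\xFC".b

# Labelled as UTF-8, these bytes do not form a valid sequence,
# which is why printing them directly shows garbage.
puts raw.dup.force_encoding("UTF-8").valid_encoding?

# force_encoding only changes the label and leaves the bytes alone;
# encode then converts the bytes to their UTF-8 representation.
fixed = raw.dup.force_encoding("Windows-1252").encode("UTF-8")
puts fixed
```

This only repairs how the output is displayed. It says nothing about which bytes exiftool received for the path on its command line.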

ccoenen commented 9 years ago

(Just to make sure: I ran the same test on Ruby 2.2.1 x64 on Windows just now. Same output.)

ccoenen commented 9 years ago

This might be interesting: http://www.sno.phy.queensu.ca/~phil/exiftool/exiftool_pod.html#windows_unicode_file_names (introduced/changed on 2015-01-04). My tests were run with 9.90, so this might explain some of the encoding weirdness.

ccoenen commented 9 years ago

I tried specifying -charset FileName=cp1252 (and UTF8, while I was at it), but the result was still "file not found". As long as that does not work at all, I don't think mini_exiftool is to blame: if I can't get to the file from a simple backtick or popen3 call, I don't think mini_exiftool can either.
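For anyone repeating this experiment, here is a sketch of passing the same option as a separate argument instead of interpolating it into a shell string. Both helper names are hypothetical, the option spelling is copied from the comment above, and -j is the flag seen in the debug output earlier in this thread. Whether this changes the result on Windows has not been verified here; it only takes shell quoting out of the picture.

```ruby
# encoding: UTF-8
require "open3"

# Hypothetical helper: build an argument vector rather than one shell string,
# so the path is handed over as a single argument without any quoting.
def exiftool_argv(path, filename_charset: "UTF8")
  ["exiftool", "-charset", "FileName=#{filename_charset}", "-j", path]
end

# Hypothetical helper: run exiftool and return [combined output, success flag].
# With several arguments, Open3.capture2e does not go through a shell.
def run_exiftool(path, **opts)
  out, status = Open3.capture2e(*exiftool_argv(path, **opts))
  [out, status.success?]
end

argv = exiftool_argv("C:/tmp/test äöü/201412050001hq.jpg")
p argv
```

Calling run_exiftool with the same path would execute the command; it is left out of the sketch because it needs exiftool on the PATH.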

How should I continue? Do we close this ticket unresolved (upstream problem somewhere)? Do we leave it open?

ccoenen commented 9 years ago

I don't get it?! I can use umlaut files with multi_exiftool?! What the actual f*ck?! Sorry. I'm going to post an example over there in the next few hours.

ccoenen commented 9 years ago

Here's the change I made that fixes umlauts and lets all of multi_exiftool's tests pass: https://github.com/ccoenen/multi_exiftool/commit/4b836a6d5d67d831e24e94e7276564577ca4e8ad For some reason, though, I can't get it to work in backticks or popen3 (as described above).

ManuelSamudio12 commented 5 years ago

I'm here a few years late, but I had the same issue. My fix was to install Ruby and exiftool on Linux, and it worked perfectly.

Hope this comment will be helpful.

janfri commented 5 years ago

@ManuelSamudio12 The intention was to get it working under Windows. ;-)

qontolmabur commented 2 years ago

Same here. I'm trying to create a context menu entry with a batch file:

REG ADD "HKCR\*\shell\ExifTool\command" /t REG_SZ /d "\"%systemroot%\system32\cmd.exe\" /K exiftool \"%%L\"" /f

It works just fine if the file is in a folder with a standard name, but it does not work if the file is in a folder with special characters. In my case the special character is Δ.

qontolmabur commented 2 years ago

But when I cd to the directory that contains the special character (or open a command prompt in that folder) and then type exiftool <file>, it works just fine (exiftool shows information about the file).
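Translated to Ruby, that observation suggests a possible workaround: start exiftool with its working directory set to the problematic folder and pass only the bare file name. This is an untested sketch with hypothetical helper names. It can only help when the file name itself has no special characters, and whether Ruby on Windows handles the chdir: option correctly for such a directory is an assumption.

```ruby
# encoding: UTF-8

# Hypothetical helper: split a path into [directory, file name] so that only
# the file name has to appear on the command line.
def split_for_chdir(path)
  File.split(path)
end

# Hypothetical helper: run exiftool from inside the file's directory.
# An argument list bypasses the shell, and chdir: sets the working
# directory of the child process.
def exiftool_from_inside(path)
  dir, name = split_for_chdir(path)
  system("exiftool", name, chdir: dir)
end

dir, name = split_for_chdir("C:/tmp/folder Δ/IMG_1000.JPG")
puts dir
puts name
```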