ECToo / xdelta

Automatically exported from code.google.com/p/xdelta

Unicode support #89

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
It would be great to have Unicode support in xdelta.

There are different levels of support. 

For example, the bare minimum is to be able to pass in a Unicode file path
to the executable and have it read the file (currently can't do this on
Windows).

I guess fuller support would allow embedding Unicode strings in the
headers, but I suspect this isn't a high priority.

I had a go at trying to get this working on Windows. I came up with a hack
that will allow you to pass in Unicode file paths to the command line tool. 

The way I did it was:

 1. Renamed the existing "main" to "___main"
 2. Created the new Unicode main "wmain(int, wchar_t**)"
 3. Converted all the args from "wmain" into UTF-8 strings using the Win32
API WideCharToMultiByte
 4. Passed those strings into the old "___main"
 5. Then, in main_file_open, I use the Unicode version of CreateFile
(CreateFileW), first converting the UTF-8 strings into wide strings.
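
The steps above could look roughly like the following. This is a minimal sketch, not the attached patch: the helper names (wide_to_utf8, utf8_to_wide, args_to_utf8, open_utf8_path_for_read) are made up for illustration, and the open call only covers the read-only case, whereas the real main_file_open handles more modes. Only WideCharToMultiByte, MultiByteToWideChar and CreateFileW are actual Win32 APIs. The whole block is guarded by _WIN32 because none of it applies elsewhere.

```c
#ifdef _WIN32
#include <windows.h>
#include <stdlib.h>

/* Convert a NUL-terminated UTF-16 string into a malloc'd UTF-8 string.
 * Returns NULL on failure; the caller frees the result. */
char *wide_to_utf8(const wchar_t *w)
{
    /* First call asks for the required size in bytes, including the NUL. */
    int n = WideCharToMultiByte(CP_UTF8, 0, w, -1, NULL, 0, NULL, NULL);
    if (n <= 0)
        return NULL;
    char *s = (char *)malloc((size_t)n);
    if (s != NULL &&
        WideCharToMultiByte(CP_UTF8, 0, w, -1, s, n, NULL, NULL) == 0) {
        free(s);
        s = NULL;
    }
    return s;
}

/* Convert a NUL-terminated UTF-8 string into a malloc'd UTF-16 string. */
wchar_t *utf8_to_wide(const char *s)
{
    /* First call asks for the required size in wchar_t units, including the NUL. */
    int n = MultiByteToWideChar(CP_UTF8, 0, s, -1, NULL, 0);
    if (n <= 0)
        return NULL;
    wchar_t *w = (wchar_t *)malloc((size_t)n * sizeof(wchar_t));
    if (w != NULL && MultiByteToWideChar(CP_UTF8, 0, s, -1, w, n) == 0) {
        free(w);
        w = NULL;
    }
    return w;
}

/* Step 3: turn the wide argv from wmain into a NULL-terminated UTF-8 argv. */
char **args_to_utf8(int argc, wchar_t **wargv)
{
    char **argv = (char **)calloc((size_t)argc + 1, sizeof *argv);
    if (argv == NULL)
        return NULL;
    for (int i = 0; i < argc; i++) {
        argv[i] = wide_to_utf8(wargv[i]);
        if (argv[i] == NULL) {
            while (i-- > 0)
                free(argv[i]);
            free(argv);
            return NULL;
        }
    }
    return argv; /* argv[argc] is already NULL thanks to calloc */
}

/* Step 5 (read-only case): open a file whose path is given in UTF-8. */
HANDLE open_utf8_path_for_read(const char *utf8_path)
{
    wchar_t *w = utf8_to_wide(utf8_path);
    if (w == NULL)
        return INVALID_HANDLE_VALUE;
    HANDLE h = CreateFileW(w, GENERIC_READ, FILE_SHARE_READ, NULL,
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    free(w);
    return h;
}
#endif /* _WIN32 */
```

With helpers like these, the new wmain would call args_to_utf8(argc, wargv) and hand the result to the renamed "___main" (steps 2 to 4), and main_file_open would go through something like open_utf8_path_for_read instead of the ANSI CreateFile.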

What this means is that for ASCII filenames, the behaviour remains the
same: these get turned into UTF-8, but this is a no-op (since UTF-8 ==
ASCII if you're only using ASCII chars).

Note that other arguments are also unaffected (because these will be ASCII
strings and so will look exactly the same after turning into UTF-8).
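
To illustrate why the conversion is a no-op for ASCII, here is a small portable sketch of a UTF-8 encoder. The function name utf8_encode is hypothetical and is not part of xdelta or the patch; it just shows that code points below 0x80 come out as a single byte with the same value.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode scalar value as UTF-8.
 * Returns the number of bytes written (1 to 4), or 0 if cp is not a
 * valid scalar value (a surrogate, or above U+10FFFF). */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {
        /* ASCII range: one byte, identical to the code point. */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0; /* surrogates are not encodable on their own */
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp < 0x110000) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

For example, 'A' (U+0041) encodes as the single byte 0x41, while "é" (U+00E9) becomes the two bytes 0xC3 0xA9, which is why only non-ASCII filenames look any different after the conversion.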

For Unicode filenames, the open file function converts the UTF-8 path back
to the wide format before passing it to Windows, so the file can be opened
even when its name contains non-ASCII characters.

Caveats:

 * Suggested patch only works with Windows (though one can follow the
example for other platforms I'm sure)
 * Not tested...

Original issue reported on code.google.com by jdmw...@gmail.com on 16 Jun 2009 at 3:49

GoogleCodeExporter commented 9 years ago
FYI: the patch I added won't let you do anything else in Unicode - it just
supports the filenames being in Unicode.

E.g., if you try to set the VCDIFF "header" to some Unicode string, it just
won't work. It will probably write a load of garbage.

Original comment by jdmw...@gmail.com on 16 Jun 2009 at 4:25

GoogleCodeExporter commented 9 years ago
I guess I need to study the use of wchar and any portability implications here.

Original comment by josh.mac...@gmail.com on 9 Jan 2010 at 1:30