Open GoogleCodeExporter opened 8 years ago
I think the lack of UTF-8 interpretation is also why the UTF-8 BOM is visible
(shown as "" at the beginning of the UTF-8 files).
Original comment by tik...@gmail.com
on 25 Jan 2008 at 1:15
In SVN I've added a textEncoding option with default UTF-8. This is less than
perfect, I realize, but it'll at least
allow people to set QLCC to handle the encoding they view most often.
Original comment by n8gray@gmail.com
on 2 Apr 2008 at 9:40
n8gray, great, but WebKit needs to be told which encoding to use.
Here's the patch to make it work.
Instead of emptydict, we pass a dictionary with
kQLPreviewPropertyTextEncodingNameKey set to the default
encoding (or UTF-8 if none).
Works great for me.
Original comment by dch...@gmail.com
on 31 Jul 2008 at 11:02
I committed a fix for this similar to what dchest suggested. I used a
different config variable
"webkitTextEncoding" because I'm not sure that webkit and highlight recognize
the same text encoding strings.
Let me know if you're happy with the result (once I release it in a day or so).
Original comment by n8gray@gmail.com
on 7 Jan 2009 at 10:32
Perhaps some code from the file(1) command would be helpful?
http://www.opensource.apple.com/darwinsource/10.5.6/file-23/file/src/ascmagic.c
file(1) makes a good attempt to identify text as ASCII, UTF-8, UTF-16,
ISO-8859/latin1, extended ASCII, and (International) EBCDIC.
Another solution might be to exploit functionality in CoreServices's Text Encoding
Manager, which apparently includes an encoding sniffer:
http://developer.apple.com/documentation/Carbon/reference/Text_Encodin_sion_Manager/Reference/reference.html
Original comment by adfergu...@gmail.com
on 7 May 2009 at 2:05
To remove the UTF-8 BOM you can invoke highlight using the --validate-input
switch.
This will also disable parsing of binary stuff.
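If you'd rather strip the BOM yourself before the text reaches highlight, something along these lines might do. This is only a sketch, assuming a POSIX shell and sed; `strip_bom` is a made-up helper name, not part of highlight or QLCC.

```shell
# Hypothetical helper: drop a leading UTF-8 BOM (bytes EF BB BF) from stdin.
strip_bom() {
  # Build the three BOM bytes with octal escapes so this works with
  # both GNU and BSD sed (BSD sed has no \x escapes).
  bom=$(printf '\357\273\277')
  # LC_ALL=C makes sed treat the input as raw bytes; only line 1 is touched.
  LC_ALL=C sed "1s/^${bom}//"
}
```

It reads stdin and writes stdout, so it can sit in front of highlight in a pipe, e.g. `strip_bom < "$file" | highlight -S "$ext"`. Files without a BOM pass through unchanged.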
Original comment by andre.si...@gmail.com
on 26 Oct 2009 at 8:41
I've added the --validate-input switch in git. Thanks Andre!
Original comment by n8gray@gmail.com
on 28 Oct 2009 at 6:09
<code>/usr/bin/file</code> ships with Mac OS X, so there's no need to rip out
anything. It's trivial to use it to detect the file encoding: the output of
<code>file --mime-encoding -b $FILENAME</code> is the content encoding we're
after. Here is a little highlight-to-utf8 shell script I put together that pipes
the file through GNU recode to turn any text file into highlighted UTF-8:
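<code>#! /bin/zsh
# Usage: highlight-to-utf8 FILE [highlight options...]
file="$1"
shift
# File extension (zsh :e modifier), passed to highlight as the syntax name
ext="${file:e}"
# Ask file(1) for the content encoding, e.g. utf-8 or iso-8859-1
enc=$(file --mime-encoding -b "$file")
# Convert to UTF-8 with GNU recode, then highlight
recode "$enc"..utf8 < "$file" | highlight -S "$ext" "$@"
</code>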
(You can pass through options like -A to make ANSI instead of HTML output, if
you're running it from a shell window.)
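One caveat with the script above: as far as I know, file(1) does not always print a usable encoding name. It can report labels such as `binary` or `unknown-8bit`, which recode won't accept. Below is a rough sketch of how the result could be mapped to a fallback first. `pick_encoding` and `detect_encoding` are hypothetical names, and the UTF-8 fallback is just an assumption borrowed from the default discussed earlier in this thread.

```shell
# Hypothetical helper: turn the label printed by `file --mime-encoding -b`
# into an encoding name that is safe to convert from.
#   $1 = label reported by file(1)
#   $2 = fallback encoding (assumed default: utf-8)
pick_encoding() {
  detected=$1
  fallback=${2:-utf-8}
  case "$detected" in
    # Labels that are not real encodings: use the fallback instead.
    binary|unknown-8bit|"") printf '%s\n' "$fallback" ;;
    # ASCII is a subset of UTF-8, so treat it as UTF-8.
    us-ascii)               printf '%s\n' utf-8 ;;
    # Anything else is passed through unchanged.
    *)                      printf '%s\n' "$detected" ;;
  esac
}

# Hypothetical wrapper: detect the encoding of file $1, with fallback $2.
detect_encoding() {
  pick_encoding "$(file --mime-encoding -b "$1" 2>/dev/null)" "$2"
}
```

In the script, `enc=$(detect_encoding "$file")` would then replace the direct call to file.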
Original comment by oyas...@gmail.com
on 7 Aug 2010 at 7:56
Original issue reported on code.google.com by
n8gray@gmail.com
on 7 Jan 2008 at 10:28