ClosestStorm / v8cgi

Automatically exported from code.google.com/p/v8cgi
BSD 3-Clause "New" or "Revised" License

Feature request: utf-8 BOM #12

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
The File.read method should check for a UTF-8 byte order mark (BOM).

At the moment it just returns the raw file content, so when I send its output
to the browser, the BOM appears in it.
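
To make the request concrete, here is a minimal sketch of the kind of check being asked for. `stripUtf8Bom` is a hypothetical helper, not part of v8cgi, and it assumes the file content is already available as a JavaScript string. Since the thread does not say whether File.read returns decoded text or one character per raw byte, the sketch handles both forms.

```javascript
// Hypothetical helper, not part of v8cgi: drop a leading UTF-8 BOM from file content.
var BOM_CHAR = "\uFEFF";         // the BOM after the bytes were decoded as UTF-8
var BOM_BYTES = "\xEF\xBB\xBF";  // the BOM when each raw byte is kept as one character

function stripUtf8Bom(content) {
  // Decoded text: the BOM is the single code unit U+FEFF.
  if (content.charAt(0) === BOM_CHAR) {
    return content.slice(BOM_CHAR.length);
  }
  // Raw byte string: the BOM is the three bytes EF BB BF.
  if (content.substring(0, BOM_BYTES.length) === BOM_BYTES) {
    return content.slice(BOM_BYTES.length);
  }
  // No BOM: return the content unchanged.
  return content;
}
```

Only a BOM at the very start is removed, so a U+FEFF that appears later in the content is left alone.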

Original issue reported on code.google.com by rand0mbond@gmail.com on 19 Jun 2009 at 9:41

GoogleCodeExporter commented 9 years ago
This will be completely rewritten once the ServerJS IO proposal is finished and
ratified. I believe that the current behavior is correct (File().read should have no
knowledge of a file's internal data structure), but the ServerJS IO standard will
surely contain more reading methods, some of them specialized for UTF-8 BOM handling.

Original comment by ondrej.zara on 19 Jun 2009 at 9:45

GoogleCodeExporter commented 9 years ago
By the way, according to http://www.unicode.org/versions/Unicode5.0.0/ch02.pdf, "Use
of a BOM is neither required nor recommended for UTF-8, but may be encountered in
contexts where UTF-8 data is converted from other encoding forms that use a BOM or
where the BOM is used as a UTF-8 signature". I personally try to avoid a BOM in UTF-8
when possible.

Original comment by ondrej.zara on 19 Jun 2009 at 9:51

GoogleCodeExporter commented 9 years ago
Yeah, that's all exactly correct, but I forgot to mention that the problem appears in
the Template module, because it uses File.read.

It took me some time, and it was far from trivial, to figure out why Firefox renders
strange offsets between templates. I only got it by looking at the raw web server
output :)

And as many text editors and IDEs are forced to write a BOM, it is hard to figure out
whether there is a BOM or not, and it is definitely not comfortable to remove the BOM
from each template file manually :D
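
Because the BOM is invisible in most editors, one way to find the affected templates is to look at the first three bytes of each file. The sketch below is only an illustration: `hasUtf8Bom` and `listFilesWithBom` are hypothetical names, the input is assumed to be a plain object mapping file names to byte arrays, and the sample data is made up.

```javascript
// Hypothetical helper: true when the bytes start with the UTF-8 BOM (EF BB BF).
function hasUtf8Bom(bytes) {
  return bytes.length >= 3 &&
    bytes[0] === 0xEF && bytes[1] === 0xBB && bytes[2] === 0xBF;
}

// `files` maps a file name to its raw bytes (an array of numbers or a Uint8Array).
// Returns the names of the files that begin with a BOM.
function listFilesWithBom(files) {
  var result = [];
  for (var name in files) {
    if (Object.prototype.hasOwnProperty.call(files, name) && hasUtf8Bom(files[name])) {
      result.push(name);
    }
  }
  return result;
}

// Made-up sample data for illustration only.
var templates = {
  "header.html": [0xEF, 0xBB, 0xBF, 0x3C, 0x70, 0x3E], // BOM followed by "<p>"
  "footer.html": [0x3C, 0x70, 0x3E]                    // "<p>" without a BOM
};
var flagged = listFilesWithBom(templates); // only "header.html" is flagged
```

How the bytes are read from disk is left out on purpose, since that depends on the file API in use.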

But well, let's see what the ServerJS proposal will say... 

Original comment by rand0mbond@gmail.com on 19 Jun 2009 at 12:58

GoogleCodeExporter commented 9 years ago
Exactly.

By the way, why do you think that many editors and IDEs are "forced to write BOM"?
AFAIK this feature can be turned off in every piece of software I have ever used.
(And I can highly recommend that you do the same :) )

Original comment by ondrej.zara on 19 Jun 2009 at 2:01

GoogleCodeExporter commented 9 years ago
Well... yeah, thanks :) 

But still, it is not good to have to keep this in mind; this issue should be eliminated
automatically anyway.

Original comment by rand0mbond@gmail.com on 19 Jun 2009 at 5:37

GoogleCodeExporter commented 9 years ago

Original comment by ondrej.zara on 22 Jun 2009 at 12:21