Closed GoogleCodeExporter closed 9 years ago
I don't understand. Why would you
> Change org.h2.DataPage.writeString(String s) to read like so?
Original comment by thomas.t...@gmail.com
on 30 Jul 2009 at 2:46
I was looking to see if it was faster, since I assumed they would be the same
(both encoding strings to UTF-8 bytes).
This is not something a user will encounter. It (like the bug with LOBs)
represents a reaction of "huh, that's funny, should that happen?" when
examining and working with the code.
My concern is that it may open the door to vulnerabilities involving UTF-8 and
unusual encodings. Two potential problems: overlong multibyte encodings (byte
sequences longer than the shortest legal encoding of a character) and invalid
UTF-8 sequences (things your routines may accept, but should not).
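To make the first problem concrete, here is a minimal sketch (not H2 code). The bytes 0xC0 0xAF are a two-byte overlong form of '/'. The `OverlongCheck` class and both method names are made up for illustration; `lenientDecodeTwoBytes` stands in for a hand-rolled decoder that skips the shortest-form check, and `strictAccepts` uses the JDK decoder configured to report malformed input.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class OverlongCheck {
    // Hypothetical lenient decoder for a two-byte sequence. It combines the
    // payload bits without checking that the result needs two bytes at all.
    static char lenientDecodeTwoBytes(byte b0, byte b1) {
        return (char) (((b0 & 0x1F) << 6) | (b1 & 0x3F));
    }

    // Returns true only if the JDK's UTF-8 decoder accepts the bytes when
    // told to report (rather than replace) malformed input.
    static boolean strictAccepts(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] overlongSlash = {(byte) 0xC0, (byte) 0xAF};
        System.out.println("lenient: "
                + lenientDecodeTwoBytes(overlongSlash[0], overlongSlash[1]));
        System.out.println("strict accepts: " + strictAccepts(overlongSlash));
    }
}
```

If one routine behaves like the lenient version and another like the strict one, the same bytes can mean different things depending on which code path reads them, which is the kind of conflict I'd want to rule out.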
UTF-8 problems are fairly widespread and common vulnerabilities. Even early
JREs have issues with this that were only recently discovered:
http://sunsolve.sun.com/search/document.do?assetkey=1-66-245246-1
If you google "UTF-8 vulnerability" you'll see a ton of other examples.
Rolling your own UTF-8 handling (as here) may be the way to go; it just needs
to be checked for problems caused by the two sets of routines conflicting.
I think both problems can be checked somewhat in unit tests by generating
random UTF-8 characters and random bytes within specific ranges, and then
seeing how they are handled. There may already be tests for this; I'd just
like to confirm that there are no issues with different UTF-8 handlings.
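A rough sketch of the kind of test I mean, assuming a hand-rolled encoder that writes one, two, or three bytes per Java char. `encodePerChar` below is a hypothetical stand-in written for this comment, not the actual H2 routine, and the class and helper names are made up. The check compares its output with the JDK's `String.getBytes(StandardCharsets.UTF_8)` on random input.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Random;

class EncoderComparison {
    // Hypothetical per-char encoder: each UTF-16 code unit is encoded on its
    // own, so a surrogate pair becomes two 3-byte sequences.
    static byte[] encodePerChar(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            int c = s.charAt(i);
            if (c < 0x80) {
                out.write(c);
            } else if (c < 0x800) {
                out.write(0xC0 | (c >> 6));
                out.write(0x80 | (c & 0x3F));
            } else {
                out.write(0xE0 | (c >> 12));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            }
        }
        return out.toByteArray();
    }

    // Random string of BMP characters, skipping surrogate code units.
    static String randomBmpString(Random rnd, int len) {
        StringBuilder sb = new StringBuilder(len);
        while (sb.length() < len) {
            char c = (char) rnd.nextInt(0x10000);
            if (!Character.isSurrogate(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // True if the hand-rolled bytes match the JDK's UTF-8 bytes exactly.
    static boolean agreesWithJdk(String s) {
        return Arrays.equals(encodePerChar(s), s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        for (int i = 0; i < 1000; i++) {
            String s = randomBmpString(rnd, 32);
            if (!agreesWithJdk(s)) {
                throw new AssertionError("mismatch at iteration " + i);
            }
        }
        // A supplementary character (surrogate pair) is where they diverge.
        System.out.println("pair agrees: " + agreesWithJdk("\uD83D\uDE00"));
    }
}
```

With this stand-in the two encoders agree on non-surrogate BMP characters and diverge on surrogate pairs (six bytes per pair here versus four from the JDK). Whether the real routines behave the same way is exactly what the tests would need to establish.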
If you give me commit access, I'll add some tests.
Original comment by buckyba...@gmail.com
on 30 Jul 2009 at 3:54
H2 uses its own storage format. It doesn't matter if it's UTF-8 or not,
because the format is private to H2. I don't see how the data format
of H2 could be a vulnerability.
Please only open bugs for actual issues.
Original comment by thomas.t...@gmail.com
on 30 Jul 2009 at 4:07
Original issue reported on code.google.com by
buckyba...@gmail.com
on 29 Jul 2009 at 4:39