Open GoogleCodeExporter opened 9 years ago
Vitaly, is there an XML sample with the corresponding proto definition that you
can provide to reproduce this case?
Original comment by aant...@gmail.com
on 23 Feb 2010 at 5:20
I will provide this ASAP. You can reproduce it simply by having a String value
in a protobuf object that contains '&' or '<', for example. In the resulting XML
the symbol appears unchanged, but it should be replaced with &amp; or &lt; to
make the XML valid.
Original comment by Vitaly.R...@gmail.com
on 23 Feb 2010 at 6:52
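For illustration, the escaping step described above can be sketched in Java as follows. This is a minimal sketch, not the project's code or the patch attached later in this thread: the class `XmlEscapeSketch` and its `escape` method are hypothetical names, and only the five entities predefined by XML are handled.

```java
// Hypothetical helper for illustration; not taken from XmlFormat.java.
final class XmlEscapeSketch {
    private XmlEscapeSketch() {}

    /** Replaces the five characters reserved by XML with their predefined entities. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);        // everything else passes through
            }
        }
        return out.toString();
    }
}
```

Calling `XmlEscapeSketch.escape` on a string field value before writing it between element tags would keep a value such as `a & b < c` from breaking the document.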
[deleted comment]
My apologies, I attached the wrong file. I will attach the right one tomorrow.
There are 3 fixes in it:
- regex for TOKEN (completely changed)
- escaping XML entities
- un-escaping XML entities
Original comment by Vitaly.R...@gmail.com
on 10 Mar 2010 at 9:04
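The un-escaping fix listed above is the reverse direction, applied when XML is merged back into a message. A minimal Java sketch of that step, assuming only the five predefined entities need to be recognised: `XmlUnescapeSketch`, `unescape`, and `lookup` are hypothetical names, and this is not the code from the attached file.

```java
// Hypothetical helper for illustration; not the code from the attached patch.
final class XmlUnescapeSketch {
    private XmlUnescapeSketch() {}

    /** Decodes the five predefined XML entities in one left-to-right pass. */
    static String unescape(String xml) {
        StringBuilder out = new StringBuilder(xml.length());
        int i = 0;
        while (i < xml.length()) {
            char c = xml.charAt(i);
            // Only look for a terminating ';' when we are at an '&'.
            int end = (c == '&') ? xml.indexOf(';', i) : -1;
            String replacement = (end > i) ? lookup(xml.substring(i + 1, end)) : null;
            if (replacement != null) {
                out.append(replacement);
                i = end + 1;          // skip past the whole entity
            } else {
                out.append(c);        // not a known entity: copy the character as-is
                i++;
            }
        }
        return out.toString();
    }

    /** Returns the character for a predefined entity name, or null if unknown. */
    private static String lookup(String name) {
        switch (name) {
            case "amp":  return "&";
            case "lt":   return "<";
            case "gt":   return ">";
            case "quot": return "\"";
            case "apos": return "'";
            default:     return null;
        }
    }
}
```

A single pass matters here: decoded output is never scanned again, so an input of `&amp;lt;` comes back as the literal text `&lt;` rather than being decoded twice into `<`.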
[deleted comment]
Here is the code.
Original comment by Vitaly.R...@gmail.com
on 11 Mar 2010 at 2:52
Attachments:
This last file works great and solves several issues. Thanks.
I believe that there is still a problem with Unicode characters, which should
be escaped to &#{codepoint};
Original comment by amoffet@gmail.com
on 11 Mar 2010 at 6:40
Which problem with Unicode characters do you mean? I tested the last attached
file with Russian characters and everything seemed to work fine.
Original comment by Vitaly.R...@gmail.com
on 11 Mar 2010 at 9:33
If the source protocol buffer has one or more Unicode characters such as \u4E2D
(decimal code point 20013), then, as I understand it, it should be escaped to
&#20013; (中). For regular Unicode such as you are describing, things work well.
I should also mention that it is currently escaped into octal sequences and
unescaped from there. However, the XML standards suggest that this is an unusual
tack, and it should instead look as I mentioned. Thanks for your work on this.
Original comment by amoffet@gmail.com
on 11 Mar 2010 at 11:11
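The numeric form suggested above is an XML character reference: `&#` followed by the decimal code point and a semicolon. A minimal Java sketch, assuming every character outside ASCII should be written that way (`NumericCharRefSketch` and `escapeNonAscii` are hypothetical names, not part of XmlFormat):

```java
// Hypothetical helper for illustration; not part of XmlFormat.java.
final class NumericCharRefSketch {
    private NumericCharRefSketch() {}

    /** Writes every non-ASCII code point as a decimal character reference. */
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            // codePointAt combines a surrogate pair into one code point.
            int cp = text.codePointAt(i);
            if (cp > 0x7F) {
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);   // advance by 1 or 2 chars
        }
        return out.toString();
    }
}
```

Iterating by code point rather than by `char` keeps characters outside the Basic Multilingual Plane as one reference instead of two surrogate halves. This sketch leaves `&` and `<` alone, so it would be combined with entity escaping in practice.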
I haven't read the XML spec so I cannot comment on the last point. But what I
can say is that XmlFormat.java in v.1.1.1 (r43) fails to merge special chars
like German umlauts (ä,ö,ü,ß) and even fails on simple things like dot (.),
single (') or double quotes (") within a message's string property. This is
fatal!
Applying Vitaly's patch made my tests work. No problems so far. As I said, I
don't know if it is perfect now, but at least it doesn't fail on such basic
things.
Here is a patch file based on r43 (v.1.1.1) which includes Vitaly's changes.
Maybe this could find its way into the next version.
Original comment by stephan....@gmail.com
on 4 Apr 2011 at 2:31
Attachments:
In case anyone is interested, I needed an in-memory DOM of the XML for my
project, so I rewrote XmlFormat using Dom4j, which correctly handles character
escaping and other XML standards. Source & binaries can be found here:
http://code.google.com/p/protobuf-xml-format-for-java/
Cheers,
Yegor
Original comment by Yegor.Jb...@gmail.com
on 4 Apr 2011 at 5:06
Sounds good even though I would prefer a single stable project for various
formats.
Do you have any benchmarks of your XmlFormat compared to the original?
Original comment by stephan....@gmail.com
on 4 Apr 2011 at 7:02
I am pretty sure Dom4j adds some overhead in both CPU and memory, however I
haven't done any benchmarking. One thing to keep in mind is that the dom4j
version will first create a complete DOM structure in memory and then generate
a full XML string. There is no streaming API, like in the original version.
The reason I decided to keep it separate from this project instead of proposing
a patch is because Dom4j would be quite a big dependency and everyone has their
own favorite XML toolkit.
Of course, I wouldn't mind if it were included in this project, as long as the
maintainer is ok with it.
Original comment by Yegor.Jb...@gmail.com
on 4 Apr 2011 at 8:04
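The DOM-first approach described above can be sketched with the DOM and transformer APIs bundled with the JDK, used here only as a stand-in for Dom4j so the example needs no extra dependency. The class `DomSerializeSketch`, the method `toXml`, and its parameters are hypothetical; the actual rewrite linked above may be structured quite differently.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical illustration using the JDK's DOM, not Dom4j.
final class DomSerializeSketch {
    private DomSerializeSketch() {}

    /** Builds <elementName><fieldName>value</fieldName></elementName> in memory, then serializes it. */
    static String toXml(String elementName, String fieldName, String value) {
        try {
            // Step 1: the complete tree is created in memory first.
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement(elementName);
            Element field = doc.createElement(fieldName);
            field.setTextContent(value);      // raw text; no manual escaping
            root.appendChild(field);
            doc.appendChild(root);

            // Step 2: the whole tree is turned into one XML string.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XML serialization failed", e);
        }
    }
}
```

The serializer takes care of escaping the text content, which is the benefit mentioned above. The cost is also visible: both the tree and the full output string are held in memory at once, with no streaming.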
Alright, thank you for your answer. We'll see how things work out in the next
version.
Original comment by stephan....@gmail.com
on 4 Apr 2011 at 8:10
Yegor / Stephan, would either of you like to join as committers for XmlFormatter?
Original comment by eliran.bivas
on 3 May 2011 at 1:34
Original comment by eliran.bivas
on 3 May 2011 at 1:36
Original comment by eliran.bivas
on 3 May 2011 at 1:37
Hi, Eliran,
I'd be happy to help. How do I sign up?
Yegor
Original comment by Yegor.Jb...@gmail.com
on 4 May 2011 at 4:00
Here's an extract from the mail I just wrote to Eliran. Maybe someone else will
find the attached files helpful.
---
hi eliran,
thank you for asking. if you want me to join, i certainly will. however, i
cannot promise that i'll find enough time to contribute regularly, if ever...
but i'll try my best.
... [snipped]
apart from the problems i mentioned on your board, i found some more stuff,
that didn't work as expected. i fixed the issues one by one until the json and
xml format fit our needs and satisfied our test cases with special chars,
extensions, nested types etc. apart from just fixing bugs i also changed the
code (structure) itself - sometimes because there was no other way of achieving
what i wanted (no extension possible due to static classes), sometimes because
of inconsistencies and redundancies (so i introduced an abstract base class).
the real problem is, that i didn't write down all the things i changed simply
because it was a fluid process and the outcome was not clear when i started. that
was really stupid because now there is hardly any chance to merge the stuff back
to your project. however, my plan is to release the ground work of the
messaging framework i wrote as an open source project as soon as i find the
time to. and now the million dollar question i wanted to ask you:
may i include the json and xml format classes (see attachment - extracted from
my framework) which are heavily based on your stuff in this open source
messaging framework? if not, i'm afraid an open source release would not make
much sense. due to the tight coupling of the two classes with the rest of my
framework an external dependency to your project wouldn't make much sense
either. so i really hope you allow me to include them.
of course, if you (or whoever) find them helpful, you can do with them whatever
you want. they're provided as is. maybe they'll even find their way into
protobuf-java-format.
... [snipped]
---
stephan
Original comment by stephan....@gmail.com
on 5 May 2011 at 5:00
Attachments:
Original issue reported on code.google.com by
Vitaly.R...@gmail.com
on 18 Jan 2010 at 8:10