vsemenov / protobuf-java-format

Automatically exported from code.google.com/p/protobuf-java-format
BSD 3-Clause "New" or "Revised" License

XMLFormat does not handle dates properly #13

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Produce any protobuf that contains a date like 11 Mar 2009 as a value
2. Convert it to XML
3. XMLFormat will throw an exception

What is the expected output? What do you see instead?

I expect to see a date, without any errors.

What version of the product are you using? On what operating system?

1.1, Java 1.6.0, Windows 7

Please provide any additional information below.

The regular expression (TOKEN) matches the leading number as a token on its
own instead of grouping a date like 11 March 2009 into a single value. The
tokenizer then expects an end tag next and does not find one.
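
To illustrate the splitting described above, here is a minimal sketch. It assumes a simplified number-or-identifier pattern (the name SIMPLE_TOKEN and the class are hypothetical; this is not the library's actual TOKEN expression) and a loop that mimics the match-or-take-one-character behaviour of a tokenizer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenSplitDemo {
  // Hypothetical simplified pattern (a number or an identifier).
  // This is an assumption for illustration, not the library's TOKEN.
  static final Pattern SIMPLE_TOKEN =
      Pattern.compile("[0-9]+|[a-zA-Z_][0-9a-zA-Z_]*");

  // Splits text into tokens, skipping whitespace between them.
  static List<String> tokenize(String text) {
    List<String> tokens = new ArrayList<String>();
    Matcher matcher = SIMPLE_TOKEN.matcher(text);
    int pos = 0;
    while (pos < text.length()) {
      if (Character.isWhitespace(text.charAt(pos))) {
        ++pos;
        continue;
      }
      matcher.region(pos, text.length());
      if (matcher.lookingAt()) {
        // The pattern matched at the current position.
        tokens.add(matcher.group());
        pos = matcher.end();
      } else {
        // No match: take one character as its own token.
        tokens.add(String.valueOf(text.charAt(pos)));
        ++pos;
      }
    }
    return tokens;
  }

  public static void main(String[] args) {
    System.out.println(tokenize("11 Mar 2009"));        // [11, Mar, 2009]
    System.out.println(tokenize("Salt Lake City, UT")); // [Salt, Lake, City, ,, UT]
  }
}
```

Under this assumption the date comes back as three tokens rather than one value, so whatever follows the first token is not the end tag a parser would be looking for.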

amoffetATgmailDOTcom

Original issue reported on code.google.com by amoffet@gmail.com on 4 Mar 2010 at 10:12

GoogleCodeExporter commented 8 years ago
I've just found that place names like Salt Lake City, UT do not work either.

Original comment by amoffet@gmail.com on 4 Mar 2010 at 10:59

GoogleCodeExporter commented 8 years ago
It became necessary to use a different pattern for values.  I've modified
nextToken to look like this:

    public void nextToken(boolean skipWhiteSpace, boolean isValue) {
      previousLine = line;
      previousColumn = column;

      // Advance the line counter to the current position.
      while (pos < matcher.regionStart()) {
        if (text.charAt(pos) == '\n') {
          ++line;
          column = 0;
        }
        else {
          ++column;
        }
        ++pos;
      }

      // Match the next token.
      if (matcher.regionStart() == matcher.regionEnd()) {
        // EOF
        currentToken = "";
      }
      else {
        if (isValue) {
          matcher.usePattern(VALUE_TOKEN);
        }
        else {
          matcher.usePattern(IDENTIFIER_TOKEN);
        }
        if (matcher.lookingAt()) {
          currentToken = matcher.group();
          matcher.region(matcher.end(), matcher.regionEnd());
        }
        else {
          if (isValue) {
            // There is no content in this element.
            currentToken = "";
          }
          else {
            // Take one character.
            currentToken = String.valueOf(text.charAt(pos));
            matcher.region(pos + 1, matcher.regionEnd());
          }
        }

        if (skipWhiteSpace) {
          skipWhitespace();
        }
      }
    }

where VALUE_TOKEN is "[^<>]+";
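
As a rough, self-contained illustration of the pattern switch, the sketch below applies Matcher.usePattern the same way on a few inputs. VALUE_TOKEN is the expression quoted above; the IDENTIFIER_TOKEN expression, the class name and the firstToken helper are assumptions made up for this example and are not taken from the library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternSwitchDemo {
  // Hypothetical identifier pattern, assumed for illustration only.
  static final Pattern IDENTIFIER_TOKEN =
      Pattern.compile("[a-zA-Z_][0-9a-zA-Z_]*");
  // The value pattern quoted in this comment.
  static final Pattern VALUE_TOKEN = Pattern.compile("[^<>]+");

  // Returns the token at the start of text, or "" if nothing matches there.
  static String firstToken(String text, boolean isValue) {
    Matcher matcher = IDENTIFIER_TOKEN.matcher(text);
    // Switch patterns on the existing matcher, as nextToken does.
    matcher.usePattern(isValue ? VALUE_TOKEN : IDENTIFIER_TOKEN);
    return matcher.lookingAt() ? matcher.group() : "";
  }

  public static void main(String[] args) {
    System.out.println(firstToken("11 Mar 2009</date>", true));         // 11 Mar 2009
    System.out.println(firstToken("Salt Lake City, UT</city>", true));  // Salt Lake City, UT
    System.out.println(firstToken("Salt Lake City, UT</city>", false)); // Salt
    System.out.println(firstToken("</date>", true).isEmpty());          // true
  }
}
```

Note that "[^<>]+" also captures any whitespace before or after the text, so a caller that wants trimmed values would have to trim them separately.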

This is really ugly, and the method is no longer well defined. I'm leaving it
to you guys to decide how to handle this.  I'd rather refactor this into more
specialized methods.  If I have some more time, I'll see what I can do.

amoffetATgmailDOTcom

Original comment by amoffet@gmail.com on 7 Mar 2010 at 4:09

GoogleCodeExporter commented 8 years ago
This appears to be fixed in the code provided in item 10.

Original comment by amoffet@gmail.com on 11 Mar 2010 at 6:37

GoogleCodeExporter commented 8 years ago
I was incorrect.  An old Maven dependency fooled me.  This is still an issue.

Original comment by amoffet@gmail.com on 11 Mar 2010 at 11:04

GoogleCodeExporter commented 8 years ago

Original comment by eliran.bivas on 3 May 2011 at 1:36