shamveelahammed / gource

Automatically exported from code.google.com/p/gource

(Trivially?) transformed svn log causes "unsupported log format" #160

Closed: GoogleCodeExporter closed this issue 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Generate an svn log
2. Transform with the attached XSLT
3. Run gource on the resulting svn log

Expected result: successful parsing of the log file
Actual result: "unsupported log format" with no further information

Using the official 0.38 win32 build on Windows XP Pro SP3 (via VirtualBox, if 
that matters).

The reason I am transforming is that I did a CVS -> SVN migration 
project-by-project (with a global repo), so the svn revision numbers aren't 
necessarily in chronological order. I assumed that sorting the entries by date 
would not be a problem. Here's the entire XSLT:

<?xml version="1.0" ?>
<xsl:stylesheet
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema"
        version="1.0"
        >
  <xsl:template match="/log">
    <xsl:copy>
      <xsl:apply-templates select="logentry">
        <xsl:sort select="date" /> 
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>

  <!-- Don't mess with the rest of the tree -->
  <xsl:template match="*|@*|text()|comment()|processing-instruction()">
    <xsl:copy>
      <xsl:apply-templates select="*|@*|text()|comment()|processing-instruction()" />
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

Original issue reported on code.google.com by schultz....@gmail.com on 27 Jun 2012 at 2:58

GoogleCodeExporter commented 8 years ago
Hi,

It's hard to tell without seeing what the transformed XML looks like. The 
<logentry> open and close tags must appear at the very start of a new line 
(only the logentry tags are actually parsed as XML, in the order they are 
encountered).
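
A quick way to check a log against that line-start requirement is sketched below in Python. It is only an illustration, not Gource's own parser: the function name is made up, and it assumes that "the very start of a new line" means column 0 with no leading whitespace.

```python
import re

# Matches an opening or closing logentry tag, e.g. '<logentry ...>' or '</logentry>'.
LOGENTRY_TAG = re.compile(r"</?logentry\b")


def misplaced_logentry_tags(xml_text):
    """Return (line_number, column) pairs for logentry tags not at column 0.

    Assumption: a tag only counts as starting a line when nothing,
    not even whitespace, precedes it on that line.
    """
    problems = []
    for line_number, line in enumerate(xml_text.splitlines(), start=1):
        for match in LOGENTRY_TAG.finditer(line):
            if match.start() != 0:
                problems.append((line_number, match.start()))
    return problems
```

An empty result means every logentry tag begins its own line; any pair returned points at a tag that shares a line with earlier content.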

You could try running the original log through gource with the 
'--output-custom-log output.csv' option, and then sort that externally (e.g., 
in a spreadsheet program). That might be easier.
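
If a spreadsheet is inconvenient, the external sort could be done with a few lines of Python. This sketch assumes the custom log is pipe-delimited with an integer Unix timestamp as the first field; that layout is not spelled out in this thread, and the function name is hypothetical.

```python
def sort_custom_log(lines):
    """Sort custom log lines by their leading integer timestamp.

    Assumption: each non-blank line looks like 'timestamp|user|type|path',
    with the timestamp given in seconds since the Unix epoch.
    """
    entries = [line.rstrip("\n") for line in lines if line.strip()]
    # sorted() is stable, so entries sharing a timestamp keep their original order.
    return sorted(entries, key=lambda entry: int(entry.split("|", 1)[0]))
```

Reading output.csv line by line, passing the lines to sort_custom_log, and writing the result to a new file would give a chronologically ordered log to feed back into gource.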

Cheers

Andrew

Original comment by acaudw...@gmail.com on 27 Jun 2012 at 10:20

GoogleCodeExporter commented 8 years ago
The sorted log file appears to have no optional whitespace remaining. Here's a 
snippet from the top of the file (somewhat anonymized):

<?xml version="1.0" encoding="UTF-8"?>
<log>
<logentry revision="1">
<date>2005-07-29T14:37:37.000000Z</date>
<paths>
<path kind="dir" action="A">/project/tags</path>
<path kind="dir" action="A">/project</path>
<path kind="dir" action="A">/project/trunk</path>
<path kind="dir" action="A">/project/branches</path>
</paths>
</logentry>
<logentry revision="2">
<author>schultz</author>
<date>2005-07-29T14:37:37.000000Z</date>
<paths>
<path kind="file" action="A">/project/trunk/src/java/my/great/Type.java</path>
[...]

gource doesn't say where it choked on my file, so I can't look in a specific 
place. Are there any debugging options in the stock win32 build?

Original comment by schultz....@gmail.com on 29 Jun 2012 at 2:18

GoogleCodeExporter commented 8 years ago
Unfortunately, there isn't really any detailed parsing output enabled.

Gource seems to parse the example you posted. Maybe the issue is with the bits 
that were anonymized.

Original comment by acaudw...@gmail.com on 30 Jun 2012 at 1:34

GoogleCodeExporter commented 8 years ago
Okay, I'll try with a small segment of the file and keep going until it starts 
barfing. Maybe I'll find something that isn't evident at the top of the file.

Any ideas if gource might have issues with non-ASCII descriptions or things 
like that? IIRC, we had a couple of log messages in CVS that had trouble 
converting to UTF-8 (like dumbquotes, etc.) and had to use a fallback-encoding 
option to avoid having to manually re-write individual log entries.

Original comment by schultz....@gmail.com on 30 Jun 2012 at 5:05

GoogleCodeExporter commented 8 years ago
Upon further examination, the snip of the log file I posted had not been sorted 
;)

I'm attaching the first 4k or so of the actual result file, with some 
anonymization. Hopefully I closed the <logentry>, etc. cleanly, even though the 
cut fell in the middle of the list of files actually committed on that date.

Thanks for looking into this. I have seen other successful animations produced 
by gource... I'd just like to see *my* repo visualized ;)

Original comment by schultz....@gmail.com on 1 Jul 2012 at 12:39

Attachments:

GoogleCodeExporter commented 8 years ago
That seems to be invalid XML (the path tags are missing a >). Once that is 
fixed it parses, so this isn't really getting us anywhere.

I'm guessing it is an encoding problem as you suggest, so maybe re-save the log 
in a text editor with a specific encoding.
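
The same re-encoding can be scripted instead of done in an editor. The sketch below is an assumption-laden illustration: it guesses that UTF-8 is the target, and it uses cp1252 as the fallback only because "dumbquotes" often come from Windows-1252 text. The function name is hypothetical.

```python
def to_utf8(raw, fallback="cp1252"):
    """Return the input bytes re-encoded as UTF-8.

    Valid UTF-8 input passes through unchanged. Anything else is decoded
    with the fallback encoding (an assumption, not something Gource requires),
    and undecodable bytes become U+FFFD replacement characters.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        text = raw.decode(fallback, errors="replace")
    return text.encode("utf-8")
```

Note that this treats the whole input as one encoding. A log that mixes encodings entry by entry would need the same function applied per line rather than per file.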

Original comment by acaudw...@gmail.com on 1 Jul 2012 at 3:03

GoogleCodeExporter commented 8 years ago
It looks like the primary problem was that my tags weren't all separated by 
newlines (i.e. there were some like '</logentry><logentry revision="x">...'). I 
can force newlines in the transform, but it makes the resulting file a lot 
larger due to the extra newlines, so I used a sed mutator in the command-line 
pipeline to remove those. Everything appears to work now, so it's up to you 
whether you want to call this one WONTFIX or INVALID.
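
For anyone hitting the same problem, one possible post-processing step is sketched here in Python. It is not the sed command actually used above (that command was not posted); it is a plain text substitution, with a made-up function name, that moves every glued logentry tag onto its own line and leaves the rest of the file alone.

```python
import re

# A logentry open or close tag preceded by any character other than a newline,
# i.e. a tag that does not already start its line.
GLUED_TAG = re.compile(r"(?<=[^\n])(</?logentry\b)")


def put_logentry_tags_on_own_lines(xml_text):
    """Insert a newline before each logentry tag that is not at a line start.

    This is a text-level fix, not an XML parse: it assumes the literal
    string '<logentry' never occurs inside a CDATA section or a comment.
    """
    return GLUED_TAG.sub(r"\n\1", xml_text)
```

Only the logentry boundaries gain a newline, so the file grows by at most two characters per entry, and running the function a second time changes nothing.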

FWIW, it would be nice if well-formed XML could be read regardless of things 
like whitespace.

The video was cool. ;)

Original comment by schultz....@gmail.com on 2 Jul 2012 at 2:31

GoogleCodeExporter commented 8 years ago
Ok no worries. Glad you got it working.

Original comment by acaudw...@gmail.com on 2 Jul 2012 at 7:10