bryonjacob / wikimodel

Automatically exported from code.google.com/p/wikimodel

XHtmlParser - iWemListener.beginTableCell never passed true for tableHead parameter #15

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
I parsed the following XML with XhtmlParser:
<table><tbody>
  <tr><th>header col1</th><th>header col2</th></tr>
  <tr><td>col1</td><td>col2</td></tr>
  <tr><td>you         </td><td>can         </td></tr>
  <tr><td>also        </td><td>align<br /> it. </td></tr>
</tbody></table>

When iWemListener.beginTableCell is called for a <th> element, the boolean
tableHead parameter should be true.  Instead, this parameter is always false.

Original issue reported on code.google.com by dannylev...@gmail.com on 9 Jan 2008 at 8:22

GoogleCodeExporter commented 8 years ago

Original comment by mikhail....@gmail.com on 10 Jan 2008 at 3:32

GoogleCodeExporter commented 8 years ago
I did some debugging on this and found two problems.  First, in XhtmlHandler,
in the tag handler used for th and td, line 416 reads:

context.getScannerContext().beginTableCell(context.isTag("th"));

In this case context.isTag("th") always returns false.  It turns out that isTag
checks fLocalName.  The javadoc for the SAX ContentHandler at:
http://java.sun.com/j2se/1.5.0/docs/api/org/xml/sax/ContentHandler.html#startElement(java.lang.String,%20java.lang.String,%20java.lang.String,%20org.xml.sax.Attributes)
says that the uri and localName parameters are optional if the namespaces
property is false.  The default is supposed to be true, but apparently in my
test environment the property is set to false.  I added the following two lines
to the body of XhtmlParser.parse:

   XMLReader xmlReader = parser.getXMLReader();
   xmlReader.setFeature("http://xml.org/sax/features/namespaces", true);

Once I did this, the localName parameter is always provided on the SAX calls.
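
The effect of that feature flag can be reproduced outside WikiModel. The sketch
below is a standalone illustration, not the XhtmlParser code: the class name
NamespaceFeatureDemo and the method localNames are made up for this example. It
assumes the JDK's built-in SAX parser, turns the namespaces feature on
explicitly, and records the localName passed to each startElement call.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of WikiModel.
final class NamespaceFeatureDemo {

    // Parses the given XML and returns the localName seen by each startElement call.
    static List<String> localNames(String xml) {
        final List<String> names = new ArrayList<>();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            // Same feature URI as in the two lines above: without it, a parser
            // is allowed to leave uri and localName empty.
            reader.setFeature("http://xml.org/sax/features/namespaces", true);
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes attrs) {
                    names.add(localName);
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Wrap checked SAX/IO/parser-configuration exceptions for brevity.
            throw new IllegalStateException(e);
        }
        return names;
    }
}
```

Calling NamespaceFeatureDemo.localNames on a small table document should then
yield a non-empty name such as "th" for each header cell, which is what a check
like isTag("th") relies on. What a parser reports with the feature switched off
depends on the implementation, so the sketch does not cover that case.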

This leads to a second problem.  The first th element inside a tr is not being
handled correctly.  If you look at the two beginTableRow methods in
InternalWikiScannerContext, they both call beginTableCell if the row is being
started.  This means that when the th element is handled, the context thinks
that it is already inside a table cell, so the th gets skipped.  Why are the
beginTableRow methods automatically calling beginTableCell?  Maybe that works
when processing wiki markup, but it does not work properly for the HTML scanner.
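
To make the sequence concrete, here is a toy model of the behaviour described
above. It is not the real InternalWikiScannerContext: the class
ToyTableContext, its fields, and the replayHeaderRow helper are invented for
illustration, and the "ignore beginTableCell when a cell is already open" rule
is an assumption based on the description rather than on the actual source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the scanner context; names and logic are assumptions.
final class ToyTableContext {
    private final List<String> events = new ArrayList<>();
    private boolean inCell;

    void beginTableRow() {
        events.add("beginTableRow");
        // Assumed behaviour: starting a row immediately opens a non-header cell.
        beginTableCell(false);
    }

    void beginTableCell(boolean head) {
        if (inCell) {
            return; // already inside a cell, so this request is dropped
        }
        inCell = true;
        events.add("beginTableCell(" + head + ")");
    }

    void endTableCell() {
        if (inCell) {
            inCell = false;
            events.add("endTableCell");
        }
    }

    void endTableRow() {
        endTableCell(); // close any cell that is still open
        events.add("endTableRow");
    }

    // Replays the calls an HTML scanner would make for <tr><th/><th/></tr>.
    static List<String> replayHeaderRow() {
        ToyTableContext ctx = new ToyTableContext();
        ctx.beginTableRow();       // <tr>
        ctx.beginTableCell(true);  // first <th>: dropped, a cell is already open
        ctx.endTableCell();        // </th>
        ctx.beginTableCell(true);  // second <th>: reported as a header cell
        ctx.endTableCell();        // </th>
        ctx.endTableRow();         // </tr>
        return ctx.events;
    }
}
```

In this model the first cell of the row is reported with head=false even though
the markup was a th, while the second th is reported correctly. That matches
the symptom described above, under the stated assumptions.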

Original comment by dannylev...@gmail.com on 10 Jan 2008 at 4:24

GoogleCodeExporter commented 8 years ago
Hi Danny,

Re the first problem: I fixed that a few days ago. I've been working on the
XHTML parser for some time now, so if you haven't tested it for a while, give
it a go :)

Re the second problem: I'll probably end up looking at it at some point in the
future, but if you can provide a patch that would help a lot.

Thanks

Original comment by vmas...@gmail.com on 22 Nov 2008 at 5:21

GoogleCodeExporter commented 8 years ago
Fixed

Original comment by vmas...@gmail.com on 25 Nov 2008 at 1:23