jgm / pandoc

Universal markup converter
https://pandoc.org

pandoc freezes when parsing <td></td> #299

Closed rmunoz closed 13 years ago

rmunoz commented 13 years ago

Trying to convert html to rst:

<table>
<tbody>
<tr>
<td><strong>Data 1</strong></td>
<td><strong>Data 2</strong></td>
<td><strong>Data 3</strong></td>
<td><strong>Data 4</strong></td>
</tr>
<tr>
<td>aaa</td>
<td>bbb</td>
<td>ccc</td>
<td></td>
</tr>
</tbody>
</table>

pandoc freezes at:

<td></td>

Installation info:

ricardo@bugge:/tmp$ pandoc --version
pandoc 1.8.2.1
Compiled with citeproc support.
Compiled with syntax highlighting support for:
Actionscript, Ada, Alert, Alert_indent, Ansys, Apache, Asn1, Asp, Awk, Bash, Bibtex, Boo, C, Changelog, Cisco, Cmake, Coffeescript, Coldfusion, Commonlisp, Cpp, Cs, Css, Cue, D, Desktop, Diff, Djangotemplate, Doxygen, Doxygenlua, Dtd, Eiffel, Email, Erlang, Fortran, Fsharp, Fstab, Gap, Gdb, Gettext, Gnuassembler, Go, Haskell, Haxe, Html, Idl, Ilerpg, Ini, Java, Javadoc, Javascript, Json, Jsp, Latex, Lex, LiterateHaskell, Lua, M3u, Makefile, Mandoc, Matlab, Maxima, Mediawiki, Metafont, Mips, Modula2, Modula3, Monobasic, Nasm, Noweb, Objectivec, Objectivecpp, Ocaml, Octave, Pango, Pascal, Perl, Php, Pike, Postscript, Prolog, Python, R, Relaxngcompact, Rhtml, Ruby, Scala, Scheme, Sci, Sed, Sgml, Sql, SqlMysql, SqlPostgresql, Tcl, Texinfo, Verilog, Vhdl, Winehq, Wml, Xharbour, Xml, Xorg, Xslt, Xul, Yacc, Yaml
Copyright (C) 2006-2011 John MacFarlane
Web: http://johnmacfarlane.net/pandoc
This is free software; see the source for copying conditions. There is no
warranty, not even for merchantability or fitness for a particular purpose.

rmunoz commented 13 years ago

It also freezes when there is a line break between <td> and </td>:

   <tr>
   <td>
   </td>
   </tr>

jgm commented 13 years ago

I couldn't reproduce this with pandoc built from the current repository.

% pandoc -f html
<table>
<tbody>
<tr>
<td><strong>Data 1</strong></td>
<td><strong>Data 2</strong></td>
<td><strong>Data 3</strong></td>
<td><strong>Data 4</strong></td>
</tr>
<tr>
<td>aaa</td>
<td>bbb</td>
<td>ccc</td>
<td></td>
</tr>
</tbody>
</table>
^D
<table>
<tbody>
<tr class="odd">
<td align="left"><strong>Data 1</strong></td>
<td align="left"><strong>Data 2</strong></td>
<td align="left"><strong>Data 3</strong></td>
<td align="left"><strong>Data 4</strong></td>
</tr>
<tr class="even">
<td align="left">aaa</td>
<td align="left">bbb</td>
<td align="left">ccc</td>
<td align="left"></td>
</tr>
</tbody>
</table>

Can you try again with the current version? (Though I don't see any fixes that would have changed things since your bug report.)

rmunoz commented 13 years ago

Ok, it works:

$ /home/ricardo/.cabal/bin/pandoc -f html -t textile test.html 

|*Data 1*|*Data 2*|*Data 3*|*Data 4*|
|aaa|bbb|ccc||

^D
*Data 1*
*Data 2*
*Data 3*
*Data 4*
aaa
bbb
ccc

... but when I try to convert to rst, it freezes:

$ /home/ricardo/.cabal/bin/pandoc -f html -t rst test.html 

$ ps --pid 21447 u
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
ricardo  21447 98.4  7.0 292536 281900 pts/3   R+   23:04   0:06 /home/ricardo/.cabal/bin/pandoc -f html -t rst test.html

I'm sorry, I should have said that I'm trying to convert an HTML file to rst format.

jgm commented 13 years ago

Ah, that helps. Here's a more minimal test case:

pandoc -f native -t rst
[Table [] [AlignLeft,AlignLeft] [0.0,0.0]
 []
 [[[Plain [Str "a"]]
  ,[]]
 ,[[Plain [Str "a"]]
  ,[]]]]

The problem is in the RST writer. The table code there is kind of gnarly, and I haven't tracked it down yet.
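
For anyone retrying the minimal case, it may help to run it under a timeout so a hang is reported instead of spinning at full CPU. Below is a rough Python sketch, not part of pandoc: `run_with_timeout` and `check_rst_writer` are hypothetical helper names, the 5-second limit is an arbitrary assumption, and the native input is copied from the test case above. That input uses the table syntax of the pandoc version discussed here, so other releases may reject it rather than hang.

```python
import subprocess

# Native-format table from the minimal test case: two rows, each with an
# empty second cell.
NATIVE_TABLE = """[Table [] [AlignLeft,AlignLeft] [0.0,0.0]
 []
 [[[Plain [Str "a"]]
  ,[]]
 ,[[Plain [Str "a"]]
  ,[]]]]
"""


def run_with_timeout(cmd, stdin_text, timeout_s=5.0):
    """Run cmd with stdin_text on standard input.

    Returns (finished, stdout). finished is False when the process did not
    exit within timeout_s seconds; subprocess.run kills it in that case.
    """
    try:
        proc = subprocess.run(
            cmd,
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False, ""
    return True, proc.stdout


def check_rst_writer(pandoc="pandoc"):
    """Feed the native table to the RST writer (assumes pandoc is on PATH)."""
    finished, out = run_with_timeout(
        [pandoc, "-f", "native", "-t", "rst"], NATIVE_TABLE
    )
    if finished:
        print(out)
    else:
        print("pandoc did not finish within the timeout (possible hang)")
```

Calling `check_rst_writer()` with an affected build should print the timeout message; passing a different path as `pandoc` lets the same check run against another build, such as one installed under `~/.cabal/bin`.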

rmunoz commented 13 years ago

Thanks for your work.