demydd / pandoc

Automatically exported from code.google.com/p/pandoc

Add an OpenDocument writer #61

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
The attached patch adds an OpenDocument writer to the pandoc library.

At present the patch adds library support for generating an OpenDocument
file (content.xml) with all the automatic styles generated. You will still
have to compress it in a zip archive with an appropriate styles.xml (with
the default named styles) in order to open it in OpenOffice.

You can find a sample style.xml here:
http://gorgias.mine.nu/pandoc/

After applying the patch you can run:
pandoc -f markdown -t opendocument -s tests/testsuite.txt > content.xml

To visualize the result with OpenOffice, unzip test.odt (you can find it in
the above mentioned URL), replace "content.xml" and then zip it again with:
zip -9 -r test.odt *
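
The unzip/replace/zip steps above could also be scripted. The following is a minimal Python sketch, not part of the patch: the function name `replace_content_xml` and its parameters are hypothetical, and it assumes a template archive such as test.odt is already on disk. It keeps the `mimetype` entry first and uncompressed, as the OpenDocument packaging convention expects.

```python
import zipfile


def replace_content_xml(template_odt, content_xml, output_odt):
    """Copy every entry of template_odt into output_odt, swapping in a new content.xml.

    All three arguments are file paths (hypothetical names, for illustration only).
    """
    with open(content_xml, "rb") as f:
        new_content = f.read()

    with zipfile.ZipFile(template_odt) as src, \
            zipfile.ZipFile(output_odt, "w", zipfile.ZIP_DEFLATED) as dst:
        names = src.namelist()
        # Put "mimetype" first; the sort is stable, so other entries keep their order.
        names.sort(key=lambda name: name != "mimetype")
        for name in names:
            data = new_content if name == "content.xml" else src.read(name)
            # "mimetype" is stored uncompressed; everything else is deflated.
            method = zipfile.ZIP_STORED if name == "mimetype" else zipfile.ZIP_DEFLATED
            dst.writestr(name, data, compress_type=method)
```

For example, `replace_content_xml("test.odt", "content.xml", "out.odt")` would write a new archive next to the template instead of modifying it in place.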

This file
http://gorgias.mine.nu/pandoc/test.odt
is the test suite produced by the writer.

The patch is still a work in progress.

Known issue:
- it duplicates all the XML-generating code from Writer.Docbook. I need
directions on that: should I export the needed functions from that
file, move them to Shared, or create a new file?

This patch passes testsuite.txt and the table test.

Cheers,
Andrea Rossato

ps: I'll post updates here
pps: it would be so dreadfully great to have a darcs repository to send
patches to!!

Original issue reported on code.google.com by andrea.rossato@gmail.com on 10 Mar 2008 at 6:49

GoogleCodeExporter commented 8 years ago
Thanks!  Let me think about the XML generating code issue.
Do you use git, by the way?  My main pandoc repository is git; I use git-svn
to sync the main googlecode svn repository with it.

Original comment by fiddloso...@gmail.com on 11 Mar 2008 at 3:12

GoogleCodeExporter commented 8 years ago
> Do you use git, by the way?

No, but I can run tailor to convert it to darcs or give git a try. If your git
repository is available from an http server that would be great.

Thanks
Andrea

Original comment by andrea.rossato@gmail.com on 11 Mar 2008 at 1:46

GoogleCodeExporter commented 8 years ago
I'm attaching a tar archive, referenceStyles.tar.gz.

In it there are all the files needed to generate a .odt archive from the output of
pandoc (content.xml). The included content.xml is the test suite, and test.odt is
the generated .odt file.

Known issues (which do not seem to be blocking issues to me):

1. the default bullet for bulleted lists is '*'. This is generated by the writer,
and we should set a different bullet for each level. To be done in OpenDocument.hs.

2. preformatted paragraphs do not work well when inserted in block elements: see
the orange definition in a definition list, see the code in footnotes;

3. some kinds of nested inline elements would require specific styles: see "this is
strong and emph", or "this is strikeout".

The rest seems fine to me. But I'm waiting for bug reports.

Original comment by andrea.rossato@gmail.com on 15 Mar 2008 at 11:44

GoogleCodeExporter commented 8 years ago
I've added the reference styles (as directory 'odt-styles') to opendoc branch of
the git repo. So future changes to the styles can take place in the repository.
Note that reference.odt is generated when needed by the Makefile.

I've also pushed changes that add a markdown2odt script.  It seems to work
well, but probably needs more testing.

Some preliminary observations (based on README.odt):
- the definition list bodies no longer seem to be indented.
- spaces seem to have been converted to tabs in some of the code blocks,
  leading to improper alignment of the table examples.

Original comment by fiddloso...@gmail.com on 15 Mar 2008 at 8:22

GoogleCodeExporter commented 8 years ago
Update:  I was wrong in thinking that there was a problem with indentation in the
definition lists.

Original comment by fiddloso...@gmail.com on 19 Mar 2008 at 6:40

GoogleCodeExporter commented 8 years ago
When using the -S flag, text enclosed in single quotes is transformed into the
'quotation' character style (shows up as italics in NeoOffice for me). But it should
instead transform the quotes into smart quotes, right?

Original comment by dsan...@gmail.com on 19 Mar 2008 at 7:10

GoogleCodeExporter commented 8 years ago
> When using the -S flag, text enclosed in single quotes is transformed into the
> 'quotation' character style (shows up as italics in NeoOffice for me). But it should
> instead transform the quotes into smart quotes, right?

I think you are right, but I need some more info: I see this opt(ion)Smart commented
this way:

"Use smart typography"

What does this mean? If I find out then implementing it should be quite
straightforward...;)

thanks

Original comment by andrea.rossato@gmail.com on 21 Mar 2008 at 7:30

GoogleCodeExporter commented 8 years ago
Fixed in r1261.

(Andrea:  I made my changes consistent with your style, but you may want to check
over the changes.  Smart typography means that regular ascii "quotes", apostrophes,
dashes (--), and ellipses (...) are converted to curly quotes, long dashes, and
proper ellipses.  This is mostly handled in the reader, which puts quoted text in a
Quoted inline element if smart typography is selected.  The writers handle this
differently:  for example, OpenDocument needs to use unicode curly quote characters,
while LaTeX needs `` and ''.)
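
To make the reader/writer split concrete, here is a minimal Python sketch of one quoted element rendered by two writers. It is an illustration only and not pandoc's code: the `Quoted` class and both `render_*` functions are hypothetical stand-ins, and XML/LaTeX escaping of the text is left out.

```python
from dataclasses import dataclass


@dataclass
class Quoted:
    """Hypothetical stand-in for the quoted inline element the reader produces."""
    kind: str  # "single" or "double"
    text: str


def render_opendocument(q):
    # Unicode curly quotes: U+2018/U+2019 for single, U+201C/U+201D for double.
    left, right = ("\u2018", "\u2019") if q.kind == "single" else ("\u201c", "\u201d")
    return left + q.text + right


def render_latex(q):
    # LaTeX quote ligatures: ` and ' for single, `` and '' for double.
    left, right = ("`", "'") if q.kind == "single" else ("``", "''")
    return left + q.text + right
```

The point of the design is that the reader decides once what counts as a quotation, and each writer only chooses how to spell the opening and closing marks.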

Original comment by fiddloso...@gmail.com on 21 Mar 2008 at 3:49

GoogleCodeExporter commented 8 years ago
Additional minor issues:

- we should implement the --toc option.  Most of the code can be borrowed from
  the RTF writer.

- currently there's a blank space left when author, title, or date are null.
  Nothing should be printed in this case.

- we need writer.opendocument and tables.opendocument in tests/.  This can be
  done once we're satisfied with the writer's behavior.

Original comment by fiddloso...@gmail.com on 25 Mar 2008 at 3:04

GoogleCodeExporter commented 8 years ago
Is it possible to have the Author and Date from a % title block show up in a style
other than "Text Body"?

Original comment by deeay...@gmail.com on 2 Jul 2008 at 10:12

GoogleCodeExporter commented 8 years ago
Summary of outstanding issues (as of r1301):

A. problem with nested block quotes (as in testsuite.txt).  Second para not indented.

B. currently there is no distinction between tight and loose lists.  This is because
there is no distinction between Plain and Para.  Para elements should have a bit of
extra space before and after.  Tight lists will also need a bit of extra space after
the whole list.

C. bullet lists (perhaps only when nested inside numbered lists?) are not indented
properly.  See testsuite.txt.

D. the same space is allocated for all enumerators, leading to a crowded look with
wide enumerators (e.g. roman numerals).

E. nested text styles are not handled properly, e.g. strong inside emph.

F. is there a way to include images in the document rather than links to images?

G. author and date should appear in a style other than Text Body.

H. test suite should be added after these fixes are made.
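
Item B depends on telling Plain and Para blocks apart. As a rough illustration, the Python sketch below models a list as a list of items, each item being a list of `(kind, text)` blocks. The representation and the function names are hypothetical, and the rule used here (a list is tight when every item starts with a Plain block) is one plausible reading, not necessarily what the writer ended up doing.

```python
def is_tight(items):
    """Treat a list as tight when every item begins with a Plain block."""
    return all(bool(item) and item[0][0] == "Plain" for item in items)


def needs_paragraph_spacing(block_kind):
    # Para blocks get a bit of extra space before and after; Plain blocks do not.
    return block_kind == "Para"


def needs_trailing_space(items):
    # A tight list has no per-paragraph spacing, so space is added once after the list.
    return is_tight(items)
```

For instance, `is_tight([[("Plain", "a")], [("Plain", "b")]])` is true, while replacing either block with a `("Para", ...)` block makes the list loose.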

Original comment by fiddloso...@gmail.com on 11 Jul 2008 at 4:39

GoogleCodeExporter commented 8 years ago

Original comment by fiddloso...@gmail.com on 11 Jul 2008 at 4:41

GoogleCodeExporter commented 8 years ago
- It would be desirable to have a little cell padding in the tables.

- Also, for some reason the headers in left-aligned table columns are centered, not
left-aligned.

Original comment by fiddloso...@gmail.com on 11 Jul 2008 at 7:10

GoogleCodeExporter commented 8 years ago
A., C. and E. are done. B. requires a style for definition lists, but other lists are
now handled properly.

I'll take care of D and G.

F.: the problem is that a file name must be generated and the file included in the
zip archive. Now, the writer does not operate in the IO monad. How do you handle
these situations in other parts of the code?
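
One possible way to frame the problem in F is to split the work in two: a pure step that only decides archive names for the images, and a separate step that does the file I/O. The Python sketch below illustrates that split under stated assumptions. It is not how pandoc handles it; the function names and the `Pictures/` naming scheme are hypothetical.

```python
import os


def plan_images(image_paths):
    """Pure step: map each distinct image path to an archive name, without touching disk."""
    plan = {}
    # dict.fromkeys drops duplicates while keeping first-seen order.
    for index, path in enumerate(dict.fromkeys(image_paths)):
        extension = os.path.splitext(path)[1]
        plan[path] = "Pictures/%d%s" % (index, extension)
    return plan


def add_images(archive, plan):
    """Impure step: copy each planned file into an open zipfile.ZipFile."""
    for source, archive_name in plan.items():
        archive.write(source, archive_name)
```

With this split, the writer could stay free of I/O and emit the planned names in content.xml, while whatever builds the zip archive calls `add_images` afterwards.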

Original comment by andrea.rossato@gmail.com on 14 Jul 2008 at 4:35

GoogleCodeExporter commented 8 years ago
All these issues have now been dealt with as of r1378.
I'm closing this issue!

Original comment by fiddloso...@gmail.com on 5 Aug 2008 at 10:51