I forgot to mention: the patch attached above was produced by me as
work-for-hire for the Software Freedom Law Center. By my authority as CTO of
the SFLC, and with further approval of our Director, I disclaim any copyright
interest in the patch that I or the Software Freedom Law Center hold.
Original comment by Bradley.Sif@gmail.com
on 15 Jul 2007 at 4:30
Many thanks for the patch! Before adding anything to pandoc, though, I want to
think a bit more about what a Strikeout inline element would be used for. I'm
guessing it's used mainly for tracking deletions to a document. If that's right,
the following concerns come to mind:
- Additions are as important to track as deletions. So if there's an element to
represent text as deleted, shouldn't there also be an element to represent text as
inserted? (HTML has the pair <del> and <ins> for this purpose.)
- The Strikeout inline element would have limitations in tracking deletions.
It could track deleted inline elements. But (a) it would not be able to track
deletions in *parts* of inline elements (such as the title field of a link,
or the text of a Code element, which is represented as a string). And (b) it
would not be able to track deletions of block elements. (Sure, you could
surround all the inline elements in the block with ~~, but that would be
extremely tiresome in, say, a nested list.) Perhaps (b) could be fixed by
adding a Strikeout block element, but this would not help with (a). You
wouldn't be able to strike out one line in a CodeBlock, for example. The best
you could do would be to strike out the whole block.
- All of this makes me wonder whether there's a better solution to change
tracking, one that does not require any changes to pandoc's document structure.
One idea would be to use a diff-like program to compare the HTML versions of
two pandoc documents and insert <ins> and <del> tags accordingly. (Contents
of <del> tags are represented by default as strikeout.) I wrote a little
Wiki using pandoc that does this very nicely (using Data.List.LCS.HuntSzymanski)
on a character-by-character basis (not line-by-line as with diff). It would be
a bit more difficult to do this with LaTeX, because of the way verbatim data is
insulated from the rest, but this would mostly just be a problem with contents of
code blocks.
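
The diff-based idea in the last bullet can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the wiki mentioned above: it uses Python's standard `difflib.SequenceMatcher` in place of `Data.List.LCS.HuntSzymanski`, it compares two plain-text strings rather than rendered HTML (a real version would have to avoid splitting existing tags), and the function name `mark_changes` is made up for the example.

```python
from difflib import SequenceMatcher
from html import escape


def mark_changes(old: str, new: str) -> str:
    """Annotate a character-by-character comparison of `old` and `new`.

    Text only in `old` is wrapped in <del>, text only in `new` in <ins>.
    Hypothetical helper for illustration; not part of pandoc.
    """
    # autojunk=False stops difflib from ignoring very frequent characters.
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            pieces.append(escape(old[i1:i2]))
            continue
        # A "replace" is emitted as a deletion followed by an insertion.
        if tag in ("delete", "replace"):
            pieces.append("<del>" + escape(old[i1:i2]) + "</del>")
        if tag in ("insert", "replace"):
            pieces.append("<ins>" + escape(new[j1:j2]) + "</ins>")
    return "".join(pieces)


if __name__ == "__main__":
    print(mark_changes("the cat", "the cat sat"))
    # prints: the cat<ins> sat</ins>
```

Because the comparison runs over characters rather than lines, a one-word change inside a long paragraph is marked as just that word (or part of it) instead of the whole line.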
It would be useful to hear your thoughts about these concerns, and also to hear
a bit more about how you've been using Strikeout.
Original comment by fiddloso...@gmail.com
on 15 Jul 2007 at 3:33
[ BTW, although it's off-topic for this discussion, I wanted to mention to you
at some point how excellent pandoc is! I would have mentioned it in my first
post had I not wanted to stay on-topic when creating the ticket. :) ]
You are quite right that strikeout is often used for change-tracking. I agree
with you that a larger system to help pandoc create change-tracking of documents
would be extremely useful, and I'd love to see it implemented (and may even be
willing to help with it, as I have a general need for that too -- but we should
probably have that discussion on a different forum).
However, I feel that such a feature is a separate issue entirely. Many document
formats (DocBook, RTF, LaTeX, many wiki engines, and HTML with the deprecated
<s> and <strike> tags) allow the user to put in strikeout as a text markup, just
like italics, underlining, and bold. If pandoc encounters such markup in
someone's existing document, it should do the right thing with it (basically)
regardless of other features pandoc may have to help with change-tracking.
To answer your question about what inspired me to add the feature: I was
originally drawn to pandoc as a way to easily build S5 slides from Markdown and
other easily-editable formats. I work with lawyers (www.softwarefreedom.org, if
you are interested), and they give lots of presentations, and I'm trying to keep
them from using Impress (yuck!). I'm giving them pandoc with its S5 generation
ability and Markdown as a source format as a way to make easy slides.
One of the items they often need in a presentation slide is to show differences
in legal text from earlier revisions of the same document. The example that
inspired us to add strikeout was a set of slides that compare the text of GPLv2
and GPLv3. Now, I grant you that we're showing markup for change tracking. But,
in the case of an S5-formatted slide generated from Markdown, that's not really
a fundamental issue when producing that document. The fundamental issue is that
someone took the *output* of some change-tracking system and now they want to
display that output in a reasonable way.
In summary, my reasons for adding it the way I did are two-fold. First, many
formats have a native way of representing strikeout anyway, and therefore pandoc
will encounter that markup in its usual conversion work, and should DTRT when it
does. Second, there are times when users will just want to represent in a
reasonable way some change markup from another source, and they may not even
have the original two documents around to produce proper change markup via this
new feature of pandoc you mention.
Original comment by Bradley.Sif@gmail.com
on 16 Jul 2007 at 5:17
Thanks for the clarification. That makes sense, and I plan to incorporate your
Strikeout changes into pandoc soon. They will probably make it into the 0.4
release due later this summer. I will have to change the syntax, though, because
I was already planning to use tildes for subscripts (as in H~2~O). I think that
a double tilde would make sense here:
This text ~~has been deleted~~.
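
To show how the two tilde syntaxes can coexist, here is a rough Python sketch. It is not pandoc's parser: the regular expressions, the function name `render_inline`, and the choice of `<del>` and `<sub>` as output tags are all assumptions made for the example. The point is only that handling the double tilde first leaves single tildes free for subscripts.

```python
import re
from html import escape

# Double tildes delimit struck-out text; handled first so they are not
# mistaken for two adjacent subscript markers.
STRIKEOUT = re.compile(r"~~(.+?)~~")
# Single tildes delimit a subscript with no spaces or tildes inside.
SUBSCRIPT = re.compile(r"~([^~\s]+)~")


def render_inline(text: str) -> str:
    """Convert ~~strikeout~~ and ~subscript~ markers to HTML tags.

    Hypothetical helper for illustration only.
    """
    html = escape(text)
    html = STRIKEOUT.sub(r"<del>\1</del>", html)
    html = SUBSCRIPT.sub(r"<sub>\1</sub>", html)
    return html


if __name__ == "__main__":
    print(render_inline("This text ~~has been deleted~~."))
    # prints: This text <del>has been deleted</del>.
    print(render_inline("H~2~O"))
    # prints: H<sub>2</sub>O
```
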
Thanks again for the patch and the comments!
Original comment by fiddloso...@gmail.com
on 19 Jul 2007 at 5:45
Strikeout has been added to pandoc, along with superscripts and subscripts,
as of r778. Thanks again!
Original comment by fiddloso...@gmail.com
on 22 Jul 2007 at 10:05
Original issue reported on code.google.com by
Bradley.Sif@gmail.com
on 15 Jul 2007 at 4:28