jgm / pandoc

Universal markup converter
https://pandoc.org

Markdown checkboxes #3051

Closed graymalkin closed 5 years ago

graymalkin commented 8 years ago

It would be nice if Pandoc's GFM supported checkboxes, either through an extension or native to the GFM.

 - [ ] example unchecked
 - [x] example checked

I hope I'm not just being thick here, I couldn't find anything about it in the manual, and no one appears to have mentioned it in the issue tracker.

humanfactors commented 8 years ago

I agree, this would be an excellent enhancement. I'm pretty sure it would parse GFM through pdflatex right? There are plenty of simple and elegant solutions to creating a checkbox in LaTeX.

alok commented 7 years ago

I'm also interested in seeing this added to the Pandoc Markdown syntax. It's a common and rather obvious bit of markup, and with it, I think Pandoc Markdown becomes a strict superset of every common Markdown variant.

dimus commented 7 years ago

I would vote on this as well, great feature to have

ickc commented 7 years ago

with it, I think Pandoc Markdown becomes a strict superset of every common Markdown variant.

Not opposing the idea of supporting this feature, but this statement is far from true.


An important question to ask is how it can be done. It seems to be an "AST change" level of difficulty. If it indeed is, I guess it won't be implemented for a long time.

I personally also like to use markdown to manage todo lists. For example, the taskpaper-style syntax @todo does a similar thing (but in pandoc @ has a special meaning), and it has been used in a markdown variant.

graymalkin commented 7 years ago

It seems reasonable that github flavoured markdown should implement GitHub markdown. (Checkboxes aren't part of daringfireball's original markdown, iirc - it's a GitHub extension)

ickc commented 7 years ago

Yes, it should, but again, how?

As far as I understand, pandoc's unique feature is that it implements different markdown extensions separately, so that turning different extensions on or off yields another markdown variant. But the premise is that pandoc already has that extension.

When pandoc doesn't have such an extension, you need to ask what it takes to add it. Is it a document model pandoc already supports? E.g. in the case of multimarkdown inline footnotes, pandoc does not support the syntax, but pandoc's internals can surely handle footnotes (inline or not), so in this case adding such an extension only requires a change in the markdown parser.

But in this case, if I am not mistaken, it requires an AST change, because pandoc's internals just can't represent it. If I'm correct, this puts the required change into the most difficult category: "AST change". It requires a change in all existing writers and readers. And if you look at the graph of supported formats, there are certainly many.

Now, among all these "AST change"-level feature requests (see a list of them by clicking GitHub's label), how many of them are older, or more important? E.g. one of the important feature requests that involves an AST change is column/row spans in tables. To me, that is something much more important and general to have.

So what I'm suggesting is not that it should not happen; I want it to happen too. But I'm saying that given the level of difficulty, the workload of the core developers, the number of open issues, and the priorities they might have, this is unlikely to happen in the foreseeable future.

One thing people often get wrong about the markdown variants in pandoc (say, markdown_github, markdown_mmd, etc.) is assuming they are fully compatible, and whenever one falls short of that, people tend to think the missing piece should be added because pandoc said it supported the variant. But they are not. I know that the manual mentions this limitation, but perhaps it is not emphasized enough. So I guess the most urgent "bug fix" is to emphasize what a markdown variant provided by pandoc means. Note that it is not useless even if not fully compatible. But it is only useful when the limitation is understood.

Again, some of the points above are based on the assumption that it requires an AST change. Feel free to correct me if I'm wrong.

graymalkin commented 7 years ago

Ah, sorry I misunderstood your previous comment.

I don't know how the AST in Pandoc works, I'm here ignorantly opening bug reports with a "would-be-nice-if" flavour. Naively pecking through the code base it looks like the only way to propagate the checkboxes will be a new AST node type.

Is there a node type available for "fallback"? So you can specify 2 different ways to represent a given part of the AST, in the case of Checkbox some Markdown.Checkbox type as well as Teletype with value "[ ]" or "[x]"?

This sort of thing might give developers more freedom to add features in a way which doesn't trample on every single backend. It also encourages a bad behaviour of adding features which aren't going to be visible in every backend, but perhaps that's okay.

ickc commented 7 years ago

As far as I understand, it seems a fallback would not work. I've suggested a similar approach for column/row spans in tables, but they said it won't work. So unfortunately any AST change will be a very daunting task: at least all writers and readers and pandoc-types need to be changed (and sometimes more, say pandoc-citeproc, templates, etc.).

I think the core developers have been thinking about AST changes. I don't know much about it, but if I were to make such a big change, I would want to do it correctly the second time and include as many useful features that require AST changes as possible (so that there's no third time), which only makes the task more daunting.

However, another unique feature of pandoc is its filter system. So if it is something you sorely need right now, I suggest you write a filter to do it. How it should be done depends on your needs, e.g. are you writing to or reading from GFM, and is pandoc markdown only an intermediate format (e.g. you want gfm -> PDF)? If you are interested in writing a filter or need help with that, you can open a thread on pandoc-discuss; lots of experts there can give you advice, and some might even write one for you (don't count on that, however).

jgm commented 7 years ago

Yes, you can write a filter that finds list items of the form

[Plain (Str "[":Space:Str "]":xs)]

and replaces these with e.g.

[Plain (RawInline "html" htmlCheckbox : xs)]

This should work fine for HTML output. RawBlock and RawInline are your "fallback."
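A minimal sketch of such a filter in Python, working on pandoc's JSON representation of the AST with only the standard library. It assumes the reader yields Str "[", Space, Str "]" for an empty box, as above, and a single Str "[x]" for a checked one. The names UNCHECKED_HTML, CHECKED_HTML, convert_item, walk and run_filter are illustrative and not part of pandoc, and matching Para as well as Plain is an addition for loose lists.

```python
import json

# Hypothetical HTML snippets; any markup could be substituted here.
UNCHECKED_HTML = '<input type="checkbox" disabled>'
CHECKED_HTML = '<input type="checkbox" checked disabled>'


def raw_html(markup):
    """Build a RawInline node tagged with the "html" format."""
    return {"t": "RawInline", "c": ["html", markup]}


def convert_item(item):
    """Rewrite one list item (a list of blocks) in place if it starts with a box."""
    if not item or item[0].get("t") not in ("Plain", "Para"):
        return
    inlines = item[0]["c"]
    # Assumption: "[ ]" arrives as Str "[", Space, Str "]".
    if (len(inlines) >= 3
            and inlines[0] == {"t": "Str", "c": "["}
            and inlines[1] == {"t": "Space"}
            and inlines[2] == {"t": "Str", "c": "]"}):
        item[0]["c"] = [raw_html(UNCHECKED_HTML)] + inlines[3:]
    # Assumption: "[x]" has no inner space, so it arrives as one Str.
    elif inlines and inlines[0] in ({"t": "Str", "c": "[x]"},
                                    {"t": "Str", "c": "[X]"}):
        item[0]["c"] = [raw_html(CHECKED_HTML)] + inlines[1:]


def walk(node):
    """Visit every node of the JSON AST and convert bullet-list items."""
    if isinstance(node, list):
        for child in node:
            walk(child)
    elif isinstance(node, dict):
        if node.get("t") == "BulletList":
            for item in node["c"]:
                convert_item(item)
        for value in node.values():
            walk(value)


def run_filter(json_text):
    """Take pandoc's JSON output as text and return the filtered JSON text."""
    doc = json.loads(json_text)
    walk(doc)
    return json.dumps(doc)
```

Used as a `--filter` script, the program would read the JSON document from standard input, pass it through run_filter, and write the result to standard output.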

graymalkin commented 7 years ago

I've written some pandoc filters in the past to do something similar, and I can't say I'm desperate. Like I say, it's a "would-be-nice-if".

If it's going to involve such heavy re-work I'm happy to leave this as WontFix, and use a filter. Thanks for your input :)

tajmone commented 7 years ago

I think that implementing GFM task lists is important. Here is a real-world scenario showing the problems that the lack of this feature can cause.

If I use task lists in a markdown document, like this:

- [ ] Mercury
- [x] Venus
- [x] Earth (Orbit/Moon)

and then I use pandoc to clean up the markdown source:

pandoc -f markdown_github -t markdown_github

the file gets cleaned up, except for the Task List, which gets corrupted by escaping the brackets:

-   \[ \] Mercury
-   \[x\] Venus
-   \[x\] Earth (Orbit/Moon)

That's a pity. Pandoc is a great tool for cleaning up markdown source files (especially with --smart --wrap=none --normalize options): you get properly aligned tables, a standard syntax (where multiple syntaxes are possible), normalization of extra whitespaces, etc. — all of which is not only good for the eye, but also in Git controlled projects, because it reduces diffing nightmares and false positives in status changes.

But right now, this can't be used on GFM docs which make use of Task Lists, or else they break. In many GitHub projects I use batch scripts to clean up all markdown files via pandoc (from GFM to GFM) before committing. This REALLY helps: I work with "lazy" markdown syntax, but after cleanup all files are up to pandoc standard (e.g. I work with Atx-style headers, but commit with Setext-style headers, etc.); but most of all, it makes for much cleaner diffing when merging in contributions and resolving conflicts.

Then I have to choose: either I don't use task lists in markdown docs, or I don't use script automation to clean up source files.

Since Task Lists are part of the GFM standard, they ought to be implemented in pandoc's markdown_github.

ickc commented 7 years ago

@tajmone

Also see this thread in pandoc-discuss. @jgm has specifically said pandoc is not designed as a linter. So we are on our own when we push pandoc beyond what it is designed for.

And please read my comments in this thread. It is likely you don't understand what markdown_github means, the philosophy behind pandoc, and the level of difficulties involved in supporting this feature.

jgm commented 7 years ago

I agree, it would be good to support this somehow. One option that wouldn't require an AST change would be to parse

- [x] foo

as

[BulletList
 [[Plain [Span ("",["checkbox","checked"],[]) [Str "[x]"],
   Space,Str "foo"]]]]

That would give decent output in all writers, and when writing markdown_github we could special-case it and not escape these brackets.

This wouldn't give you nice-looking checkboxes in PDF or HTML output, though. For that, we'd need either a bunch of fairly ugly special-case code in the writers, or an AST change allowing us to represent a list with arbitrary markers.

tajmone commented 7 years ago

@ickc

Thanks. I am aware of the AST problems and complexities. Nonetheless, I wanted to put forth this particular usage case.

So, it seems that the only solution for now would be to create a filter that preserves [ ] and [x] when they are the first three chars at the beginning of a list element. But couldn't this be implemented outside the AST, by having pandoc simply leave them verbatim on the text leaf when working with the markdown_github format?

Unfortunately I have no knowledge of Haskell, so I can't contribute much on this issue. But I could look into creating a filter.

But I did look into the pandoc sources to inspect the AST structure. From what I gather, a checklist is just a list subtype, just as a roman-numeral list is a subtype of an ordered list. Couldn't the AST accommodate an extra attribute specifying that checkboxes are unordered/bullet list items with an extra checkbox qualifier (with an on/off boolean status)? After all, in GFM - [ ] becomes a checkbox which substitutes the original bullet. This approach would mean that checkbox items would convert to normal bullet items during conversion to formats which don't support them, but it would at least allow preserving them in conversion from and to GFM.

@jgm has specifically said pandoc is not designed as a linter. So we are on our own when we push pandoc beyond what it is designed for.

That's a pity though. Pandoc does a good job at cleaning up documents (because of the AST). Maybe in future editions it could have a special --cleanup option to implicitly carry out a --from/--to same-format operation on the input file.

After all, people look to pandoc because they like the idea of having a standalone single-binary (ok, + citeproc) tool to handle format conversion. But if we need to install Node.js, or Python, or Ruby just to access a linter, then its benefits tend to get diluted (and possibly you end up installing a different linter for each format, with dozens of dependencies).

jgm commented 7 years ago

+++ Tristano Ajmone [Dec 14 16 02:47 ]:

@ickc

Thanks. I am aware of the AST problems and complexities. Nonetheless, I wanted to put forth this particular usage case.

So, it seems that the only solution for now would be to create a filter that preserves [ ] and [x] when they are the first three chars at the beginning of a list element. But couldn't this be implemented outside the AST, by having pandoc simply leave them verbatim on the text leaf when working with the markdown_github format?

Yes, we could special-case this in the markdown writer. There's always some chance it would lead to a conflict, e.g. if you had a link reference definition

[x]: foo

then an unescaped

- [x] bar

would become a link.

Unfortunately I have no knowledge of Haskell, so I can't contribute much on this issue. But I could look into creating a filter.

All that would be needed for your purposes would be to identify

Str "[",Space,Str "]"

at the beginning of a list item and replace it with

RawInline (Format "markdown") "[ ]"

which won't be escaped.

But I did look into the pandoc sources to inspect the AST structure. From what I gather, a checklist is just a list subtype, just as a roman-numeral list is a subtype of an ordered list. Couldn't the AST accommodate an extra attribute specifying that checkboxes are unordered/bullet list items with an extra checkbox qualifier (with an on/off boolean status)?

Yes, it's definitely possible to add a new type of list to the AST. But it's a huge change since you then have to change virtually every module in pandoc to deal with the new element.

ickc commented 7 years ago

Yes, we could special-case this in the markdown writer. @jgm

It seems unnatural to special-case this, as such a case (GFM checklist) would only happen when converting both from and to markdown_github; outside this markdown variant, it doesn't mean anything (or, no meaning has been assigned yet).

But I could look into creating a filter. @tajmone

If the only thing you need is to change \[x\] and \[ \] back to [x] and [ ], a post-processor might be simpler, as long as there aren't any such pairs in your document that don't denote a checklist. A filter is definitely stricter and more worry-free.
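A rough sketch of such a post-processor in Python. It is an illustration rather than a tested tool: the names ESCAPED_BOX and unescape_task_boxes are made up here, and it only touches escaped boxes that directly follow a bullet marker at the start of a line, to reduce the risk of rewriting bracket pairs that are not checkboxes.

```python
import re

# A bullet marker ("-", "*" or "+"), then an escaped box such as \[ \] or \[x\].
ESCAPED_BOX = re.compile(r"^([ \t]*[-*+][ \t]+)\\\[([ xX])\\\]", re.MULTILINE)


def unescape_task_boxes(markdown_text):
    """Restore escaped task-list boxes at the start of bullet items."""
    # \1 is the bullet with its spacing, \2 is the box content (space or x).
    return ESCAPED_BOX.sub(r"\1[\2]", markdown_text)
```

It would be run on pandoc's markdown output, for example by piping the output through a small script that calls unescape_task_boxes on the whole text.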

That's a pity though. Pandoc does a good job at cleaning up documents (because of the AST). Maybe in future editions it could have a special --cleanup option to implicitly carry out a --from/--to same-format operation on the input file. @tajmone

I've made a similar suggestion before. But the problem of using pandoc as some sort of "linter" is 2-fold:

  1. configurable styling
  2. being "idempotent", i.e. after it is linted, further linting would not change the document further.

The first one is optional but nice to have as a linter. It can already be done partially by +/- markdown extensions. And this will probably never be the goal of pandoc.

The second one is more critical, but is currently not true. It is very hard to achieve this, and @jgm has mentioned this is the area he wants to improve (but cannot guarantee). The reason it is important is it guarantees the output captures what the AST represents. i.e. its importance is not only for being a linter but any reader/writer pairs in general.
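The idempotence requirement can be written down as a small check. This is only a sketch: pandoc_roundtrip assumes a pandoc executable on the PATH and uses the markdown_github format name from this thread as a default, and both function names are hypothetical.

```python
import subprocess


def pandoc_roundtrip(text, fmt="markdown_github"):
    """Run `pandoc -f fmt -t fmt` on text; requires pandoc on the PATH."""
    result = subprocess.run(
        ["pandoc", "-f", fmt, "-t", fmt],
        input=text, capture_output=True, text=True, check=True,
    )
    return result.stdout


def is_idempotent(normalize, text):
    """True if normalizing an already-normalized text changes nothing."""
    once = normalize(text)
    return normalize(once) == once
```

Calling is_idempotent(pandoc_roundtrip, source) on a document then tells you whether a second "linting" pass would still change it.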

See more on this topic in How to programmatically enforcing a pandoc markdown style - Google Groups. (I clicked the link I used to refer to this in my last post, but the link is wrong. This is not the first time I have had problems posting a link to a certain post on pandoc-discuss, probably related to the mobile version of Google Groups. If the link doesn't work, search for the topic there and you'll find it.)

After all, people look to pandoc because they like the idea of having a standalone single-binary (ok, + citeproc) tool to handle format conversion. But if we need to install Node.js, or Python, or Ruby just to access a linter, then its benefits tend to get diluted (and possibly you end up installing a different linter for each format, with dozens of dependencies). @tajmone

  1. Although pandoc isn't designed as a linter, that can't stop us from trying to use it like one. For example, I use pandoc to "define" a markdown variant I like (by using +/- extensions, filters, pre/post-processors, etc.) and use that to lint some of my md. This cannot be done by any other linter since it is a "custom markdown variant".

  2. On the other hand, if there exists a linter designed for the markdown variant you specifically use, you should almost definitely use that, since it will be more reliable (it is designed for that purpose rather than being pushed beyond what it is designed for).

I agree, it would be good to support this somehow. One option that wouldn't require an AST change would be to parse

- [x] foo

as

[BulletList
 [[Plain [Span ("",["checkbox","checked"],[]) [Str "[x]"],
   Space,Str "foo"]]]]

@jgm

This approach is interesting, since it circumvents the need for an "AST change". On one hand, it feels unnatural. But on the other hand, if it is functionally equivalent to an "AST change" without an "AST change", maybe we shouldn't care too much about being "syntactically correct".

Just to bring this out explicitly, the example at the beginning of this thread is rendered by GitHub as

<ul class="contains-task-list">
<li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" disabled=""> example unchecked</li>
<li class="task-list-item"><input type="checkbox" class="task-list-item-checkbox" checked="" disabled=""> example checked</li>
</ul>

tarleb commented 7 years ago

It seems unnatural to special-case this, as such a case (GFM checklist) would only happen when converting both from and to markdown_github; outside this markdown variant, it doesn't mean anything (or, no meaning has been assigned yet).

Not sure what you mean here. Org-mode has a similar feature, and checkboxes can be output in HTML and probably in docx and odf. There's a textile issue about implementing a feature like this, too.

tajmone commented 7 years ago

Lots of cool suggestions here! Hopefully this feature might be first implemented via some filters or custom readers and writers, to test the grounds with different approaches ...

What could be a way to represent checkboxes in non-HTML formats? I remember coming across various solutions in doc/pdf, like using some common dingbats (of the sort you should find on all OSs).

I've found an MS Office Help article suggesting use of Wingdings font.

Unicode symbols could be a more universal approach, provided the font being used contains the glyphs (I think there is a fallback mechanism for missing glyphs, resorting to use default fonts). On Wikipedia I've found some unicode symbols that might do the job:

Code    Symbol  Name
U+237B  ⍻       NOT CHECK MARK
U+2611  ☑       BALLOT BOX WITH CHECK
U+2705  ✅      WHITE HEAVY CHECK MARK
U+2713  ✓       CHECK MARK
U+2714  ✔       HEAVY CHECK MARK
U+2610  ☐       BALLOT BOX (checkbox)
U+2612  ☒       BALLOT BOX WITH X (square with cross)
U+2717  ✗       BALLOT X (cross)
U+2718  ✘       HEAVY BALLOT X (bold cross)

The choice is between using a pair of checkbox symbols (ticked box / empty box) or check marks (tick / cross). The latter is often confusing.

Here in Italy we use both systems, and the checkboxes can be interpreted differently, depending on whether or not there is a distinction between check- and X-marks:

(3-states approach)
 [v] = yes           |   [x] = no   |   [ ] = irrelevant

or:

(boolean approach)
 [x]  or  [v] = yes  |   [ ] = no

I like the GFM checkbox because it is clearly a yes/no binary choice. But maybe for formats other than html there might be some other standard ways in place, which I am not aware of.

ickc commented 7 years ago

Not sure what you mean here. Org-mode has a similar feature, and checkboxes can be output in HTML and probably in docx and odf. There's a textile issue about implementing a feature like this, too. @tarleb

If you put the quote in context, I guess @jgm means special-casing that particular combination in the markdown writer (while leaving the AST and every other writer untouched). That discussion is independent of implementing the whole checklist feature, which would then no longer be a "special case".

cdornan commented 7 years ago

I would also like this feature and am having to work around its absence. GFM task lists have become really quite prevalent. I understand the difficulties of adding them, but surely it is only a matter of time before they get added?

craigforr commented 7 years ago

+1 vote for GFM checkboxes support when exported as markdown_github.

As for presentation, I would vote for the simple <input type="checkbox" checked disabled> that GitHub itself renders, or Ballot Box and Ballot Box with Check if Unicode characters are used.

Bullets with hyphens/dashes:

Bullets with asterisks/stars:

And the case of the X chars used to check the box is irrelevant when GitHub renders them.

ec1oud commented 7 years ago

I'm voting for this too.

ickc commented 7 years ago

Sorry for being the bad guy, but if one wants to vote, use the emoji reactions on the right of each message. The difference is that they won't notify people and cause spam.

e.g. you can read more about this in Reactions to Pull Requests, Issues, and Comments · Issue #141 · dear-github/dear-github.

In some repos, a thread like this would be locked very soon (but not in pandoc, because the developers here are nice). It is not that the developers don't see value in this issue (see @jgm's comment above for example), but it is difficult to handle it properly (and if one wants a hackish approach, suggestions have already been made above).

ec1oud commented 7 years ago

Sorry.

I actually wanted to start by just showing some formatting (including checkboxes) in a terminal-based viewer. One that I can use as a less filter without needing to invoke a web browser just for that. There are hacks where less invokes pandoc to generate man output which is then processed by groff and then viewed with man; or, have it generate html and then view with w3m or lynx. But it's practically plain text with just a little formatting, usually, so that all seems like overkill to me. I'm kindof surprised that when working with github repos that typically contain markdown files, nobody has a better way to just view the markdown nicely. Or edit it in anything other than a plain text-editor, or some ridiculous webkit/javascript mashup.

So I started throwing one together in go for now: https://github.com/ec1oud/mdcat (and using my fork of blackfriday https://github.com/ec1oud/blackfriday ) mainly because the blackfriday parser seemed like a good starting point, and because I've been curious about go. (Probably Haskell is better, but I haven't gotten around to climbing that learning curve yet.)

At some point hopefully the world will stop calling this feature something from "github markdown" and expect it to be part of markdown itself. A de-facto extension, or even part of the standard. IMO it's one of the most useful extensions of all, and it's also easy to implement.

I think pandoc should also have an output mode for ANSI terminal codes (to style some text spans, like headings and emphasized phrases) plus unicode (for checkboxes, bullets, fractions, "smartypants" quotes and ellipses etc., block quote indentation bars, and box drawing around tables). Then it could be used directly as a filter for less.

ickc commented 7 years ago

IMO it's one of the most useful extensions of all, and it's also easy to implement.

If I understand you correctly that you mean it is easy to implement GitHub checklist in pandoc, then my point all along is that it is actually not. Try to follow the discussion above.

P.S. I'd consider discussion like this helpful though, unlike the voting message above. And I've been there too, so don't worry.

ec1oud commented 7 years ago

Because you have an AST, it needs to be extended for this. I get it.

ickc commented 7 years ago

Because you have an AST, it needs to be extended for this. I get it.

Not only this: because pandoc is meant to be a "universal document converter", it has so many writers/readers that even a simple change to the AST means the tedious task of "supporting" it in all readers/writers. "Support" here doesn't mean they have to be able to render it; even if you want to leave that element alone, you need to add code to the reader/writer to do so.

However, pandoc is right now going through a transition to pandoc 2.0. This gives the developers some more freedom to make "breaking changes". There are a couple of AST changes pending implementation. That said, I'm not sure whether this issue will be targeted by pandoc 2.0, because developers' time is still limited.

tajmone commented 7 years ago

I wanted to let everyone know that I've managed to create a workaround for using GFM Task Lists within pandoc documents (it only works in markdown and HTML output):

This approach uses pp macros to preprocess the markdown source before piping it to pandoc; the macros produce Raw HTML in the final document. CSS styling is recommended but not mandatory (GitHub CSS classes just hide the original bullet on list elements, but the checkboxes will show even without GH stylesheets).

This is how the macros work:

!raw(
!TaskList(
!Task[x][I'm a _checked_ task]
!Task[ ][I'm an _unchecked_ task]
)
)

... producing the following Raw HTML:

<ul class="task-list">
<li class="task-list-item"><input type="checkbox" disabled="" checked="">I’m a <em>checked</em> task</li>
<li class="task-list-item"><input type="checkbox" disabled="">I’m an <em>unchecked</em> task</li>
</ul>

... which then will be previewed as (what follows is the actual Raw HTML pasted, not in GFM):

Other macros are available to extend the power of pandoc, and more are coming soon. You can see a live HTML preview of a custom HTML template mimicking GitHub CSS and using pp-macros for GFM Task Lists and other advanced formatting (like GitHub alerts):

This project was made possible thanks to Christophe Delord's (@CDSoft) kind support and dedication to implementing my new features requests for pp:

https://github.com/CDSoft/pp

After months of testing and fine-tuning (and three pp releases introducing new macros), I'm confident that the pp-macros library project launched yesterday might spark inspiration and collaboration to extend the pandoc markdown workflow. For example, I've currently added a macro to integrate André Simons' syntax highlighter in the workflow: a simple macro in the document allows importing an external source file as highlighted Raw HTML, and this doesn't interfere with pandoc's own highlighter, since pandoc only highlights markdown code blocks and will ignore the highlighted <pre><code> block.

I've also effectively used macros to implement AsciiDoctor tables from external files, or parse and import any external AsciiDoc snippet via AsciiDoctor.

So I hope to see soon contributors joining the Pandoc-Goodies project, and see the pp-macros library grow. Please don't forget to consider starring and linking the repo.

Best regards

glenndevenish commented 7 years ago

What's the limit for nesting of these? I've run out after 3 spaces. Also, what about semi-fulfilled tasks? (indicated by a square in the checkbox). Maybe [#] if that isn't anything already.

tajmone commented 7 years ago

What's the limit for nesting of these? I've run out after 3 spaces.

I'm not sure what you mean by "3 spaces" ... Anyhow, Task Lists should follow the same nesting rules as other lists (bullet and ordered), i.e. depending on the markdown engine you're using, there will be some limit on the level of nested elements to prevent infinite nesting. But this will vary from one markdown engine to another.

(Since pandoc doesn't currently support Task Lists, I'm assuming your question was a general question on markdown)

Also, what about semi-fulfilled tasks? (indicated by a square in the checkbox). Maybe [#] if that isn't anything already.

As far as my knowledge goes, Task Lists are specific to GitHub flavored markdown — which doesn't allow semi-fulfilled tasks. But you could implement this feature quite easily using custom PP macros (see my previous comment), and use Task Lists in pandoc markdown sources too.

As for nesting Task Lists via PP macros, you'll have to create an extra macro that will take care of emitting the required HTML tags for nesting a list within another ... It should be quite simple, but its markdown source might not look as neat as a simple task list — probably something like this:

!TaskList(
   !Task[x][I'm a _checked_ task]
   !Task[ ][I'm an _unchecked_ task]
   !NestedTaskList(
     !Task[x][I'm a nested task]
     !Task[x][I'm a nested task]
   )
)

tarleb commented 6 years ago

We've added a filter task-list to the pandoc/lua-filters repo. It renders task lists in a style similar to that used by GitHub. It works with GitHub-flavored Markdown and org-mode task lists; in fact, any input format can be used as long as the same [ ]/[x] syntax is used for checkboxes. If the output format is neither HTML, gfm, nor org-mode, then the checked and unchecked boxes will be rendered as ☑ and ☐, respectively.
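The format-dependent rendering described here can be pictured with a short Python sketch. This is not the code of the task-list filter (which is written in Lua); checkbox_inline is a hypothetical helper that returns pandoc JSON inlines, and the assumption that the gfm and org writers pass through raw inlines tagged with their own format name is untested.

```python
def checkbox_inline(checked, output_format):
    """Return a pandoc JSON inline representing one checkbox."""
    if output_format in ("html", "html4", "html5"):
        attrs = " checked" if checked else ""
        markup = '<input type="checkbox" disabled%s>' % attrs
        return {"t": "RawInline", "c": ["html", markup]}
    if output_format in ("gfm", "org"):
        # Assumption: the writer emits raw text of its own format unchanged.
        box = "[x]" if checked else "[ ]"
        return {"t": "RawInline", "c": [output_format, box]}
    # Every other format falls back to Unicode ballot boxes (U+2611 / U+2610).
    return {"t": "Str", "c": "\u2611" if checked else "\u2610"}
```

A JSON filter receives the target format as its first command-line argument, so a script could pass sys.argv[1] as output_format.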

tatoosh commented 6 years ago

I don't understand how to render HTML checkboxes from [x] or [ ]. Do I need to include a filter?

glenndevenish commented 6 years ago

@tatoosh you do it like this:

- [x] Foo: dash, space, [, x, ]
- [ ] Bar: dash, space, [, space, ]

ghost commented 6 years ago

If this is fixed, can this be closed? Are there remaining items?

jgm commented 6 years ago

It's supported with a filter, not natively. If no one cares about native support, we could close this. But probably we should consider adding native support. (We could emulate what the filter does.)

lollipopman commented 6 years ago

I would love native support!

tajmone commented 6 years ago

But probably we should consider adding native support. (We could emulate what the filter does.)

It would be definitely a good idea since GFM is a supported format (and a widely used one too).

reagle commented 6 years ago

I'd like native support.

ickc commented 6 years ago

@reagle just spammed everyone in this thread without adding anything new. Or do you think people in this thread don’t want to see native support? Stop doing this anywhere on GitHub.

reagle commented 6 years ago

Sorry, I generally don't but I read "If no one cares about native support, we could close this" as a question. I should've upvoted the existing response.

nichtich commented 5 years ago

I'd favor native support because of the popularity of this feature. Checkboxes are also supported in other applications, e.g. iA Writer calls them task lists. Fortunately, checkboxes can be supported natively with an extension. Just as +smart replaces several character sequences with Unicode characters, an extension +checkboxes could replace [ ] and [x] inside list items. For example,

pandoc -f markdown+checkboxes -t markdown

would transform

 - [ ] example unchecked
 - [x] example checked

into

 - ☐ example unchecked
 - ☒ example checked

and with output format html+checkboxes:

<ul>
<li><input type="checkbox"> example unchecked</li>
<li><input type="checkbox" checked> example checked</li>
</ul>

This requires no change to the AST because checkboxes are internally represented by Unicode characters.
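To make the writer side of this proposal concrete, here is a simplified Python sketch. It is not pandoc code: it treats a list item as a plain string, the +checkboxes extension is only the proposal from this comment, and list_item_html and BOX_TO_INPUT are names invented for the illustration.

```python
import html

# Leading Unicode box characters and the HTML they would map to.
BOX_TO_INPUT = {
    "\u2610": '<input type="checkbox">',          # BALLOT BOX
    "\u2612": '<input type="checkbox" checked>',  # BALLOT BOX WITH X
}


def list_item_html(text, checkboxes=True):
    """Render one list item; map a leading box to <input> when enabled."""
    marker, rest = text[:1], text[1:]
    if checkboxes and marker in BOX_TO_INPUT:
        return "<li>" + BOX_TO_INPUT[marker] + html.escape(rest) + "</li>"
    # With the extension off, the character is written out literally.
    return "<li>" + html.escape(text) + "</li>"
```

With checkboxes=False the box character survives as text, which is how a literal ☐ could still be written to HTML under this scheme.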

tajmone commented 5 years ago

Fortunately checkboxes can natively be supported with an extension [...] so extension +checkboxes could replace [ ] and [x] inside list items.

I really like the idea! it would introduce native support for checkboxes, but at the same time it won't impose them on users.

I also propose that the checkbox feature become part of the pandoc markdown specification, and that the extension be enabled by default for pandoc markdown, unless there are impediments due to some other output formats not being able to represent them.

lollipopman commented 5 years ago

I think this would definitely be better than no support at all, but having a native representation in the AST would allow you to do cool things like use checkbox html elements that would be clickable in the browser.

nichtich commented 5 years ago

@lollipopman I'd expect the html writer to output with extension +checkboxes:

<ul>
<li><input type="checkbox"> example unchecked</li>
<li><input type="checkbox" checked> example checked</li>
</ul>

and with -checkboxes:

<ul>
<li>☐ example unchecked</li>
<li>☒ example checked</li>
</ul>

what more do you want?

lollipopman commented 5 years ago

I was thinking something like this:

<ul>
  <li><input type="checkbox" checked />checked</li>
  <li><input type="checkbox" />unchecked</li>
</ul>

[screenshot]

tarleb commented 5 years ago

@lollipopman the task-list filter posted above does exactly that. It serves as a stopgap measure until the best way to handle checkboxes has been decided.

mb21 commented 5 years ago

This requires no change to the AST because checkboxes are internally represented by Unicode characters.

That sounds like a great idea! The only output format that has a special way to represent checkboxes is HTML anyway, and for the others the unicode is a perfect fallback – which we get for free in this proposal.

quasicomputational commented 5 years ago

If the HTML writer does some magic for that character, how would you literally have that character in HTML output?

mb21 commented 5 years ago

@quasicomputational Yes, the markdown writer should also be sensitive to the checkboxes option. But I think it's fine to enable by default... alternatively, you could use a filter to transform to raw html or generic raw attributes.

OleMussmann commented 5 years ago

@lollipopman

I was thinking something like this:

<ul>
  <li><input type="checkbox" checked />checked</li>
  <li><input type="checkbox" />unchecked</li>
</ul>

[screenshot]

As far as I know, the GitHub checkboxes don't have bullets and are disabled. It should be something along these lines, then:

<ul style="list-style-type: none; padding: 0 7px;">
    <li><input type="checkbox" checked disabled> pet kittens </li>
    <li><input type="checkbox" disabled> world domination </li>
</ul>

Some padding might be necessary for indentation.

[screenshot]