jgm / pandoc

Universal markup converter
https://pandoc.org

List of tables, list of equations, list of figures, and list of listings in html, odt, docx... #1965

Open mbacou opened 9 years ago

mbacou commented 9 years ago

List of tables, list of equations, list of figures, and list of listings are needed in all writers, not just LaTeX. These are important elements in any technical or scientific paper and report, as important as citations and bibliographies. Could you clarify whether this feature might be implemented in all common formats (.html, .odt, .docx, etc.)? Thanks!

mpickering commented 9 years ago

I think this would be desirable, but without having looked at the current implementation, I suspect the work would be quite involved.

mbacou commented 9 years ago

Hi Matt. Right, each writer would probably implement these lists quite differently, but it might be worth starting with the most common ones, like HTML and ODT. Markdown is so powerful for working with long shared documents, but having to compile these listings by hand is cumbersome (outside of LaTeX, that is). HTML might be the easiest of all, as there is no concept of page numbers there, just links to the figures and tables in the document.

mb21 commented 5 years ago

Closing in favour of https://github.com/jgm/pandoc/issues/813

jgm commented 5 years ago

I think this should be left open; it is independent of 813.

mb21 commented 5 years ago

@jgm Well... in my mind 813 has morphed into "Add attributes, auto_identifiers to tables and figures and generate listings of them". But yes, maybe it makes sense to keep it separate.

ghost commented 5 years ago

TeX-based typesetting engines have a way of generating lists of tables, figures, equations, contents, etc. Typically it's fairly trivial, such as the following ConTeXt example:

\starttext
  \input cover

  \startfrontmatter
    \completecontent[alternative=c]
  \stopfrontmatter

  \startbodymatter
    \input body
  \stopbodymatter
\stoptext

The \completecontent[alternative=c] inserts the table of contents, \completelistoffigures would insert a list of figures, etc. OpenOffice also has a mechanism for separate indexes based on tagged document content.

Primarily, this would benefit HTML, which lacks a simple listings mechanism derived from the document structure.

countofsanfrancisco commented 4 years ago

I would like this enhancement too. Has this issue made any forward progress?

For a Markdown-to-HTML conversion, suppose the Markdown contains something like this:

<p id="table-some-name">My Table Caption 1</p>
<p id="table-another-name">My Other Table Caption 2</p>
<p id="figure-a-certain-img">A Figure Caption</p>
<p id="figure-another-img">Another Figure Image</p>

Pandoc could key off the "table-" and "figure-" id prefixes to tell a figure from a table, and automatically generate a list of tables or a list of figures from the Markdown text, using the enclosed text as the caption for each entry. For figures, you can use the Markdown image syntax too.

For me, I don't need the figures or tables to be numbered.
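The id-prefix idea above can be sketched outside of pandoc as a small post-processing step on the HTML output. This is only an illustration, not something pandoc does: it uses Python's standard `html.parser`, and the names `CaptionCollector`, `build_list` and `lists_for` are hypothetical. It assumes captions live in `<p>` elements with ids starting with `table-` or `figure-`, as in the snippet above, and it produces unnumbered lists of links.

```python
from html import escape
from html.parser import HTMLParser


class CaptionCollector(HTMLParser):
    """Collect (id, caption text) pairs from <p> elements with a known id prefix."""

    PREFIXES = ("table-", "figure-")  # assumed naming convention from the comment above

    def __init__(self):
        super().__init__()
        self.entries = {prefix: [] for prefix in self.PREFIXES}
        self._current = None  # (prefix, id, text parts) while inside a matching <p>

    def handle_starttag(self, tag, attrs):
        elem_id = dict(attrs).get("id") or ""
        for prefix in self.PREFIXES:
            if tag == "p" and elem_id.startswith(prefix):
                self._current = (prefix, elem_id, [])

    def handle_data(self, data):
        if self._current is not None:
            self._current[2].append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._current is not None:
            prefix, elem_id, parts = self._current
            self.entries[prefix].append((elem_id, "".join(parts).strip()))
            self._current = None


def build_list(entries):
    """Render (id, caption) pairs as an unnumbered HTML list of in-page links."""
    items = "\n".join(
        f'  <li><a href="#{escape(i, quote=True)}">{escape(c)}</a></li>'
        for i, c in entries
    )
    return f"<ul>\n{items}\n</ul>"


def lists_for(html_text):
    """Return the list of tables and list of figures for one HTML document."""
    collector = CaptionCollector()
    collector.feed(html_text)
    collector.close()
    return {
        "tables": build_list(collector.entries["table-"]),
        "figures": build_list(collector.entries["figure-"]),
    }
```

Calling `lists_for(html)` on the converted document returns two HTML fragments that could be pasted wherever the lists should appear. Since HTML has no page numbers, plain links are all that is needed.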

isaactpetersen commented 3 years ago

I vote for this feature, as well. It would be nice to be able to output lists of tables and figures to html and other formats.

acxz commented 1 month ago

Related issue for docx: https://github.com/jgm/pandoc/issues/8245

bpj commented 1 month ago

The filter pandoc-crossref generates a list of tables (LOT), a list of figures (LOF), a list of code blocks/listings and more for LaTeX and HTML-like formats. For LaTeX it mostly offloads the work to LaTeX itself, but it provides syntax for captions, labels and cross-references, which Pandoc Markdown (mostly) lacks.

A list of tables and a list of figures should be relatively straightforward to do with a Lua filter for formats where the list need only contain links, and not page/chapter numbers or the like, since those elements already have captions. The filter would first walk the document top down and collect the captions (possibly inserting numbering into each caption), then do a second pass looking for divs with appropriate ids, such as #list-of-tables, and generate and insert the lists in those divs.
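The two-pass approach described above can be sketched as follows. The comment proposes a Lua filter; this sketch instead uses a Python JSON filter purely for illustration. It assumes the pandoc 3.x JSON AST shapes as I understand them (`Figure` as `[attr, caption, blocks]`, `Table` with `attr` and `caption` first, `Div` as `[attr, blocks]`, captions as `[short, blocks]`), which are worth checking against the output of `pandoc -t json`. The div id `list-of-figures` and all function names are hypothetical. Numbering is left out, and only top-level blocks, divs and figure contents are searched.

```python
import json
import sys

# Assumed convention: a div with one of these ids is where the list goes.
TARGETS = {"list-of-figures": "Figure", "list-of-tables": "Table"}


def stringify(node):
    """Flatten AST nodes to plain text (only Str and space-like nodes contribute)."""
    if isinstance(node, list):
        return "".join(stringify(n) for n in node)
    if isinstance(node, dict):
        t = node.get("t")
        if t == "Str":
            return node["c"]
        if t in ("Space", "SoftBreak"):
            return " "
        return stringify(node.get("c", []))
    return ""  # bare strings (ids, URLs, code text) are ignored


def collect(blocks, found):
    """Pass 1: gather (id, caption text) for figures and tables that have an id."""
    for b in blocks:
        t = b.get("t")
        if t in ("Figure", "Table"):
            attr, caption = b["c"][0], b["c"][1]
            if attr[0]:
                found[t].append((attr[0], stringify(caption[1])))
        if t == "Div":
            collect(b["c"][1], found)
        elif t == "Figure":
            collect(b["c"][2], found)  # figures can contain nested blocks
    return found


def link_item(ident, text):
    """One bullet item: a link to #ident; the caption is kept as a single Str."""
    link = {"t": "Link",
            "c": [["", [], []], [{"t": "Str", "c": text}], ["#" + ident, ""]]}
    return [{"t": "Plain", "c": [link]}]


def fill(blocks, found):
    """Pass 2: replace the contents of the target divs with the generated lists."""
    for b in blocks:
        if b.get("t") == "Div":
            div_id = b["c"][0][0]
            if div_id in TARGETS:
                items = [link_item(i, c) for i, c in found[TARGETS[div_id]]]
                b["c"][1] = [{"t": "BulletList", "c": items}] if items else []
            else:
                fill(b["c"][1], found)


def add_lists(doc):
    found = collect(doc["blocks"], {"Figure": [], "Table": []})
    fill(doc["blocks"], found)
    return doc


def main():
    # pandoc --filter passes the JSON AST on stdin and reads the result from stdout.
    json.dump(add_lists(json.load(sys.stdin)), sys.stdout)
```

To try it as a filter, the script would call `main()` at the bottom and be passed to `pandoc --filter`. Collecting everything before filling means a list div placed at the top of the document still sees figures that come after it, which is the reason for two passes.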

acxz commented 1 month ago

Here is a pandoc-crossref related issue: https://github.com/lierdakil/pandoc-crossref/issues/299 regarding docx lot, lof.

bpj commented 1 month ago

Come to think of it, would docx generate those lists automatically if you inserted the right snippet of XML somewhere? Maybe that can be done with a template now?

iandol commented 1 month ago

> Come to think of it would docx generate those lists automatically if you insert the right snippet of XML somewhere? Maybe that can be done with a template now?

https://github.com/quarto-dev/quarto-cli/discussions/2464#discussioncomment-10120690

But @acxz just opened #10029 to add lof and lot to the docx writer, so perhaps no need to make use of the cool new templates for this...