quarto-dev / quarto-cli

Open-source scientific and technical publishing system built on Pandoc.
https://quarto.org

Include Table from Separate Markdown File - Similar to Figures #639

Closed alping closed 2 years ago

alping commented 2 years ago

It would be great to be able to include tables from separate files (Markdown and eventually even CSV) in a similar way to figures. It almost works using the include lua filter, but the table is not recognized as a table to be cross-referenced, see example here:

````markdown
::: {#tbl-included}
```{.include}
tbl/sample-table-1.md
```

Attempted Included Table. This is an attempt to include a table. It almost works as the heading is placed in the correct place, but it is not recognized as a table and can't be referenced.
:::
````


I usually output Markdown/CSV tables from scripts which allows me to not have to manually change the manuscript every time the numbers change.

Possibly related to issue https://github.com/quarto-dev/quarto-cli/discussions/628.

Thank you for your work,
\- Peter

---

<details>
<summary>Discussed in https://github.com/quarto-dev/quarto-cli/discussions/609</summary>
<div type='discussions-op-text'>

<sup>Originally posted by **alping** April 11, 2022</sup>
Hi!

I do medical/epidemiological research and I've been using markdown and Pandoc for a long time to write my manuscripts. I've been keeping an eye on RMarkdown, Sweave, Knitr, Pweave and similar projects, and was very excited to recently discover Quarto. It's a fantastic project and I'm happy to see that it's being actively developed (thank you!). I'm trying out Quarto for writing my current manuscript and want to contribute some feedback that I hope can make Quarto even better.

I've put together a sample project which demonstrates most of the things I discuss here:
- https://gitlab.com/alping/quarto-test
- https://alping.gitlab.io/quarto-test/

### Enhancements

1. It would be great to be able to include tables from separate files (markdown or csv) in the same way we can do for figures. It almost works using the include lua filter, but the table is not recognized to be cross-referenced, see sample repo. (I usually output MD/CSV tables from scripts which allows me to not have to manually change them every time the numbers change).
2. Specify capitalization of Figure/figure and Table/table on a per reference basis, e.g. `@Fig-main` > Figure 1.1 and `@fig-main` > figure 1.1. Or `@fig-main` would give the default, `@^fig-main` > Figure 1.1, and `@_fig-main` > figure 1.1, or something similar.
3. A way to specify Supplementary Figures/Tables instead of a "hacky" solution involving chapters and `crossref: chapters: true` (see sample repo), so that we can get `Supplementary Figure 1` (with its own counter, separate from the main figures/tables). (This is used a lot in my field).
4. Table/Figure headers separate from table/figure text, possibly in bold font. E.g:
    > **Figure 1: Survival Function**
    > Survival function for entire population using...
5. Variables in inline text. E.g. `We identified {pop_size} patients...` > "We identified 3026 patients...". To avoid manually having to change these every time the numbers change. Variables could be read from a YAML or JSON file.
6. Make some parts only visible for specific render formats:
    ```markdown
    ::: {.docx}
    This only shows up in .docx renders.
    :::
    ```
7. Make HTML preview remember the scroll location on reload. Now every time the page is changed and updated it jumps to the top, which makes it difficult to work on a specific section.
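
Enhancement 5 above (variables in inline text) can be approximated today with a small pre-processing step. The following is only a minimal sketch, not a Quarto feature: the function name `fill_placeholders` and the idea of loading the values from a JSON file are assumptions made for illustration. It replaces `{name}` placeholders and leaves anything it does not recognize, such as `{#refs}` or `{.docx}`, untouched.

```python
import json
import re


def fill_placeholders(text: str, values: dict) -> str:
    """Replace {name} placeholders with values; leave unknown names untouched."""

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        # Only substitute names that are present in the values mapping.
        return str(values[key]) if key in values else match.group(0)

    # Match identifier-like names only, so Pandoc attributes such as
    # {#tbl-id} or {.docx} are never treated as variables.
    return re.sub(r"\{([A-Za-z_][A-Za-z0-9_]*)\}", substitute, text)


# Hypothetical values, standing in for the parsed contents of a JSON file.
variables = json.loads('{"pop_size": 3026}')
print(fill_placeholders("We identified {pop_size} patients...", variables))
# prints: We identified 3026 patients...
```

Running this over the manuscript source before rendering keeps the numbers in one place, at the cost of an extra build step.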

### Issues

  1. Refs positioning works for .docx but not for HTML. In HTML, references always show up at the end of the document.
    ::: {#refs}
    :::
  2. No top margin for H1 in HTML render (weird spacing for chapters)
  3. Printing HTML output causes cut text in page breaks in many cases, similar to what is shown here: https://stackoverflow.com/questions/55990473/htmlcss-page-breaks-cutting-text-characters-in-half-media-print

Again, fantastic project and thank you for your work. - Peter

jjallaire commented 2 years ago

The trick here is you need to have the include filter run before the Quarto filters (by default user filters run after). You can see an example of this here: https://quarto.org/docs/authoring/shortcodes-and-filters.html#using-filters

We'll be working on a native include feature for Quarto soon which won't require any special deviations to work as you are expecting here.

alping commented 2 years ago

As you can see in the example project I do this (hopefully correctly), but it still doesn't work: https://gitlab.com/alping/quarto-test/-/blob/main/_quarto.yml

It seems like Quarto recognizes the table since it moves the caption to above it, but then it fails to create the necessary parts for cross-referencing. I don't know how to inspect the output between the include-files.lua and quarto steps, but I think that would help diagnose the issue.

_quarto.yml:

```yaml
project:
  type: website
  output-dir: public

filters:
  - include-files.lua
  - quarto
```

jjallaire commented 2 years ago

I see the problem now. Tables get their caption and id from within the markdown. So you need to do this in your included table:

| Element | Value |
| ------- | ----- |
| Fire    | 12    |
| Grass   | 98    |
| Water   | 55    |

: Included {#tbl-included}

This does make tables a bit different than figures (as figures support being denoted with a div id).
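
If the table file is generated by a script and the caption should stay out of it, one option is to add the caption line in a separate step. The following is a minimal sketch under that assumption; `with_caption` is a hypothetical helper, not part of Quarto or the include filter. It only appends the `: Caption {#tbl-id}` line shown above to existing table text.

```python
def with_caption(table_markdown: str, caption: str, label: str) -> str:
    """Return the table followed by a caption line carrying a cross-reference id."""
    if not label.startswith("tbl-"):
        raise ValueError("table labels are expected to start with 'tbl-'")
    # A blank line separates the table from its caption line.
    return f"{table_markdown.rstrip()}\n\n: {caption} {{#{label}}}\n"


table = "| Element | Value |\n| ------- | ----- |\n| Fire    | 12    |\n"
print(with_caption(table, "Included", "tbl-included"))
```

The result could be written to a second file that the manuscript includes, so the data file itself stays free of descriptive text.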

alping commented 2 years ago

Unfortunately, it's not practical for me to include the description in the MD file and I would prefer to be able to separate the data from the text. But since the include filter is basically a copy/paste of the subfiles (I think?), shouldn't this work? (it doesn't)

````markdown
```{.include}
tbl/sample-table-1.md
```

: Attempted Included Table. This is an attempt to include a table. {#tbl-included}
````

alping commented 2 years ago

I see from the documentation that the div syntax works for subtables:

::: {#tbl-example}
| Col1 | Col2 | Col3 |
|------|------|------|
| A    | B    | C    |
| E    | F    | G    |
| A    | G    | G    |

: First Table {#tbl-first}

Main Caption
:::

I feel like it would be very useful to leverage the div system for a completely flexible way of creating cross-references. In this way anything in the div could be referenced as any of the supported cross-reference types, e.g. want an image to be an equation? No problem:

::: {#eq-sample-equation}
![](fig/equation-figure.png)
:::

Want to use a figure to display a table? Why not:

::: {#tbl-main}
![](fig/table-main.png)

**Sample Table.** This is a sample table.
:::

Want to include something from another markdown file (equation, table, whatever)? Absolutely:

````markdown
::: {#tbl-included}
```{.include}
tbl/sample-table-1.md
```

Sample Table. This is a sample table.
:::
````


This way we could still have all the shorthands:

````markdown
![Elephant](elephant.png){#fig-elephant}

| Col1 | Col2 | Col3 |
|------|------|------|
| A    | B    | C    |

: My Caption {#tbl-letters}

$$
\frac{\partial \mathrm C}{\partial \mathrm t}
$$ {#eq-black-scholes}

```{#lst-customers .sql lst-cap="Customers Query"}
SELECT * FROM Customers
```
````

While allowing greater flexibility, somewhat more verbosely:

```markdown
::: {#<prefix>-<label>}
<entity to be referenced>

Description of referenced entity.
:::
```

Even better would be to have the possibility to add an optional title in addition to the description, as we already can with theorems:

::: {#<prefix>-<label>}
<entity to be referenced>

# Title text here (optional)
Longer description here. (optional)
:::

This could then be further improved to allow for sub-figures/tables/equations/etc.

jjallaire commented 2 years ago

It's not quite a copy/paste of text -- the include is implemented during markdown parsing so has already been parsed/processed by the time the caption shows up. That said, we are going to soon re-implement it to be fully text based (so that includes of computational blocks are possible) after which point your example will work.

Some of the constraints on the content of reference-able content come directly from LaTeX (which doesn't let you call just anything a table or an equation). We could allow this and then say it doesn't work for LaTeX but then your documents wouldn't be portable. In spite of this I do think your proposal has appeal for its consistency and something we'll consider doing in the future.

jjallaire commented 2 years ago

I thought of another way to handle this. If you are willing to treat your included tables as computational outputs, then the following would work for Python/Jupyter:

```{python}
#| label: tbl-included
#| tbl-cap: Included
from IPython.display import Markdown
from pathlib import Path
# Emit the file's Markdown as the cell output so it is treated as the table.
Markdown(Path("tbl/sample-table-1.md").read_text())
```

While this would work for R/Knitr:

```{r}
#| label: tbl-included
#| tbl-cap: Included
# Read the file and pass it through unchanged as "asis" output.
knitr::asis_output(paste(readLines("tbl/sample-table-1.md"), collapse = "\n"))
```

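
Since the original request also mentions CSV, the same computational-output approach could be extended with a small conversion step. This is only a sketch using the Python standard library; `csv_to_pipe_table` is a hypothetical helper, and it assumes the first CSV row is the header.

```python
import csv
import io


def csv_to_pipe_table(csv_text: str) -> str:
    """Convert CSV text (first row is the header) into a Markdown pipe table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""

    def format_row(cells: list) -> str:
        # Escape literal pipes so they do not split a cell.
        return "| " + " | ".join(c.replace("|", "\\|") for c in cells) + " |"

    header, body = rows[0], rows[1:]
    lines = [format_row(header), format_row(["---"] * len(header))]
    lines.extend(format_row(row) for row in body)
    return "\n".join(lines) + "\n"


print(csv_to_pipe_table("Element,Value\nFire,12\nGrass,98\n"))
```

Inside the Python cell above, the returned string could be passed to `Markdown(...)` in place of the file contents, for example after reading a CSV file with `Path(...).read_text()`.
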
alping commented 2 years ago

I see, I'm looking forward to the new implementation of inclusions then!

I guess it's unavoidable to break portability, at least a little, when moving beyond the most basic functionality. This is already the case when LaTeX is used directly in the markdown, e.g. \newpage not working when exporting to .docx. With that said, portability should of course be the goal, and where it is not possible, anything incompatible with the chosen render format should "fail" gracefully. I think https://github.com/quarto-dev/quarto-cli/issues/646 could be useful for dealing with portability issues.

(By the way, is there a way to get page breaks in Word documents?)

Thanks for this workaround! I'll try it out next week.

jjallaire commented 2 years ago

R Markdown includes support for using LaTeX \newpage and \pagebreak. For compatibility we include this filter in Quarto only when using the Knitr engine (because we aren't sure whether we want to use LaTeX commands as a generalized way to provide portable enhancements). So if you are using the Knitr engine this works today. If not then you could just crib our filter from here: https://github.com/quarto-dev/quarto-cli/blob/main/src/resources/filters/rmarkdown/pagebreak.lua

More on using filters with Quarto: https://quarto.org/docs/authoring/shortcodes-and-filters.html#filters

cderv commented 2 years ago

Just to add about using knitr to include an external md file as content in the main document: it can be done without any specific R code, using the `asis` engine and the `file` chunk option from knitr:

```{asis, file='table.md'}
#| label: tbl-included
#| tbl-cap: Included
```

`engine: knitr` is required in the YAML header if no other R chunk is in the document, to help [quarto detection](https://quarto.org/docs/computations/execution-options.html#engine-binding)