jgm / djot

A light markup language
https://djot.net
MIT License

List tables #27

Open jgm opened 2 years ago

jgm commented 2 years ago

Tables whose cells contain block-level content (multiple paragraphs, lists, code blocks) can't be represented as pipe tables. For these cases we might want to provide "list tables" as in RST. These could be represented as a list in a div with attributes.

::: table aligns="lc" widths="25 50"
- * one
  * two
- -----
- * three
  * ^
- * five

    multi-paragraph
  * ~

Any content below the list is the caption.
~ means: merge with the cell above.
^ means: merge with cell to left.
:::
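
To make the two merge markers concrete, here is a minimal Python sketch of how a converter might turn them into row and column spans. It assumes the rows have already been extracted into a grid of cell strings; the names `Cell` and `resolve_merges` are hypothetical and not part of djot. It does not handle the header separator or check that merged regions stay rectangular.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    text: str
    rowspan: int = 1
    colspan: int = 1


def resolve_merges(grid):
    """Resolve '^' (merge with cell to the left) and '~' (merge with cell above).

    grid is a list of rows, each a list of cell strings. The result has the
    same shape; absorbed slots become None and the anchor cell's span grows.
    """
    out, owner = [], []  # owner[r][c] = (row, col) of the cell covering that slot
    for r, row in enumerate(grid):
        out.append([])
        owner.append([])
        for c, text in enumerate(row):
            if text == "^" and c > 0:
                ar, ac = owner[r][c - 1]
                out[ar][ac].colspan = max(out[ar][ac].colspan, c - ac + 1)
            elif text == "~" and r > 0 and c < len(owner[r - 1]):
                ar, ac = owner[r - 1][c]
                out[ar][ac].rowspan = max(out[ar][ac].rowspan, r - ar + 1)
            else:
                # Ordinary cell (a marker with nothing to merge into is kept as text).
                out[r].append(Cell(text))
                owner[r].append((r, c))
                continue
            out[r].append(None)
            owner[r].append((ar, ac))
    return out
```

For example, `resolve_merges([["a", "^"], ["b", "c"], ["~", "d"]])` gives `a` a column span of 2 and `b` a row span of 2.
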
uvtc commented 2 years ago

This list-like syntax seems like it would be very messy if a cell were to contain a list.

If you can specify the dimensions of the table at the top (number of rows and cols), then it's predetermined how many cells you have, and you can just specify that many items for your list. How about:

::: table rows=3 cols=2 headings=true
Flavor
----------
Description
----------
lemon
----------
Sour but good. You can use this flavor to make:

 * lemon meringue
 * lemon icing
----------
cherry
----------
Tastes like real cherries! I'd suggest using
this one to make

 * cherry ice cream
 * cherry soda
----------
Blueberry
----------
Tastes pretty much like you'd expect.
Use this flavor for:

 * blueberry pie
 * blueberry gum
:::
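
A rough Python sketch of how this fixed-dimension form could be read: split the body on separator lines made only of dashes, then chunk the cells by the declared column count. The function name `parse_fixed_grid` is hypothetical, and the rule that a separator is an unindented line of three or more dashes is an assumption; a cell containing such a line would be ambiguous.

```python
def parse_fixed_grid(body, rows, cols):
    """Split body into rows*cols cells on dash-only separator lines."""
    cells, current = [], []
    for line in body.splitlines():
        stripped = line.rstrip()
        if len(stripped) >= 3 and set(stripped) == {"-"}:
            # Separator line: close the current cell.
            cells.append("\n".join(current).strip("\n"))
            current = []
        else:
            current.append(line)
    cells.append("\n".join(current).strip("\n"))
    if len(cells) != rows * cols:
        raise ValueError(f"expected {rows * cols} cells, found {len(cells)}")
    # Chunk the flat cell list into rows of `cols` cells each.
    return [cells[i * cols:(i + 1) * cols] for i in range(rows)]
```

With `headings=true`, the first row of the returned grid would be treated as the header row.
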
bpj commented 2 years ago

@jgm I don't like the "table" keyword since it's natural language specific. I'd rather use a specific delimiter like ===.

Also, alignments could be shown symbolically with punctuation:

=== <|> 50 25 25

Where aligns are symbolized with

My Pandoc filter repeats the rightmost explicit alignment when there are more columns than explicit alignments. That's handy e.g. when you want the leftmost column to be left aligned and the rest to be right aligned.
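
The repeat-the-rightmost rule can be sketched in a few lines of Python. The symbol mapping (`<` left, `|` center, `>` right) is an assumption made for illustration, and `expand_aligns` is a hypothetical name, not the filter's actual code.

```python
# Assumed mapping from alignment symbols to alignment names.
ALIGN_SYMBOLS = {"<": "left", "|": "center", ">": "right"}


def expand_aligns(spec, ncols):
    """Expand an alignment spec to ncols entries, repeating the rightmost one."""
    aligns = [ALIGN_SYMBOLS[ch] for ch in spec]
    if not aligns:
        return ["default"] * ncols
    while len(aligns) < ncols:
        aligns.append(aligns[-1])  # repeat the rightmost explicit alignment
    return aligns[:ncols]
```

So `expand_aligns("<>", 4)` left-aligns the first column and right-aligns the remaining three.
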

Are widths supposed to be percentages of the total available width? I think that would be the easiest conceptually and notationally.

Perhaps also a # prefix to indicate a <th> cell, e.g. for a stub.

-   # Header cell

BTW my filter accepts numbered lists and produces them by default, giving the header row, if any, index 0. That is very useful for keeping track of where you are in a large table.

@uvtc I have used nested lists for and in tables for a long time with my filter. They are not confusing if you indent properly and use a variety of list markers. Also you can wrap the cell contents in a div to disambiguate further. I think complicating the syntax would be really confusing.

===
1.  1.  Foo
    2.  :::list-in-table
        -   Bar
        -   Baz
        -   Quux
        :::
===
uvtc commented 2 years ago

I think === looks good as a delimiter for a list table. And the parallel lines kind of reminds me of a table.

uvtc commented 2 years ago

> They are not confusing if you indent properly and use a variety of list markers.

Yeah. I see what you mean.

Incidentally, I don't like calling these "list tables", since the list item markers aren't exactly working like regular list markers (e.g. they have two in a row). (Maybe "list-like tables"...)

Would this syntax break design goal num 7 (friendly to hard-wrapping)?

matklad commented 1 year ago

Do we need nested lists here? We already make blank lines between list elements meaningful, so we can, asciidoctor-style, use blanks to delimit the rows?

::: table aligns="lc" widths="25 50"
- one
- two

- three
- ^

- five

  multi-paragraph
- ~

Any content below the list is the caption.
~ means: merge with the cell above.
^ means: merge with cell to left.
:::

If we need to delimit the header row, we can reuse the thematic break.

bpj commented 1 year ago

What about cells with more than one paragraph, or with content other than a paragraph? IMNSHO the whole point of list tables is to allow that, but it won't work if you use blanks to delimit rows.

matklad commented 1 year ago

I think it would?

+ cell with complex content

  more complex content

  - this is still
  - part of this cell
  - due to indentation

+ a new cell, and a new row, due to a preceding blank

  more content
+ a new cell in _this_ row, because there's no blank

That is, blanks between cells seem to be unambiguously distinguishable from blanks within cells, because content within a cell is indented.
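
A minimal Python sketch of that rule, under some assumptions: cells start with a one-character marker at column 0 followed by a space, cell content is indented by two spaces, and the first unindented non-item line (such as a caption) ends the table. `parse_blank_delimited` is a hypothetical name.

```python
def parse_blank_delimited(text, markers="+-*"):
    """Group flat list items into rows; a blank line before an item starts a new row."""
    rows = []
    cell = None        # lines of the cell currently being collected
    blank_seen = True  # the first item always starts a row
    for line in text.splitlines():
        if not line.strip():
            blank_seen = True
            if cell is not None:
                cell.append("")  # may turn out to be a blank inside the cell
            continue
        if line[0] in markers and line[1:2] == " ":
            if blank_seen:
                rows.append([])  # preceding blank: new row
            cell = [line[2:]]
            rows[-1].append(cell)
        elif line.startswith("  ") and cell is not None:
            cell.append(line[2:])  # indented: still part of the current cell
        else:
            break  # unindented non-item text ends the table
        blank_seen = False
    return [["\n".join(c).strip("\n") for c in row] for row in rows]
```

Blank lines inside a cell are kept because the following line is indented; only a blank followed by an unindented marker opens a new row.
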

matklad commented 1 year ago

Thinking about this more, nested lists can express N-dimensional structures, while we only need 2D. So a flat sequence that marks where rows start is enough. One way to delimit rows would be blank lines, but we could also use different list markers: * to start a header row, + to start a normal row, - to continue a row.

* width
- height

+ 640
- 480

+ 800
- 600
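
As a sketch of how little machinery the marker-based variant needs, here is a small Python reader. It handles single-line cells only, and `parse_marker_rows` is a hypothetical name used for illustration.

```python
def parse_marker_rows(text):
    """Read a flat list: '*' starts a header row, '+' a normal row, '-' continues a row."""
    header, rows = None, []
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines are purely cosmetic in this variant
        marker, _, content = line.partition(" ")
        if marker == "*":
            current = []
            header = current
        elif marker == "+":
            current = []
            rows.append(current)
        elif marker == "-":
            if current is None:
                raise ValueError("row continuation before any row start")
        else:
            raise ValueError(f"unexpected line: {line!r}")
        current.append(content.strip())
    return header, rows
```

Applied to the width/height example, this yields the header `["width", "height"]` and the rows `["640", "480"]` and `["800", "600"]`.
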
jgm commented 1 year ago

This is pretty similar to wikipedia table syntax.

{|
!align="center" width="15%"| Centered Header
!width="13%"| Left Aligned
!align="right" width="16%"| Right Aligned
!width="35%"| Default aligned
|-
|align="center"| First
| row
|align="right"| 12.0
| Example of a row that spans multiple lines.
|-
|align="center"| Second
| row
|align="right"| 5.0
| Here’s another one. Note the blank line between rows.
|}
matklad commented 1 year ago

Inspired by https://github.com/jgm/djot/issues/128#issuecomment-1344556958, what if we do the following for all tables:

Examples

~~~~~~~~~
| 1 | 2 |
~~~~~~~~~

~~~~~~~~~~~~~~
| two | rows |
| 1   | 2    |
~~~~~~~~~~~~~~

Table with header.
I think we need a blank after `|`.
We don't want inline-parser to recursively call block parser,
so we can't see that `---` is a line break

~~~~~~~~~~~~~~~~~~
|fruit   |  price| {% if there's a header, it sets cell alignment %}

------------------
| apple  |     4 |
| banana |    10 |
~~~~~~~~~~~~~~~~~~

Table with cell-per-line

~~~
|fruit |  price|

---
| apple
| 4

| banana
| 10
~~~

Complex table, unreadable, but possible

~~~
|code |  list|

---

```rust
fn main() {}

|

fn main() void {}

i. A ii. B iii. C



````

This ... probably has all kinds of ambiguities, and requires a fair amount of "pattern matching" from the converter to infer the structure from the syntax, but at first glance it seems both convenient for short tables and reasonable for large tables.

I think the main thing we've lost is specifying alignment on a per-row basis, but that doesn't seem crucial.
matklad commented 1 year ago

> I think we need a blank after |. We don't want inline-parser to recursively call block parser, so we can't see that --- is a line break

Or maybe we don't? Maybe we can treat | exactly as \n\n? That is, I thought the way this needs to work is by teaching the inline parser to be aware of |, but it seems we don't actually need that? Though the obvious drawback would be that inline code like `foo || bar` wouldn't do what you'd expect...
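
A naive Python illustration of both the idea and the drawback: replace every `|` with a paragraph break before inline parsing, then split on blank lines. The helper name `split_cells_naively` is hypothetical.

```python
def split_cells_naively(row_text):
    """Treat every '|' as a paragraph break, with no awareness of inline code."""
    parts = row_text.replace("|", "\n\n").split("\n\n")
    return [part.strip() for part in parts if part.strip()]
```

A plain row such as `| apple | 4 |` splits into `apple` and `4`, but a row containing the inline code `foo || bar` is cut in the middle of the code span, since the block-level split happens before the inline parser ever sees the backticks.
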