Change TableStyle to handle different column and row separators

muesli4 / table-layout

Layout data in grids and pretty tables. Provides a lot of tools to get the cell formatting right (positional alignment, alignment on specific characters and limiting of cell width)..

BSD 3-Clause "New" or "Revised" License

37 stars 11 forks source link

Change TableStyle to handle different column and row separators #27

Closed Xitian9 closed 2 years ago

Xitian9 commented 2 years ago

This is a proof of concept for how different row/column separators could be handled. Let me know your thoughts.

Xitian9 commented 2 years ago

In this PR, there's currently a bit of conceptual confusion in the role of NoneHS within HeaderSpec, partly as a result of adapting from the previous definition. Here's the definition in this PR:

data HeaderSpec sep a
    = GroupHS sep [HeaderSpec sep a]
    | HeaderHS HeaderColSpec a
    | NoneHS sep
  deriving (Functor, Foldable, Traversable)

If used as as a top-level field (i.e. not embedded in a GroupHS), thenNoneHS means to create a GroupHS field with the given separator, and one HeaderHS inside for each column (or row group) in the data. It therefore fills the role of a default header. However, it could be embedded inside a GroupHS, in which case its meaning is uncertain. In the current implementation it is basically ignored in these positions, I think.

With this behaviour, it might be reasonable to remove NoneHS as a constructor, and instead use Maybe (HeaderSpec sep a) to generate headers, with Nothing fulfulling the role that NoneHS currently fills. In this way, NoneHS cannot be embedded in a GroupHS.

However, there's a possible future role that NoneHS may play in the richer HeaderSpec proposal: to indicate that a particular column should not have a header, even though other columns may have one. Here's an example:

 Two columns with one header | A
------------+----------------+-------
 Col 1      | Col 2          | Col 3

This might be represented as

GroupHS SingleLine Nothing
  [ LabelledGroupHS SingleLine spec "Two columns with one header" [ NoneHS, NoneHS ]
  , HeaderHS spec "A"
  ]

But even with this, we may want to use a Maybe HeaderSpec, so we don't have to assign a special meaning to a top-level NoneHS.

muesli4 commented 2 years ago

In this PR, there's currently a bit of conceptual confusion in the role of NoneHS within HeaderSpec, partly as a result of adapting from the previous definition. Here's the definition in this PR:
data HeaderSpec sep a
    = GroupHS sep [HeaderSpec sep a]
    | HeaderHS HeaderColSpec a
    | NoneHS sep
  deriving (Functor, Foldable, Traversable)

What does the separator mean in NoneHS? How many columns would it affect? Perhaps, this type should be split:

data Shape sep a
     -- | Hierarchical definition of a header.
    = MergedColumnsS sep [MergedColumn sep a]
    -- | No header and only default separators.
    | NoneS
    -- | Enable specifying separators for the columns.
    | SeparatorsS [sep]

Then in MergedColumn you can group columns together to form super columns with or without a header:

data MergedColumn sep a
    -- | A label for one or more columns.
    = HeaderMC Int a
    -- | A super column that may contain several columns.  An optional label is put above.  The columns
    -- are separated by the given separator.
    | ConcatMC (Maybe a) sep [MergedColumn sep a]

To not have any label would probably only make sense if you want something like this:

.-------------------------.
|           Foo           |
|        .----------------|
|        |       Bar      |
|========|================|

Otherwise, you can just pick an empty label.

 Two columns with one header | A
------------+----------------+-------
 Col 1      | Col 2          | Col 3

Then you could represent it as:

MergedColumnsS SingleLine [HeaderMC 2 "Two columns with one header", HeaderMC "A"]

It would be good to know all possible headers one wants to display. I'm not sure whether I have a complete overview so please do as you think is best and we can then decide later. (My code examples are just rough sketches. Do not feel forced to use them.)

Xitian9 commented 2 years ago

In this PR, there's currently a bit of conceptual confusion in the role of NoneHS within HeaderSpec, partly as a result of adapting from the previous definition. Here's the definition in this PR:
data HeaderSpec sep a
    = GroupHS sep [HeaderSpec sep a]
    | HeaderHS HeaderColSpec a
    | NoneHS sep
  deriving (Functor, Foldable, Traversable)
What does the separator mean in NoneHS? How many columns would it affect?

The separator determines the vertical line to draw between columns. It would be drawn between every column, and affect them all. NoneHS is a bit of a weird one, as its true meaning is ‘the shape of the header is a flat list of the same length as the data’. This is the only header which requires knowing the shape of the data to construct, and so NoneHS is not so much a shape in itself as an instruction to defer until we know the data.

Perhaps, this type should be split:

data Shape sep a
     -- | Hierarchical definition of a header.
    = MergedColumnsS sep [MergedColumn sep a]
    -- | No header and only default separators.
    | NoneS
    -- | Enable specifying separators for the columns.
    | SeparatorsS [sep]

This is reasonable, and functionally similar to wrapping it in a Maybe, but with more power. The problem I see with not supplying a separator to NoneS is that now we can only ever use HeaderSpec when there is a default instance for sep.

Then in MergedColumn you can group columns together to form super columns with or without a header:

data MergedColumn sep a
    -- | A label for one or more columns.
    = HeaderMC Int a
    -- | A super column that may contain several columns.  An optional label is put above.  The columns
    -- are separated by the given separator.
    | ConcatMC (Maybe a) sep [MergedColumn sep a]

That sounds quite reasonable. Let me think on it a bit to see if there are any issues.

To not have any label would probably only make sense if you want something like this:
.-------------------------.
|           Foo           |
|        .----------------|
|        |       Bar      |
|========|================|
Otherwise, you can just pick an empty label.
 Two columns with one header | A
------------+----------------+-------
 Col 1      | Col 2          | Col 3
Then you could represent it as:
MergedColumnsS SingleLine [HeaderMC 2 "Two columns with one header", HeaderMC "A"]
It would be good to know all possible headers one wants to display. I'm not sure whether I have a complete overview so please do as you think is best and we can then decide later. (My code examples are just rough sketches. Do not feel forced to use them.)

Xitian9 commented 2 years ago

I tend towards giving the most flexibility since we don't know all the end use cases people may have. Here some examples of charts I need.

       |          All the columns
       +-----------------------------+-------
       | Two columns with one header | Col
-------+------------+----------------+-------
 Row 1 | Col 1      | Col 2          | Col 3
 Row 2 | Col 1      | Col 2          | Col 3
-------+------------+----------------+-------
 Row 3 | Col 1      | Col 2          | Col 3
=======+============+================+=======
 Row 4 | Col 1      | Col 2          | Col 3

I don't have a specific need for headers like this, but they look like something that others might need:

.-------------------------.
|           Foo           |
|        .----------------|
|        |       Bar      |
|========|================|

I think a full tree-structure row grouping might too awkward to actually display, but to support transposing tables it might be desirable.

            |       |          All the columns
            |       +-----------------------------+-------
            |       | Two columns with one header | A
------------+-------+------------+----------------+-------
 Row header | Row 1 | Col 1      | Col 2          | Col 3
            | Row 2 | Col 1      | Col 2          | Col 3
            +-------+------------+----------------+-------
            | Row 3 | Col 1      | Col 2          | Col 3
============+=======+============+================+=======
 Another rh | Row 4 | Col 1      | Col 2          | Col 3

muesli4 commented 2 years ago

The problem I see with not supplying a separator to NoneS is that now we can only ever use HeaderSpec when there is a default instance for sep.

What would it mean then if we give it a separator? Which one would the user use? Can we not provide noneH with a specific type? Maybe I don't understand the issue.

I think a full tree-structure row grouping might too awkward to actually display, but to support transposing tables it might be desirable.

It depends. I was originally thinking of also rendering the text vertically.

I will have a look at the changed commits.

Xitian9 commented 2 years ago

The problem I see with not supplying a separator to NoneS is that now we can only ever use HeaderSpec when there is a default instance for sep.

What would it mean then if we give it a separator? Which one would the user use? Can we not provide noneH with a specific type? Maybe I don't understand the issue.

NoneHS means the same thing as it did previously: do not display the header, and just display the columns as a flat list. The separator used between those columns is the separator provided. For example:

NoneHS SingleLine:
 Col 1      | Col 2          | Col 3

NoneHS NoLine:
 Col 1       Col 2           Col 3

NoneHS DoubleLine:
 Col 1    ║   Col 2         ║ Col 3

Xitian9 commented 2 years ago

The problem I see with not supplying a separator to NoneS is that now we can only ever use HeaderSpec when there is a default instance for sep.

What would it mean then if we give it a separator? Which one would the user use? Can we not provide noneH with a specific type? Maybe I don't understand the issue.

NoneHS SingleLine:
 Col 1 | Col 2 | Col 3

NoneHS NoLine:
 Col 1  Col 2  Col 3

NoneHS DoubleLine:
 Col 1 ║ Col 2 ║ Col 3

muesli4 commented 2 years ago

The problem I see with not supplying a separator to NoneS is that now we can only ever use HeaderSpec when there is a default instance for sep.

What would it mean then if we give it a separator? Which one would the user use? Can we not provide noneH with a specific type? Maybe I don't understand the issue.

NoneHS means the same thing as it did previously: do not display the header, and just display the columns as a flat list. The separator used between those columns is the separator provided. For example:
NoneHS SingleLine:
 Col 1 | Col 2 | Col 3
NoneHS NoLine:
 Col 1  Col 2  Col 3
NoneHS DoubleLine:
 Col 1 ║ Col 2 ║ Col 3

I see, my bad. But why would this be needed at all? If it is not specified it should use the one provided by the style. Would the style not supply the type of separators?

Also, I misunderstood some of the things you did with the pull request. From your last comments I got the impression that you wanted to implement the headers as a tree structure. I want to avoid feature creep and start getting working things in. If possible, split the branch and create a separate merge request. Then we could get, for example, the unicode style stuff in and it's a lot easier for me to review smaller independent commits.

Xitian9 commented 2 years ago

I see, my bad. But why would this be needed at all? If it is not specified it should use the one provided by the style. Would the style not supply the type of separators?

The TableStyle provides the style to use for each label, but it does not specify exactly which label to use. I suppose the existing styling would be represented by TableStyle () (), where there exists only a single label (), and the style says how to render that one label. But more generally there will be several labels you could use, and you need to specify which one you want.

Maybe the best solution is to define noneH :: HeaderSpec LineType String as noneH = NoneHS SingleLine, and introduce a new constructor noneSepH :: sep -> HeaderSep sep String as NoneHS. That will hide the complexity from the old constructors, but still expose the ability to control the separators.

Also, I misunderstood some of the things you did with the pull request. From your last comments I got the impression that you wanted to implement the headers as a tree structure. I want to avoid feature creep and start getting working things in. If possible, split the branch and create a separate merge request. Then we could get, for example, the unicode style stuff in and it's a lot easier for me to review smaller independent commits.

Agreed. The tree-style structure is desired eventually but not yet implemented, for exactly the reason you cite. The current PR just allows different column separators to be used.

Xitian9 commented 2 years ago

Let me know if there are other issues to address.

Xitian9 commented 2 years ago

Not a problem. I've added more documentation, cleaned up the commit messages, and reduced some of the duplication you flagged. Let me know if you'd like anything else changed.