JuliaEarth / geospatial-data-science-with-julia

Geospatial Data Science with Julia
https://juliaearth.github.io/geospatial-data-science-with-julia

Proposal to improve table rendering #14

Closed ronisbr closed 8 months ago

ronisbr commented 8 months ago

Hi!

For reasons I cannot remember (maybe @bkamins can), the headers in DataFrames output are aligned to the left, whereas some numeric columns are aligned to the right. This works pretty well in the terminal and in Jupyter. However, tables in Quarto are post-processed to fill the entire available width, which leads to a poor rendering, as shown here:

[Image: Screenshot 2023-11-01 at 11 30 12]

Take a look at how the same table is shown in Jupyter:

[Image: Screenshot 2023-11-01 at 13 15 06]

Hence, I propose changing how DataFrames are shown here in the book, as follows:

` ` `{julia}
#| output: false
using DataFrames

df = DataFrame(
  NAME=["John", "Mary", "Paul", "Anne", "Kate"],
  AGE=[34, 12, 23, 39, 28],
  HEIGHT=[1.78, 1.56, 1.70, 1.80, 1.72],
  GENDER=["male", "female", "male", "female", "female"]
)
` ` `

` ` `{julia}
#| echo: false
show(stdout, MIME("text/html"), df; header_alignment = :c, alignment = :c)
` ` `

This leads to a much nicer version:

[Image: Screenshot 2023-11-01 at 13 52 37]

P.S.: I tried to override Base.show in Quarto to reduce the work, but I couldn't.
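To check the effect of the two keywords outside Quarto, the HTML can be rendered into a string and compared with the default output. The following is only a minimal sketch: it assumes a DataFrames.jl version that forwards `alignment` and `header_alignment` to PrettyTables.jl, as in the cell above. `render_html` and `count_centered` are hypothetical helpers introduced for illustration, and the `text-align: center` search string is an assumption about how the HTML backend encodes alignment in inline styles.

```julia
using DataFrames

df = DataFrame(
    NAME = ["John", "Mary", "Paul"],
    AGE = [34, 12, 23],
    HEIGHT = [1.78, 1.56, 1.70],
)

# Hypothetical helper: render the table to an in-memory buffer instead of
# stdout, so the generated HTML can be inspected as a string.
function render_html(df; kwargs...)
    io = IOBuffer()
    show(io, MIME("text/html"), df; kwargs...)
    return String(take!(io))
end

default_html = render_html(df)
centered_html = render_html(df; header_alignment = :c, alignment = :c)

# Hypothetical helper: count the inline styles that request centered text.
# The exact style string is an assumption about the HTML backend's output.
count_centered(html) = count("text-align: center", html)

println("centered styles (default keywords): ", count_centered(default_html))
println("centered styles (with :c keywords): ", count_centered(centered_html))
```

If the keywords are honored, the second count should be larger than the first. Whether Quarto's post-processing preserves these inline styles still has to be confirmed in the rendered book.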

juliohm commented 8 months ago

Appreciate the fix @ronisbr ❤️ Could you please submit a PR with the proposed changes?

As we discussed elsewhere, it would be nice to get this fixed in Quarto itself or change the DataFrames.jl defaults to centered alignment.

ronisbr commented 8 months ago

There is a much easier (but hackier) way: just add the following to the code section that loads the environment:

using DataFrames

function Base.show(io::IO, mime::MIME"text/html", df::AbstractDataFrame; kwargs...)
    return DataFrames._show(io, mime, df; alignment = :c, header_alignment = :c, kwargs...)
end

function Base.show(io::IO, mime::MIME"text/html", dfrs::DataFrames.DataFrameRows; kwargs...)
    DataFrames._verify_kwargs_for_html(; kwargs...)
    df = DataFrames.parent(dfrs)
    title = "$(nrow(df))×$(ncol(df)) DataFrameRows"

    return DataFrames._show(
        io,
        mime,
        df;
        alignment = :c,
        header_alignment = :c,
        title = title,
        kwargs...
    )
end

function Base.show(io::IO, mime::MIME"text/html", dfcs::DataFrames.DataFrameColumns; kwargs...)
    DataFrames._verify_kwargs_for_html(; kwargs...)
    df = DataFrames.parent(dfcs)
    title = "$(nrow(df))×$(ncol(df)) DataFrameColumns"

    return DataFrames._show(
        io,
        mime,
        df;
        alignment = :c,
        header_alignment = :c,
        title = title,
        kwargs...
    )
end

function Base.show(io::IO, mime::MIME"text/html", dfr::DataFrameRow; kwargs...)
    DataFrames._verify_kwargs_for_html(; kwargs...)
    r, c = parentindices(dfr)
    title = "DataFrameRow ($(length(dfr)) columns)"
    return DataFrames._show(
        io,
        mime,
        view(parent(dfr), [r], c);
        alignment = :c,
        header_alignment = :c,
        rowid = r,
        title = title,
        kwargs...
    )
end

This code overloads all of the HTML printing methods in DataFrames to add center alignment. This approach should be the easiest one until I can implement theme support in PrettyTables.jl.

ronisbr commented 8 months ago

> Could you please submit a PR with the proposed changes?

Sure! But what version do you prefer?

> it would be nice to get this fixed in Quarto itself or change the DataFrames.jl defaults to centered alignment.

Quarto v1.4 will have an option to disable table processing globally. Maybe this will fix everything. I think everything will also be fixed if DataFrames changes the header alignment to match the column alignment. I remember that we discussed this alignment, and for some reason I cannot recall, it was decided to always align the header to the left.

juliohm commented 8 months ago

> what version do you prefer?

The first one, where we change the specific cells with DataFrame output. They only appear in chapter 01, where we introduce Tables.jl. All other tables in the book should be geotables, which are already centered.