shenwei356 / csvtk

A cross-platform, efficient and practical CSV/TSV toolkit in Golang
http://bioinf.shenwei.me/csvtk
MIT License

csvtk pretty option to wrap text within a column #206

Closed vkkodali closed 1 year ago

vkkodali commented 1 year ago

I think this option does not currently exist, so this is likely a feature request. The -W option of csvtk pretty is quite useful when dealing with tables that have some really wide fields, but it truncates them. I imagine it's not unusual to come across tables with (somewhat) wide fields whose full content we would still like to see. An option that wraps such text onto multiple lines would be helpful.

An example to illustrate this:

## test file
$ cat test.csv 
first_name,last_name,comment
John,Doe,short comment
Jane,Doe,somewhat long comment that I would like to see 
Jack,Doe,another not-so-long comment

## pretty print without column max width
$ csvtk pretty test.csv 
first_name   last_name   comment
----------   ---------   -----------------------------------------------
John         Doe         short comment
Jane         Doe         somewhat long comment that I would like to see 
Jack         Doe         another not-so-long comment

## pretty print with max col width
$ csvtk pretty -W 25 test.csv 
first_name   last_name   comment
----------   ---------   -------------------------
John         Doe         short comment
Jane         Doe         somewhat long comment tha
Jack         Doe         another not-so-long comme

## new option --wrap to wrap text in a column
$ csvtk pretty -W 25 --wrap test.csv
first_name   last_name   comment
----------   ---------   -------------------------
John         Doe         short comment
Jane         Doe         somewhat long comment 
                         that I would like to see
Jack         Doe         another not-so-long 
                         comment

avilella commented 1 year ago

I really like the idea of the -W 25 --wrap or --clip option.

I would use it often if it were available.

Bests,

shenwei356 commented 1 year ago

Yes, the idea is great. But the package https://github.com/tatsushid/go-prettytable does not support cells containing newline characters, while https://github.com/bndr/gotabulate seems to. If there's no existing way, I can implement one myself, but not today.
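
To make the limitation concrete: supporting newline characters means one logical row has to expand into several physical lines. Below is a minimal Python sketch of that idea. It is not the API of either package mentioned above; `render_row` is a hypothetical helper, and the three-space column gap is an assumption taken from the example output earlier in this thread.

```python
from itertools import zip_longest


def render_row(cells, widths, gap="   "):
    """Expand one logical row into physical lines, one per line inside a cell."""
    # Split every cell on newlines; cells with fewer lines are filled with "".
    split_cells = [cell.split("\n") for cell in cells]
    lines = []
    for parts in zip_longest(*split_cells, fillvalue=""):
        padded = (part.ljust(width) for part, width in zip(parts, widths))
        # Drop trailing padding so lines do not end in whitespace.
        lines.append(gap.join(padded).rstrip())
    return lines


row = ["Jane", "Doe", "somewhat long comment\nthat I would like to see"]
for line in render_row(row, [10, 9, 25]):
    print(line)
```

The continuation line starts with empty, padded cells, which is what keeps the wrapped text aligned under its own column.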

shenwei356 commented 1 year ago

Hi guys, I finally had time to implement this! I created a new package to support it and re-implemented the pretty command.

Examples

  1. Set the minimum and maximum width.

    $ csvtk pretty testdata/long.csv -w 5 -W 40
    id      name                 message
    -----   ------------------   ----------------------------------------
    1       Donec Vitae          Quis autem vel eum iure reprehenderit
                                 qui in ea voluptate velit esse.
    2       Quaerat Voluptatem   At vero eos et accusamus et iusto odio.
    3       Aliquam lorem        Curabitur ullamcorper ultricies nisi.
                                 Nam eget dui. Etiam rhoncus. Maecenas
                                 tempus, tellus eget condimentum
                                 rhoncus, sem quam semper libero.

  2. Clip cells instead of wrapping.

    $ csvtk pretty testdata/long.csv -w 5 -W 40 --clip
    id      name                 message
    -----   ------------------   ----------------------------------------
    1       Donec Vitae          Quis autem vel eum iure reprehenderit...
    2       Quaerat Voluptatem   At vero eos et accusamus et iusto odio.
    3       Aliquam lorem        Curabitur ullamcorper ultricies nisi....

  3. Change the output style.

    $ csvtk pretty testdata/long.csv -W 40 -S grid
    +----+--------------------+------------------------------------------+
    | id | name               | message                                  |
    +====+====================+==========================================+
    | 1  | Donec Vitae        | Quis autem vel eum iure reprehenderit    |
    |    |                    | qui in ea voluptate velit esse.          |
    +----+--------------------+------------------------------------------+
    | 2  | Quaerat Voluptatem | At vero eos et accusamus et iusto odio.  |
    +----+--------------------+------------------------------------------+
    | 3  | Aliquam lorem      | Curabitur ullamcorper ultricies nisi.    |
    |    |                    | Nam eget dui. Etiam rhoncus. Maecenas    |
    |    |                    | tempus, tellus eget condimentum          |
    |    |                    | rhoncus, sem quam semper libero.         |
    +----+--------------------+------------------------------------------+

  4. Use a custom delimiter for wrapping.

    $ csvtk pretty testdata/lineages.csv -W 60 -x ';' -S light
    ┌-------┬------------------┬--------------------------------------------------------------┐
    | taxid | name             | complete lineage                                             |
    ├=======┼==================┼==============================================================┤
    | 9606  | Homo sapiens     | cellular organisms;Eukaryota;Opisthokonta;Metazoa;Eumetazoa; |
    |       |                  | Bilateria;Deuterostomia;Chordata;Craniata;Vertebrata;        |
    |       |                  | Gnathostomata;Teleostomi;Euteleostomi;Sarcopterygii;         |
    |       |                  | Dipnotetrapodomorpha;Tetrapoda;Amniota;Mammalia;Theria;      |
    |       |                  | Eutheria;Boreoeutheria;Euarchontoglires;Primates;            |
    |       |                  | Haplorrhini;Simiiformes;Catarrhini;Hominoidea;Hominidae;     |
    |       |                  | Homininae;Homo;Homo sapiens                                  |
    ├-------┼------------------┼--------------------------------------------------------------┤
    | 562   | Escherichia coli | cellular organisms;Bacteria;Pseudomonadota;                  |
    |       |                  | Gammaproteobacteria;Enterobacterales;Enterobacteriaceae;     |
    |       |                  | Escherichia;Escherichia coli                                 |
    └-------┴------------------┴--------------------------------------------------------------┘
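
For readers curious how wrapping and clipping like this can work, here is a rough Python sketch. It is not the code of the new package: `wrap_cell` and `clip_cell` are hypothetical names, the greedy strategy is an assumption, and widths are counted in characters rather than display columns. Counting the trailing delimiter toward the width is also an assumption, chosen because it gives the same line breaks as the sample rows above.

```python
def wrap_cell(text, width, delimiter=" "):
    """Greedily wrap `text` at `delimiter`; force-split tokens wider than `width`."""
    # Tokens keep their trailing delimiter: "a;b;c" -> ["a;", "b;", "c"].
    parts = text.split(delimiter)
    tokens = [p + delimiter for p in parts[:-1]] + [parts[-1]]

    lines, line = [], ""
    for token in tokens:
        # The trailing delimiter counts toward the width (assumption).
        if len(line + token) <= width:
            line += token
            continue
        if line:
            lines.append(line.rstrip(" "))
        # A single token wider than the column is cut into width-sized pieces.
        while len(token.rstrip(" ")) > width:
            lines.append(token[:width])
            token = token[width:]
        line = token
    if line.strip(" "):
        lines.append(line.rstrip(" "))
    return lines or [""]


def clip_cell(text, width, marker="..."):
    """Cut `text` to `width` characters, ending with `marker` if it was cut."""
    if len(text) <= width:
        return text
    return text[: max(width - len(marker), 0)] + marker


message = "Quis autem vel eum iure reprehenderit qui in ea voluptate velit esse."
print(wrap_cell(message, 40))
print(clip_cell(message, 40))
print(wrap_cell("cellular organisms;Eukaryota;Opisthokonta", 30, delimiter=";"))
```

With a non-space delimiter such as `;`, the delimiter stays at the end of the line it terminates, as in the lineage example.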

Usage:

convert CSV to a readable aligned table

How to:
  1. First -n/--buf-rows rows are read to check the minimum and maximum widths
     of each column. You can also set the global thresholds -w/--min-width and
     -W/--max-width.
     1a. Cells longer than the maximum width will be wrapped (default) or
         clipped (--clip).
         Usually, the text is wrapped in space (-x/--wrap-delimiter). But if one
         word is longer than the -W/--max-width, it will be force split.
     1b. Texts are aligned left (default), center (-m/--align-center)
         or right (-r/--align-right).
  2. Remaining rows are read and immediately outputted, one by one, till the end.
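
The two steps above can be sketched in Python as follows. This is only an illustration of the buffering idea, not the actual implementation: the function names, the `buf_rows=100` default, and the three-space gap are assumptions, and wrapping or clipping of over-wide cells is left out to keep the sketch short.

```python
from itertools import islice


def column_widths(rows, min_width=0, max_width=0):
    """Widest cell per column, clamped to [min_width, max_width] (0 = no limit)."""
    widths = []
    for row in rows:
        for i, cell in enumerate(row):
            if i == len(widths):
                widths.append(0)
            widths[i] = max(widths[i], len(cell))
    clamped = []
    for w in widths:
        w = max(w, min_width)
        if max_width > 0:
            w = min(w, max_width)
        clamped.append(w)
    return clamped


def format_row(row, widths, align="left", gap="   "):
    """Pad each cell to its column width; over-wide cells are left untouched here."""
    pad = {"left": str.ljust, "center": str.center, "right": str.rjust}[align]
    return gap.join(pad(cell, w) for cell, w in zip(row, widths)).rstrip()


def pretty(rows, buf_rows=100, min_width=0, max_width=0, align="left"):
    """Yield formatted lines: buffer first, then stream the rest."""
    it = iter(rows)
    # Step 1: only the first buf_rows rows decide the column widths.
    buffered = list(islice(it, buf_rows))
    widths = column_widths(buffered, min_width, max_width)
    for row in buffered:
        yield format_row(row, widths, align)
    # Step 2: the remaining rows are formatted and emitted one by one.
    for row in it:
        yield format_row(row, widths, align)


rows = [["id", "size"], ["1", "Huge"], ["2", "Tiny"]]
for line in pretty(rows, buf_rows=2):
    print(line)
```

The trade-off of this design is that memory use stays bounded by the buffer size, while a cell in a later row that is wider than anything in the buffer no longer influences the column width.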

Styles:

  Some preset styles are provided (-S/--style).

    default:

        id   size
        --   ----
        1    Huge
        2    Tiny

    plain:

        id   size
        1    Huge
        2    Tiny

    simple:

        -----------
        id   size
        -----------
        1    Huge
        2    Tiny
        -----------

    grid:

        +----+------+
        | id | size |
        +====+======+
        | 1  | Huge |
        +----+------+
        | 2  | Tiny |
        +----+------+

    light:

        ┌----┬------┐
        | id | size |
        ├====┼======┤
        | 1  | Huge |
        ├----┼------┤
        | 2  | Tiny |
        └----┴------┘

    bold:

        ┏━━━━┳━━━━━━┓
        ┃ id ┃ size ┃
        ┣━━━━╋━━━━━━┫
        ┃ 1  ┃ Huge ┃
        ┣━━━━╋━━━━━━┫
        ┃ 2  ┃ Tiny ┃
        ┗━━━━┻━━━━━━┛

    double:

        ╔════╦══════╗
        ║ id ║ size ║
        ╠════╬══════╣
        ║ 1  ║ Huge ║
        ╠════╬══════╣
        ║ 2  ║ Tiny ║
        ╚════╩══════╝
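
One way to think about the bordered presets is as small tables of border characters. The Python sketch below is an illustration only: the `STYLES` layout and the `render` function are hypothetical, the characters are copied from the samples above, and the borderless styles (default, plain, simple) are left out.

```python
# Each style: (top rule, header rule, row rule, bottom rule, vertical bar).
# Each rule is a 4-character string: left corner, fill, junction, right corner.
STYLES = {
    "grid":   ("+-++", "+=++", "+-++", "+-++", "|"),
    "light":  ("┌-┬┐", "├=┼┤", "├-┼┤", "└-┴┘", "|"),
    "bold":   ("┏━┳┓", "┣━╋┫", "┣━╋┫", "┗━┻┛", "┃"),
    "double": ("╔═╦╗", "╠═╬╣", "╠═╬╣", "╚═╩╝", "║"),
}


def render(rows, style="grid"):
    """Render rows (first row is the header) with one of the bordered styles."""
    top, head, sep, bottom, bar = STYLES[style]
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]

    def rule(chars):
        left, fill, mid, right = chars
        # Each column is its width plus one space of padding on both sides.
        return left + mid.join(fill * (w + 2) for w in widths) + right

    def line(row):
        cells = (f" {cell.ljust(w)} " for cell, w in zip(row, widths))
        return bar + bar.join(cells) + bar

    out = [rule(top), line(rows[0]), rule(head)]
    for i, row in enumerate(rows[1:]):
        if i:  # separator between data rows, not before the first one
            out.append(rule(sep))
        out.append(line(row))
    out.append(rule(bottom))
    return "\n".join(out)


table = [["id", "size"], ["1", "Huge"], ["2", "Tiny"]]
print(render(table, "grid"))
print(render(table, "double"))
```

Keeping the characters in data rather than in code means a new preset is just one more entry in the table.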