shenwei356 / csvtk

A cross-platform, efficient and practical CSV/TSV toolkit in Golang
http://bioinf.shenwei.me/csvtk
MIT License

Feature request: missing data handling in plotting functions #188

Closed: taprs closed this issue 2 years ago

taprs commented 2 years ago

Hi Wei, thank you for your great work!

It would be cool to make plotting functions work when there are rows with missing data. With csvtk v0.24.0 an error pops up when trying to plot a histogram with missing data:

$ echo $'1,1\n1,2\n,2'
1,1
1,2
,2
$ echo $'1,1\n1,2\n,2' | csvtk plot hist -f 1
[ERRO] fail to parse data:  at column: 1. please choose the right column by flag -f (--data-field)

shenwei356 commented 2 years ago

Added a new flag --skip-na to csvtk plot:

  --na-values strings    NA values, case ignored (default [,NA,N/A])
  --skip-na              skip NA values in --na-values

Remember to use -H for data with no header line:

echo $'1,1\n1,2\n,2' | csvtk plot hist -H -f 1 --skip-na | display
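For readers still on a version without --skip-na, the same effect can probably be approximated by filtering the input before plotting. The sketch below is not csvtk's implementation: skip_na is a hypothetical helper name, and it assumes simple CSV with no quoted fields that contain commas. It drops every row whose chosen column is one of the default NA values listed above (empty, NA, N/A; case ignored).

```shell
# Hypothetical helper, not part of csvtk: drop rows whose chosen column
# holds an NA value (empty, NA or N/A; case ignored).
# Assumes simple CSV: no quoted fields containing commas.
skip_na() {
    # $1 is the 1-based column index; CSV is read from stdin.
    awk -F, -v col="$1" '
        # Referencing an array element creates it, so this builds the NA set.
        BEGIN { na[""]; na["na"]; na["n/a"] }
        # Print only rows whose column value is not in the NA set.
        !(tolower($col) in na)
    '
}

# Example: keep only the rows with a usable value in column 1.
printf '1,1\n1,2\n,2\nNA,3\n' | skip_na 1
```

The filtered stream could then be piped into csvtk plot hist -H -f 1 as in the command above, without --skip-na.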

Binaries:

taprs commented 2 years ago

Hmm, but why not make it the default behavior, at least for empty cells?

shenwei356 commented 2 years ago

Hmm, that seems reasonable.