shenwei356 / csvtk

A cross-platform, efficient and practical CSV/TSV toolkit in Golang
http://bioinf.shenwei.me/csvtk
MIT License

Data replacement #205

Closed MostafaYA closed 1 year ago

MostafaYA commented 1 year ago

Hi, I want to convert table A (below) to a binary table (Table B). Thanks

Table A

sample    Gene 1      Gene 2  Gene 3      gene N
Sample 1  yes         -       incomplete  framshift
Sample 2  noncoding   -       -           -
Sample 3  other text  -       -           -

Table B

sample    Gene 1  Gene 2  Gene 3  gene N
Sample 1  1       0       1       1
Sample 2  1       0       0       0
Sample 3  1       0       0       0

shenwei356 commented 1 year ago
$ cat t.txt \
    | csvtk replace -t -f 2-5 -p '^-$' -r 0  \
    | csvtk replace -t -f 2-5 -p '[^0]+' -r 1

sample  Gene 1  Gene 2  Gene 3  gene N
Sample 1        1       0       1       1
Sample 2        1       0       0       0
Sample 3        1       0       0       0

MostafaYA commented 1 year ago

It works, except when a cell value contains "0". Here is another example:

$ cat example
sample  Gene 1  Gene 2  Gene 3  gene N
Sample 4        gyrB_V130I      gyrB_V1100I     -       -

$ cat example | csvtk replace -t -f 2-5 -p '^-$' -r 0 | csvtk replace -t -f 2-5 -p '[^0]+' -r 1
sample  Gene 1  Gene 2  Gene 3  gene N
Sample 4        101     1001    0       0

shenwei356 commented 1 year ago

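Use a special character that does not occur in your data (here `@`) as a temporary placeholder. The pattern `[^0]+` only matches runs of non-zero characters, so any "0" inside a value is left in place; marking the `-` cells with a placeholder first avoids this.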

cat t.txt \
    | csvtk replace -t -f 2-5 -p '^-$'   -r @ \
    | csvtk replace -t -f 2-5 -p '[^@]+' -r 1 \
    | csvtk replace -t -f 2-5 -p '^@$'   -r 0

sample  Gene 1  Gene 2  Gene 3  gene N
Sample 1        1       0       1       1
Sample 2        1       0       0       0
Sample 3        1       0       0       0
Sample 4        1       1       0       0
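
If no safe placeholder character is available, the same conversion can be done in one pass by comparing whole cell values instead of matching characters. The following is only a sketch using plain awk rather than csvtk; `to_binary` is a hypothetical helper name. It assumes tab-separated input with a single header line and treats every cell that is not exactly `-` as 1, including empty cells.

```shell
# Hypothetical helper (not part of csvtk): binarize columns 2..NF of a TSV on stdin.
# The header line is printed unchanged; a cell equal to "-" becomes 0, anything else 1.
to_binary() {
    awk 'BEGIN { FS = OFS = "\t" }
         NR == 1 { print; next }
         {
             for (i = 2; i <= NF; i++) $i = ($i == "-") ? 0 : 1
             print
         }'
}

# Example with the row that broke the '[^0]+' pattern.
printf 'sample\tGene 1\tGene 2\tGene 3\tgene N\nSample 4\tgyrB_V130I\tgyrB_V1100I\t-\t-\n' \
    | to_binary
```

Because each cell is compared as a whole string, zeros inside values such as `gyrB_V130I` do not affect the result.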