shenwei356 / csvtk

A cross-platform, efficient and practical CSV/TSV toolkit in Golang
http://bioinf.shenwei.me/csvtk
MIT License
1k stars 84 forks

csvtk join of one column in file A to many possible columns in file B #149

Closed avilella closed 3 years ago

avilella commented 3 years ago

Hi all,

Another question about how to best perform a query in csvtk:

I have a file A with a set of DNA sequences, some of which are prefixes of others, and I want to pair up columnH and columnL of fileA with many possible column pairs in fileB, e.g.

csvtk join -k -f "columnH,columnL;column1H,column1L" fileA fileB

So far so good, but now I want the rows that didn't match to be reattempted with columns "column2H,column2L" in fileB, then the rows that still don't match reattempted against "column3H,column3L" in fileB, etc.

In PostgreSQL, this would be a matter of using the OR operator in the JOIN condition.

I am wondering if there is an equivalent in csvtk that can be concocted for this,

Thanks in advance,

shenwei356 commented 3 years ago

This looks like a specific situation and a hard job for join.

csvtk grep supports searching on multiple columns in an OR-like way.
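
For readers who want the fallback behaviour described in the question, here is a minimal sketch in plain Python rather than csvtk. It is not a csvtk feature and not the maintainer's suggestion; it only illustrates one way to read "try column pair 1, then retry the leftovers on pair 2, and so on". The function name `cascade_join`, the `matched_on` column, the `id` column and the sample data are all assumptions made up for this example; only the column names columnH, columnL, column1H, column1L, column2H and column2L come from the question.

```python
import csv
import io


def cascade_join(rows_a, rows_b, keys_a, key_pairs_b):
    """Join rows of A to rows of B, trying each B column pair in order.

    Every row of A is first looked up using the first pair in key_pairs_b.
    Only rows that found no match are retried against the next pair.
    Returns (joined_rows, unmatched_rows_of_a).
    """
    # Build one lookup table per candidate column pair in B.
    indexes = []
    for pair in key_pairs_b:
        index = {}
        for row in rows_b:
            index.setdefault(tuple(row[c] for c in pair), []).append(row)
        indexes.append(index)

    joined, unmatched = [], []
    for row_a in rows_a:
        key = tuple(row_a[c] for c in keys_a)
        for pair, index in zip(key_pairs_b, indexes):
            hits = index.get(key)
            if hits:
                for row_b in hits:
                    # On a column-name clash, the value from A is kept.
                    merged = {**row_b, **row_a}
                    # Hypothetical extra column recording which pair matched.
                    merged["matched_on"] = ",".join(pair)
                    joined.append(merged)
                break  # stop at the first pair that matches
        else:
            unmatched.append(row_a)
    return joined, unmatched


# Made-up sample data standing in for fileA and fileB.
FILE_A = """columnH,columnL
ACGT,TTGA
GGCC,AATT
CCCC,GGGG
"""

FILE_B = """id,column1H,column1L,column2H,column2L
b1,ACGT,TTGA,NNNN,NNNN
b2,TTTT,TTTT,GGCC,AATT
"""

rows_a = list(csv.DictReader(io.StringIO(FILE_A)))
rows_b = list(csv.DictReader(io.StringIO(FILE_B)))

joined, unmatched = cascade_join(
    rows_a,
    rows_b,
    keys_a=("columnH", "columnL"),
    key_pairs_b=[("column1H", "column1L"), ("column2H", "column2L")],
)

for row in joined:
    print(row["columnH"], row["columnL"], row["id"], row["matched_on"])
print("unmatched:", unmatched)
```

Unlike a plain SQL OR join, which returns a row for every pair that matches, this sketch stops at the first pair that matches, which is closer to the "reattempt only the ones that didn't match" wording in the question. Real files would be read with `csv.DictReader(open(path, newline=""))` instead of the in-memory strings.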