Closed infogulch closed 2 years ago
I think this should be some other tool. I'm quite certain they already exist as well, although I don't have a link handy.
But yeah, I'm not one for this kind of scope increase. Another alternative to the feature requests you reference is to only implement a subset of them. Or implement simpler versions of them.
It was never my intent for xsv to support arbitrary relational algebra, like that found in relational databases. In my view, if you need that, you should just use the relational database and cut out the middle man. Write a wrapper script if you must, if it is too inconvenient today.
It sounds crazy, but hear me out.
More database-like functionalities are often requested. Features like arbitrary expressions, type awareness, advanced filtering and joining, aggregations, pivot/unpivot, etc. Sometimes subsets of these features are reasonable to consider for inclusion in xsv, other times the conclusion to such requests is to use a more fully-featured database like SQLite directly. While SQLite does support csv, it's only via an extension that must be compiled and loaded separately, and there's a lot of boilerplate to get it up and running. Enter the proposal:
What if xsv embedded SQLite and exposed a new subcommand that supports executing SQLite queries over csv files directly? I don't think such a tool exists today.
This would be a built-in escape hatch that instantly resolves the need to implement advanced database-like functionality, upgrading xsv from "powerful slicing tool for broad-stroke csv manipulation" to "complete csv management and analysis package". All told, it may be easier to implement this than all the advanced features currently considered and open in the issue tracker right now.
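To make the idea concrete, here is a rough sketch of what such a subcommand would do under the hood. It is written in Python with only the standard library rather than in Rust, and it is not how xsv works today. The function name `query_csv` and the default table name `data` are made up for this example. Note that it loads the whole input into an in-memory database, so it does nothing to avoid the memory concern raised below.

```python
import csv
import io
import sqlite3


def quote_ident(name):
    """Quote an SQL identifier so headers with spaces or keywords still work."""
    return '"' + name.replace('"', '""') + '"'


def query_csv(csv_text, sql, table="data"):
    """Load CSV text into an in-memory SQLite table and run one query on it.

    Hypothetical helper for illustration only. Every column is created
    without a declared type and every value is inserted as text, so
    queries need an explicit CAST for numeric work.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # first row supplies the column names
    columns = ", ".join(quote_ident(name) for name in header)
    placeholders = ", ".join("?" for _ in header)

    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(f"CREATE TABLE {quote_ident(table)} ({columns})")
        # The remaining rows of the reader are inserted one by one.
        conn.executemany(
            f"INSERT INTO {quote_ident(table)} VALUES ({placeholders})", reader
        )
        cursor = conn.execute(sql)
        names = [col[0] for col in cursor.description]
        return names, cursor.fetchall()
    finally:
        conn.close()
```

Usage would look something like `query_csv(text, "SELECT country, SUM(CAST(population AS INTEGER)) FROM data GROUP BY country")`, which covers aggregation without xsv needing its own expression language.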
Some reasonable objections:
query subcommand would be outside xsv's control, and SQLite may try to load the whole csv file into memory. That said, the best tool to split a large csv into many smaller ones for more advanced processing is quite nearby...
Thoughts?