Open dclong opened 6 years ago
q
follows UNIX philosophy, therefore it's supposed to be predictable rather than trying to be smart. That also means it works great together with other UNIX tools. If a tool breaks that contract by outputting superfluous ornaments you need to write a special filter for them to massage the output into a proper format. But even such tools often have a flag to produce a machine-readable output, therefore you should study their documentation carefully before investing time in writing a wrapper.
Why would that be an advantage? You could as well have a table with _1
, _2
etc as column names. And if you have column names you might as well read them in with -H
.
Hi @dclong , I'm glad you find q useful.
I agree with @bitti regarding fitting the unix philosophy, I believe that data cleanup/formatting should be done by using other, more suitable, linux tools. q doesn't attempt to replace the entire linux command-set, quite the contrary, at least from my viewpoint.
Regarding the automatic column name, I don't really believe that there's a significant difference between choosing cN and _N, both are short enough and anyone can get used to either without any implications.
As for "auto-completing" column names, I believe that something like that would be less predictable and stable. For example, think about a query which uses a column prefix (e.g. phone
) that runs on a csv file. If the user will run the same query on a newer version of the csv which contains another column with the same prefix (e.g. phone2
), then the query would either break or provide incorrect results. Such behavior is not something I would expect from a tool such as q, and the benefits of such auto-completion are marginal in my view.
The main advantages of q from my point of view are:
I'm not expecting q's traits to be suitable for every use case. All I can hope is that some users will find it useful for them.
Harel
The tool q is useful but has limitations. I think the following features can be really useful additions to it.
_n
is probably better thancn
(where n is an integer) as users might have text files usingc1
,c2
, etc. as column names. It's even better if we can reference a column using first k characters of the column names if it can already unique identify a column. For example, if a column is namedrepository
, allow users to userep
to reference the column is there's no other columns starting withrep
.Many people use shell because it's quick and dirty. If q is not "quick", there's not much advantage over paring results using awk or even Python.