brimdata / super

An analytics database that puts JSON and relational tables on equal footing
https://zed.brimdata.io/
BSD 3-Clause "New" or "Revised" License

Add mechanism to pass external arguments into Zed program #2223

Open philrz opened 3 years ago

philrz commented 3 years ago

Consider the following, invoked with zq commit 82630a52.

$ cat mbcalc.zs 
const MB=1024*1024
uid=CfogvV3QTDagvaIzBl | put orig_resp_mbytes=resp_bytes/MB | cut uid,resp_bytes,orig_resp_mbytes

$ zq -Z -I mbcalc.zs conn.log.gz
{
    uid: "CfogvV3QTDagvaIzBl" (bstring),
    resp_bytes: 1543916493 (uint64),
    orig_resp_mbytes: 1472
} (=0)

As a user, what I'd like to be able to do next is express the uid=CfogvV3QTDagvaIzBl part on the command line, since it's the "thing I would change a lot", while the leading const and the downstream put and cut are "the things that get re-used a lot as-is".

@mccanne proposed the following approach, which I agree would do the trick:

$ cat mbcalc_mod.zs
const MB=1024*1024
uid=args[0] | put orig_resp_mbytes=resp_bytes/MB | cut uid,resp_bytes,orig_resp_mbytes

$ zq -Z -I mbcalc_mod.zs conn.log.gz -- CfogvV3QTDagvaIzBl

That is, everything after the -- would be collected into args[]. Making it an array means a script could reference several separate pieces of CLI-provided Zed (args[0], args[1], and so on) at different points in the included Zed script, which seems handy.
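Until something like this exists in zq itself, the behavior can be approximated outside of Zed. Below is a minimal sketch of a wrapper that textually replaces each args[N] reference in a script with the Nth command-line value before the script is handed to zq. The function name substitute_args is hypothetical, and plain text substitution is only an assumption about how args[] might behave; a real implementation would presumably resolve these inside the Zed compiler.

```python
import re

# Matches the args[N] placeholder from the proposal above (hypothetical syntax).
_ARG_REF = re.compile(r"\bargs\[(\d+)\]")


def substitute_args(zed_text, args):
    """Replace every args[N] in zed_text with the raw text of args[N]."""

    def replace(match):
        index = int(match.group(1))
        if index >= len(args):
            raise IndexError(
                f"script references args[{index}] but only "
                f"{len(args)} argument(s) were given"
            )
        # The value is inserted as-is, i.e. as a piece of Zed, not as a
        # quoted string literal.
        return args[index]

    return _ARG_REF.sub(replace, zed_text)


script = (
    "const MB=1024*1024\n"
    "uid=args[0] | put orig_resp_mbytes=resp_bytes/MB "
    "| cut uid,resp_bytes,orig_resp_mbytes"
)
print(substitute_args(script, ["CfogvV3QTDagvaIzBl"]))
```

Because the values are pasted in verbatim, this is only suitable for arguments you trust; anything typed after the -- becomes part of the program.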

philrz commented 3 years ago

A community user came up with a request for similar functionality. In their own words:

Had an idea this morning while working on a zed script and was thinking about a way to use scripts over and over again but they are hardcoded with the pool names.  I know I can use a bash wrapper script but I wonder if zed scripts could accept command line arguments?  For instance, the sharkfest presentation that Steve gave used a small script that's good as the example. So bash has the ability to use command line arguments and access them with $1, $2...etc .  This way I can change the command line arguments and easily reuse the zed script without a wrapper.

zed query -I join-badguys.zed demo.pcap BadGuys | zed load -

$ cat join-badguys.zed
from (
  $1@main => sort id.resp_h ;
  $2@main => sort addr ;
)
| inner join on id.resp_h=addr badguy:=true
| _path:=has(_path) ? "badguy:"+_path : "badguy"

I don't know how much of a pain this would be to implement.  It would make things a bit easier but also anyone that's playing with zed scripts can probably make a wrapper in whatever language they are familiar with.
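For reference, the kind of wrapper the user mentions could look roughly like the sketch below, which expands bash-style $1, $2 placeholders (1-based) in the script text. The function name expand_positional is made up for illustration, and nothing here is existing zed behavior; the expanded text would then be passed to zed query as the query.

```python
import re

# Bash-style positional placeholders: $1, $2, ... (1-based).
_POSITIONAL = re.compile(r"\$(\d+)")


def expand_positional(zed_text, params):
    """Replace $N in zed_text with the Nth entry of params (1-based)."""

    def replace(match):
        n = int(match.group(1))
        if n < 1 or n > len(params):
            raise ValueError(f"no value supplied for ${n}")
        return params[n - 1]

    return _POSITIONAL.sub(replace, zed_text)


template = """from (
  $1@main => sort id.resp_h ;
  $2@main => sort addr ;
)
| inner join on id.resp_h=addr badguy:=true
| _path:=has(_path) ? "badguy:"+_path : "badguy"
"""

# Pool names taken from the command line shown in the comment above.
print(expand_positional(template, ["demo.pcap", "BadGuys"]))
```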

philrz commented 5 months ago

A community zync user also just requested this functionality, though in their case they're submitting the queries over the Zed lake API rather than the command line. In their own words:

Can anyone tell me if it is possible to receive arguments in a Zed query? ...how to parameterize a value that is coming from outside of the 'from'. For instance, in this example, is it possible to receive a value coming from the API call and pass it as a parameter to the find_liked operator? In this modified example, arg_taste would be received when the query is called.

op find_liked(taste): (
   likes==taste
)

from people.ndjson
| find_liked(arg_taste)

@nwt pointed out that SQL APIs like ODBC and JDBC have the moral equivalent (bound query parameters), so our API eventually should too.
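As a stopgap for API callers, one option is to bind values on the client side by prepending const declarations to the query text before submitting it. The sketch below illustrates the idea; bind_params and zed_literal are hypothetical helper names, and it assumes that JSON-style string quoting is acceptable as a Zed string literal and that a const may precede the op declaration, neither of which has been verified here.

```python
import json
import re

# Conservative identifier check for parameter names (an assumption,
# not taken from the Zed grammar).
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def zed_literal(value):
    """Render a Python value as literal text (assumed valid in Zed)."""
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        # json.dumps adds double quotes and escapes embedded quotes.
        return json.dumps(value)
    raise TypeError(f"unsupported parameter type: {type(value).__name__}")


def bind_params(query, params):
    """Prepend one 'const name=value' line per parameter to the query."""
    lines = []
    for name, value in params.items():
        if not _IDENT.match(name):
            raise ValueError(f"invalid parameter name: {name!r}")
        lines.append(f"const {name}={zed_literal(value)}")
    return "\n".join(lines + [query])


query = """op find_liked(taste): (
   likes==taste
)

from people.ndjson
| find_liked(arg_taste)"""

print(bind_params(query, {"arg_taste": "salty"}))
```

Unlike pasting raw text into the query, the value is quoted before it is inserted, which is closer to what bound parameters in ODBC or JDBC provide, though a server-side mechanism would still be the more robust answer.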