Open max-mapper opened 10 years ago
Given what gasket is intended to do, I think the first option seems best.
I think the goal of gasket should be to be as explicit and low level as possible... (I'm channeling my inner @mafintosh here)
Reading the three options above, I kind of don't like any of them now. I'd prefer something like this:
{
  "gasket": {
    "main": [
      {
        "command": "gasket run import -- http://www.fcc.gov/files/ecfs/14-28/14-28-RAW-Solr-1.xml",
        "type": "serial"
      },
      {
        "command": "gasket run import -- http://www.fcc.gov/files/ecfs/14-28/14-28-RAW-Solr-1.xml",
        "type": "serial"
      }
    ]
  }
}
e.g. where everything is explicit. that way the gasket.json becomes lower level, and we can worry about user-friendliness in areas like this
we would have to define all of the different types, e.g. https://github.com/datproject/datscript/blob/master/example-bionode.ds#L7-L11
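To make the "define all of the different types" point concrete, here is a minimal sketch of a validator for the explicit format. It is not gasket's code: `validatePipeline` and the `KNOWN_TYPES` set are hypothetical, and only `"serial"` actually appears in the example above; `"parallel"` and `"pipe"` are assumed here for illustration.

```javascript
// Hypothetical set of step types. Only "serial" appears in the proposal;
// "parallel" and "pipe" are assumptions made for this sketch.
const KNOWN_TYPES = new Set(["serial", "parallel", "pipe"]);

// Returns a list of human-readable problems found in an explicit pipeline.
// An empty list means every step has a command and a known type.
function validatePipeline(steps) {
  if (!Array.isArray(steps)) {
    return ["pipeline must be an array of steps"];
  }
  const problems = [];
  steps.forEach((step, i) => {
    if (step === null || typeof step !== "object" || Array.isArray(step)) {
      problems.push(`step ${i}: must be an object`);
      return;
    }
    if (typeof step.command !== "string" || step.command.trim() === "") {
      problems.push(`step ${i}: "command" must be a non-empty string`);
    }
    if (!KNOWN_TYPES.has(step.type)) {
      problems.push(`step ${i}: unknown "type" ${JSON.stringify(step.type)}`);
    }
  });
  return problems;
}

// Usage: a well-formed step produces no problems.
console.log(validatePipeline([
  { command: "gasket run import -- some-file.xml", type: "serial" }
]));
```

Because every step carries its own type, a check like this stays flat and never has to guess what the user meant.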
I actually really prefer this to the previously mentioned approaches.
So would the different types correspond to run, then, pipe, fork? For example, if a user uses "then", the type would be "serial"? If a user uses "run", the type would be "parallel"?
@maxogden i like this. so all gasket commands are just simple non nested arrays right (no commands in commands)? and in case you need to nest them you would split them into separate gasket pipelines?
is it more common for people to run parallel or serial jobs?
Would it be too weird if we inspected the commands looking for '|' or '&&' at the end to decide whether it pipes or serializes?
hey @gtramontina ! that could work fine. we aren't exactly sure who is going to use gasket yet. we are trying to build it as an easier abstraction than bash, and so leaving those characters out was the first idea. one complication I see is that | or && could mean "or"/"and" respectively depending on the user's background
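For reference, the trailing-operator idea could look roughly like the sketch below. Everything in it is an assumption: `inferStep` is a hypothetical helper, and the mapping of a trailing `&&` to `"serial"` and a trailing `|` to `"pipe"` is just one reading of the suggestion.

```javascript
// Hypothetical helper: inspects the end of a command string and strips a
// trailing "&&" or "|" to decide how the step relates to the next one.
// A type of null means "no operator found, use whatever the default is".
function inferStep(rawCommand) {
  const trimmed = rawCommand.trim();
  if (trimmed.endsWith("&&")) {
    return { command: trimmed.slice(0, -2).trim(), type: "serial" };
  }
  // A trailing "||" is left alone so it is not mistaken for a pipe.
  if (trimmed.endsWith("|") && !trimmed.endsWith("||")) {
    return { command: trimmed.slice(0, -1).trim(), type: "pipe" };
  }
  return { command: trimmed, type: null };
}

console.log(inferStep("cat data.json |"));
console.log(inferStep("gasket run import -- some-file.xml &&"));
```

The sketch also shows the cost: the meaning of each step is hidden inside the command string, which is the ambiguity raised above.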
consider this use case:
https://gist.github.com/maxogden/80de2ba6a6f52ff382e3
the `null`s are currently the only way to tell gasket to run the pipeline one at a time (serially). if the `null`s are removed then all of the `gasket run import --` lines would be spawned at once, which technically works but causes my computer to almost die

so what would be a better api for disabling the auto pipe mode?
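As a rough illustration of the behavior described here, the sketch below treats each `null` as a barrier: entries between two `null`s form one group, and groups run one after another. `splitOnNull` is a hypothetical helper written for this example, not gasket's actual implementation.

```javascript
// Splits a flat pipeline array into groups separated by null entries.
// Under the assumption above, each group would be started only after the
// previous group has finished.
function splitOnNull(pipeline) {
  const groups = [];
  let current = [];
  for (const entry of pipeline) {
    if (entry === null) {
      // A null closes the current group; empty groups are skipped.
      if (current.length > 0) groups.push(current);
      current = [];
    } else {
      current.push(entry);
    }
  }
  if (current.length > 0) groups.push(current);
  return groups;
}

// With nulls: two groups of one command each, run one at a time.
console.log(splitOnNull(["import a", null, "import b"]));
// Without nulls: a single group, so everything starts together.
console.log(splitOnNull(["import a", "import b"]));
```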
ideas:

1: make the `main` pipeline an object instead of an array and add an option to change behavior, e.g. `"serial": true`. instead of `"serial": true` it could be `"parallel": false` or `"pipe": false`

2: make `"pipe": false` the default. then you could just list the commands in a plain array and they would be spawned/run one at a time and not get piped to each other. to get them to pipe together you would have to use the syntax from option 1

3: have 2 top level default keys for 'parallel' and 'serial' commands, e.g. doing `gasket run pipe` would act differently from `gasket run serial` (this one might be too magic). also i don't like the names `serial` and `pipes` that much

thoughts?
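To compare ideas 1 and 2 side by side, here is a small sketch of how a loader might normalize both shapes. It is only an illustration under stated assumptions: `normalizeMain` is hypothetical, the `commands` key for the object form is invented for this example, and only the `"pipe"` spelling of the option is handled.

```javascript
// Normalizes a `main` entry into { commands, pipe }.
// - Array form (idea 2): commands run one at a time, not piped.
// - Object form (idea 1): piping is opt-in via "pipe": true.
// The `commands` key is an assumption; the thread does not name it.
function normalizeMain(main) {
  if (Array.isArray(main)) {
    return { commands: main.slice(), pipe: false };
  }
  const commands = Array.isArray(main.commands) ? main.commands.slice() : [];
  return { commands, pipe: main.pipe === true };
}

// A bare array stays serial.
console.log(normalizeMain(["import a", "import b"]));
// An object has to ask for piping explicitly.
console.log(normalizeMain({ commands: ["cat data.json", "transform"], pipe: true }));
```

Making serial the default keeps the common case short, at the cost of a more verbose object form whenever commands need to be piped together.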