Open barjin opened 1 month ago
Indeed, CLI would need more love and has a lot of opportunities to improve the DX!
Two points from my side:
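Thank you for a great proposal, @barjin! Looking at it, I have several thoughts to kick off the conversation with:
- Do you intend to remove the `apify vis` command? That is a useful functionality I have benefited from. `apify validate` namespace and then `input-schema` sub-command maybe?
- Regarding the current CLI design, my biggest confusion is that there isn't clear expectation setting whether the action is going to be performed locally or remotely. Shouldn't it be addressed? What expectation should it default to for all commands? I'd be very specific in every command description whether it's local or remote. I'm almost inclined to say almost everything should default to `local`, except the `run` and `task` namespaces for obvious reasons. :)
- I'm thinking `apify actor execute` or `apify run create` instead of `actor run`. I'd suggest preventing the overload of the term Run. The Run (noun) is an Actor executed in the Apify Platform.
- What about the `apify build` namespace? I think the `attach` or `ssh` command would be super helpful here during development as well. When the build fails, I'd love to have a window I could connect to and debug it remotely. (the `docker run --rm -it --entrypoint bash <image:tag>` analogy)
Just fyi, in Actor Whitepaper and SDK/clients, we took care to use consistent naming for "runs", "call", "start" etc. It's important for CLI to be consistent with this too.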
- Do you intend to remove the `apify vis` command? That is a useful functionality I have benefited from. `apify validate` namespace and then `input-schema` sub-command maybe?
No plans to remove any commands; at most we'd reorganize existing ones and add missing ones.
Not sure if it makes sense to have a `validate` scope... Ideally it'd also go into the `actor` scope imo.
- Regarding the current CLI design, my biggest confusion is that there isn't clear expectation setting whether the action is going to be performed locally or remotely. Shouldn't it be addressed? What expectation should it default to for all commands? I'd be very specific in every command description whether it's local or remote. I'm almost inclined to say almost everything should default to `local`, except the `run` and `task` namespaces for obvious reasons. :)
When it comes to actors, everything is on the platform except `apify run` (and some other unrelated commands). I can definitely see why Actor Run would cause confusion. I'll need to recheck if the whitepaper covers this.
- I'm thinking `apify actor execute` or `apify run create` instead of `actor run`. I'd suggest preventing the overload of the term Run. The Run (noun) is an Actor executed in the Apify Platform.
Both of these still have the issue of "where does this run". `apify call` makes the actor run on the platform. `apify run` runs the actor locally as if it was on the platform. `apify actor execute` sounds like platform to me, same with `apify run create`. Maybe `apify local run` would make more sense? cc @jancurn
- What about the `apify build` namespace? I think the `attach` or `ssh` command would be super helpful here during development as well. When the build fails, I'd love to have a window I could connect to and debug it remotely. (the `docker run --rm -it --entrypoint bash <image:tag>` analogy)
+1 to the `build` namespace, definitely if we want to cover the actor versioning part of our API.
But attach/ssh are features that the platform (to my knowledge) doesn't have right now, and that I doubt will come... At most, maybe we should have a command that simulates an actor build locally (so all the steps the platform would do to build an image)? But that requires extra setup from users (Docker, etc.).
Thank you for a great proposal, @vladfrangu!
Touché, but I'll let it slip for now 😄
`run` is overloaded
That's true - my motivation behind `apify actor run` was all the CLI tools I've used in the past 5 years (Docker, Go compiler, Cargo) - they all have the `xxx run` command that... well, runs stuff. Imo it would be a shame if we had to go with something like `execute` - which, e.g. in Docker, has different semantics. (`actor execute` also sounds like something from the USSR's Great Purge period (: )
It also made me think - `apify run` currently runs the Actor locally (in line with all the other CLI tools above), so having all the `apify run ls` / `apify run rm` commands (which would list the "Run instances" on the platform) might get confusing for some.
It's a real pickle, but I still think that `apify actor run` is the cleanest way out.
whitepaper
We'll need to support `actor call` for calling Actors on the platform by name - all our other tools do have that. Maybe it's not that much of a problem, though:
- `apify actor run` could run the current Actor locally (without options) or remotely (with, let's say, a `--remote` flag).
  - Instead of `apify actor push && apify actor call [actor name]` with every local change, just `apify actor run --remote` and wait and watch. (we could always force this by `apify actor run --remote --force-build`)
- `apify actor call --input ... [actor name]` would just find the Actor by name on the platform and run it (useful if you just want the data scraped by a third-party Actor).
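To make the "where does this run" question concrete, here is a minimal sketch of the rule described in the bullets above. The function name `execution_target` and the `"local"` / `"platform"` labels are made up for illustration; nothing here reflects how the Apify CLI is actually implemented.

```python
def execution_target(action: str, remote: bool = False) -> str:
    """Return where a hypothetical `apify actor <action>` invocation would execute.

    Assumed rule, taken from the discussion above:
    - `actor call` always targets the platform (it looks the Actor up by name);
    - `actor run` is local by default and targets the platform with `--remote`.
    """
    if action == "call":
        return "platform"
    if action == "run":
        return "platform" if remote else "local"
    raise ValueError(f"unknown action: {action!r}")
```

With this rule, the only command whose target depends on a flag is `actor run`, which would let every command description state its target up front.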
`build` namespace
Sounds good to me, one small thing - if the build fails (even in Docker), you cannot really attach to anything, right? Having some sort of `ssh` to the Actor would be sick, but probably not doable right now as @vladfrangu mentions... But we're still exposing that one HTTP port... maybe a Cloud console à la GCP? This would be hard to standardize, though. For starters, I would be happy with just the Actor's `stdout` being redirected to my (local) terminal.
I suppose `apify actor run` and `apify actor call` are fine, and the latter is consistent with `Actor.call` in the Apify SDK. But we need to keep `apify run` and `apify call` for backwards compatibility anyway :)
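One way to keep the old top-level commands working is to rewrite them into the namespaced form before dispatch. The sketch below is only an illustration: `LEGACY_ALIASES` and `rewrite_legacy` are hypothetical names, and it does not try to resolve the overlap with a possible `apify run ls` namespace raised earlier in the thread.

```python
from typing import Dict, List

# Hypothetical mapping from today's top-level commands to the namespaced ones
# discussed above; the real CLI may handle compatibility differently.
LEGACY_ALIASES: Dict[str, List[str]] = {
    "run": ["actor", "run"],
    "call": ["actor", "call"],
}


def rewrite_legacy(argv: List[str]) -> List[str]:
    """Translate a legacy argument list (without the program name) to the new form."""
    if argv and argv[0] in LEGACY_ALIASES:
        return LEGACY_ALIASES[argv[0]] + argv[1:]
    # Anything else is passed through unchanged.
    return list(argv)
```

For example, `rewrite_legacy(["call", "my-actor"])` yields the argument list for `apify actor call my-actor`, so both spellings could share one implementation.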
For reference, working document is now at https://www.notion.so/apify/New-CLI-Design-a8751a53896e472a9c8f474669f6f5d5
The current state of the Apify CLI API is dubious. As a user, I’m always doubting the difference between `apify run` and `apify call`, whether `apify create` does something locally or remotely, why `apify actor:get-input` exists… etc.
I also like to think about things in hierarchies. For any system with more than 3 features, hierarchy is imo everything. Even Apify (Console) has it - there are separate tabs for Actors, Tasks, Proxy settings, Storages etc.
The current CLI API doesn’t really reflect this.
If you look at any other CLI tool that works with similar resources, they are miles ahead - see `docker` / `podman`, AWS CLI etc.
Some (incomplete) examples / ideas:
- `apify actor info`
  - `.actor` file and logs info
- `apify actor create --template=[] [name]`
- `apify actor init [name]`
  - `actor init` now
- `apify actor push`
- `apify actor pull`
- `apify actor run [--remote] [--input=INPUT.JSON] [actor id]`
  - `run` and `call` combined (switch with the `--remote` flag or by setting the `actor_id`)
- `apify actor ls`
- `apify actor rm [actor id]`
- `apify actor build [actor id]`
- `apify task ls`
- `apify task rm [actor id]`
- `apify task schedule [task id] [cron string]`
- `apify task create` / `add` / …?
- `apify run ls [--active|finished|aborted|...] [--actor-id=id]`
- `apify run rm [run id]`
- `apify run attach [run id]`
  - `stdout` to the user's terminal, I don't think we can do `stdin`. Still, it would be cool!
- `apify run resurrect [run id]`
- `apify run abort [run id]`
  - `apify run abort $(apify run ls --active -q)` ?
- `apify kvs create [name]`
- `apify kvs ls`
- `apify kvs ls [kvs id]`
- `apify kvs rm [name]`
- `apify kvs rename [name]`
- `apify kvs set [--bucket-id=ID] --key=[KEY] value`
  - `put-object`
  - `actor:get-input`
- `apify kvs get [--bucket-id=ID] --key=[KEY]`
  - `get-object`
- `apify dataset create / ls / rm / rename`
- `apify dataset get [--limit] [--offset] [--format=(json|csv|xml|...)] [dataset-id]`
- `apify dataset push [--dataset-id=[id]] value`
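The `apify <resource> <action>` shape of the list above maps naturally onto nested subcommands. Here is a rough sketch of a small slice of it using Python's `argparse`, purely to illustrate the hierarchy. The actual Apify CLI is not written this way, and details such as `-q` meaning "print IDs only" or `abort` accepting several run IDs are assumptions made for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for a small, illustrative subset of the proposed command tree."""
    parser = argparse.ArgumentParser(prog="apify")
    resources = parser.add_subparsers(dest="resource", required=True)

    # `apify actor ...` namespace
    actor = resources.add_parser("actor").add_subparsers(dest="action", required=True)
    actor_run = actor.add_parser("run")
    actor_run.add_argument("--remote", action="store_true")
    actor_run.add_argument("--input", default=None)
    actor_run.add_argument("actor_id", nargs="?", default=None)
    actor.add_parser("ls")
    actor.add_parser("rm").add_argument("actor_id")

    # `apify run ...` namespace (Run instances on the platform)
    runs = resources.add_parser("run").add_subparsers(dest="action", required=True)
    run_ls = runs.add_parser("ls")
    run_ls.add_argument("--active", action="store_true")
    run_ls.add_argument("-q", "--quiet", action="store_true")  # assumed: IDs only
    # Assumed: several IDs allowed, so `abort $(apify run ls --active -q)` can work.
    runs.add_parser("abort").add_argument("run_ids", nargs="+")

    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["actor", "run", "--remote", "--input=INPUT.JSON", "my-actor"])
print(args.resource, args.action, args.remote, args.input, args.actor_id)
```

Each resource gets its own subparser group, so adding `task`, `kvs`, or `dataset` would follow the same pattern without touching the existing namespaces.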
IMO this would send the CLI usability through the roof, inviting actual power users to use us through the command line. Also, most of the commands would just be straight API calls (and none of them clash with the current ones, so no breaking changes, only deprecating the old commands).
CC @B4nan @jancurn @vladfrangu what do you think?