pydiverse / pydiverse.pipedag

A data pipeline orchestration library for rapid iterative development with automatic cache invalidation, allowing users to focus on writing their tasks in pandas, polars, sqlalchemy, ibis, and the like.
https://pydiversepipedag.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

pipedag-manage command to list all instances #105

Open windiana42 opened 1 year ago

windiana42 commented 1 year ago

It might be nice to have a command that lists all instances, so the other commands can be automated in a for loop over a set of instances selected by either a pattern-matched include list or a pattern-matched exclude list. It would also be nice to have an option that returns the instance_id/per-user combinations for which a metadata schema exists in the database. The per-user aspect does not need to be part of the output as tuples, since tuples are harder (though not impossible) to process with a pipe. It is also fine to retrieve the list twice: once for per-user instances and once for the rest.

Alternatively, or additionally, we could add pattern-matched include and exclude options to the other commands and have them print the set of matched instance_id/per_user combinations before asking whether to continue.
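
To make the include/exclude idea concrete, here is a minimal sketch of the filtering step using shell-style patterns from Python's standard `fnmatch` module. The function name `select_instances`, its parameters, and the example instance ids are hypothetical and not part of pipedag-manage; how the list of instance ids is obtained from pipedag.yaml is left out.

```python
from fnmatch import fnmatchcase


def select_instances(instance_ids, include=("*",), exclude=()):
    """Keep ids that match any include pattern and no exclude pattern.

    Hypothetical helper for illustration; not an existing pipedag function.
    """
    selected = []
    for instance_id in instance_ids:
        included = any(fnmatchcase(instance_id, p) for p in include)
        excluded = any(fnmatchcase(instance_id, p) for p in exclude)
        if included and not excluded:
            selected.append(instance_id)
    return selected


if __name__ == "__main__":
    # Example instance ids are made up; one id per line keeps the
    # output easy to consume in a shell pipe or for loop.
    ids = ["full", "midi", "mini", "mini_test"]
    for instance_id in select_instances(ids, include=["mi*"], exclude=["*_test"]):
        print(instance_id)
```

Printing one instance id per line, rather than tuples, is what keeps the result pipe-friendly.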

windiana42 commented 1 year ago

The current confirmation message of clear-metadata is something like "Do you really want to clear all metadata?". It should also state the instance_id and username, and ideally the resulting schema name.
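
A rough sketch of what a more explicit prompt could look like. Everything here is an assumption for illustration: `build_confirm_message` and `confirm` are hypothetical helpers, the schema name in the example is made up, and the wording is only a suggestion, not what clear-metadata prints today.

```python
def build_confirm_message(instance_id, username, schema_name):
    """Build a prompt that names exactly what would be cleared.

    Hypothetical helper; username may be None for non-per-user instances.
    """
    user_part = f"user {username!r}" if username else "not per-user"
    return (
        f"Do you really want to clear all metadata for instance "
        f"{instance_id!r} ({user_part}, schema {schema_name!r})? [y/N] "
    )


def confirm(message, answer_reader=input):
    """Return True only for an explicit 'y' or 'yes' answer."""
    return answer_reader(message).strip().lower() in ("y", "yes")


if __name__ == "__main__":
    # The schema name below is a made-up example value.
    msg = build_confirm_message("mini", "alice", "pipedag_metadata_mini_alice")
    if confirm(msg):
        print("clearing metadata ...")
    else:
        print("aborted")
```

Defaulting to "no" on anything but an explicit yes seems the safer choice for a destructive command.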

windiana42 commented 1 year ago

It might also be nice to show the metadata version of every instance listed in pipedag.yaml. This could be combined with an option to delete all metadata that is on an outdated version.
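
As a sketch of the selection logic only, assuming the versions have already been read from the database into a mapping: `find_outdated` is a hypothetical helper, the version strings are invented, and "outdated" is simplified here to "differs from the current version", which sidesteps any assumption about how pipedag versions its metadata.

```python
def find_outdated(instance_versions, current_version):
    """Return the sorted instance ids whose metadata version is not current.

    instance_versions maps instance id -> version string, or None when no
    metadata schema exists for that instance (hypothetical input format).
    """
    return sorted(
        instance_id
        for instance_id, version in instance_versions.items()
        if version is not None and version != current_version
    )


if __name__ == "__main__":
    # Made-up example data for illustration.
    versions = {"full": "0.3.0", "mini": "0.2.0", "midi": None}
    for instance_id, version in sorted(versions.items()):
        print(f"{instance_id}\t{version or '-'}")
    print("outdated:", ", ".join(find_outdated(versions, "0.3.0")))
```

Instances without a metadata schema are skipped, since there is nothing to delete for them.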