Add JSON-Format output parameter

thorstenspille commented 3 years ago

Add a parameter to print output in JSON format.

s-m-e commented 3 years ago

Narrowing down the design - based on my demo screenshot:

One complete, self-contained JSON string per line of output, and one line per item (i.e. per transaction), is best, I presume (?).

I can do both: JSON output for "future" transactions, i.e. before a user (optionally) confirms them with a "yes", and JSON output for running transactions.

Is an (optional) explicit confirmation by a user (or another bash script) a relevant feature when using JSON output?
If yes, should the question for the "user" also be provided in some machine-readable form, i.e. JSON?
When the transactions are executed, I can emit one JSON string before each transaction (i.e. what is about to happen) and one after it (i.e. what actually happened, with some status). Does this make sense?
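
To make the before/after idea concrete, here is a minimal sketch, not abgleich's actual implementation. The names `emit` and `run_transaction`, the `phase` field and the `execute` callable are assumptions made up for illustration; only `status` and `error_message` are borrowed from the field names discussed in this thread.

```python
import json
import sys


def emit(record, stream=sys.stdout):
    """Write one complete JSON object on its own line and flush immediately."""
    stream.write(json.dumps(record) + "\n")
    stream.flush()


def run_transaction(command, execute, stream=sys.stdout):
    """Emit a 'planned' line, run the transaction, then emit a 'done' line."""
    # 'phase' is a hypothetical field name, not part of any agreed schema.
    emit({"phase": "planned", "command": command}, stream)
    try:
        execute(command)  # hypothetical callable that performs the transaction
        status, error = "OK", ""
    except Exception as exc:  # report the failure instead of losing the line
        status, error = "FAILED", str(exc)
    emit(
        {"phase": "done", "command": command, "status": status, "error_message": error},
        stream,
    )
    return status == "OK"
```

Flushing after every line means a consumer sees each record as soon as it is written, even if the process is interrupted later.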

thorstenspille commented 3 years ago

Hi @s-m-e, JSON output is primarily relevant for automation and monitoring, so I think the best solution would be to print one JSON dictionary including all relevant information after the execution has finished.

My idea is:

print the configuration, to be able to see default values not defined in config.yaml, or maybe add a separate subcommand config to dump the full configuration
merge query and execution into one dict per transaction
add performance metrics to every transaction and to the summary for the overall process (I added some random values in the example)
execution with --json parameter should be without any need for user interaction
for testing you could add a --dry-run parameter, which only queries the data

Here's an example output for snapshot creation: abgleich snap config.yaml --json

{
    "config": {
        "source": {
            "zpool": "data_ssd",
            "prefix": "ernst",
            "host": "localhost",
            "user": ""
        },
        "target": {
            "zpool": "data",
            "prefix": "BACKUP_E9/ernst",
            "host": "anonymous",
            "user": "backup"
        },
        "include_root": true,
        "keep_snapshots": 2,
        "always_changed": false,
        "written_threshold": 1048576,
        "check_diff": true,
        "suffix": "_backup",
        "digits": 2,
        "ignore": [
            "some/ignored/subset",
            "some/ignored/folder"
        ],
        "ssh": {
            "compression": false,
            "cipher": "aes256-gcm@openssh.com"
        }
    },
    "transactions": [
        {
            "type": "snapshot",
            "zvol_type": "subvol",
            "dataset_subname": "",
            "snapshot_name": "2020071408_backup",
            "bytes_written": 2306867,
            "checked_diff": false,
            "create_snapshot": true,
            "status": "OK",
            "command": "zfs snapshot data_ssd/ernst@2020071408_backup",
            "timestamp_created": 1594713480,
            "query_duration_seconds": 0.203,
            "exec_duration_seconds": 0.020,
            "error_message": ""
        },
        {
            "type": "snapshot",
            "zvol_type": "dataset",
            "dataset_subname": "FIREFOX",
            "snapshot_name": "2020071410_backup",
            "bytes_written": 937984,
            "checked_diff": true,
            "create_snapshot": true,
            "status": "OK",
            "command": "zfs snapshot data_ssd/ernst/FIREFOX@2020071410_backup",
            "timestamp_created": 1594678200,
            "query_duration_seconds": 0.203,
            "exec_duration_seconds": 0.009,
            "error_message": ""
        },
        {
            "type": "snapshot",
            "zvol_type": "dataset",
            "dataset_subname": "PROJEKTE/prj.ZFS",
            "snapshot_name": "2020071407_backup",
            "bytes_written": 4613734,
            "checked_diff": false,
            "create_snapshot": true,
            "status": "OK",
            "command": "zfs snapshot data_ssd/ernst/PROJEKTE/prj.ZFS@2020071407_backup",
            "timestamp_created": 1594678900,
            "query_duration_seconds": 0.203,
            "exec_duration_seconds": 0.036,
            "error_message": ""
        },
        {
            "type": "snapshot",
            "zvol_type": "dataset",
            "dataset_subname": "THUNDERBIRD",
            "snapshot_name": "2020071409_backup",
            "bytes_written": 5242880,
            "checked_diff": false,
            "create_snapshot": true,
            "status": "OK",
            "command": "zfs snapshot data_ssd/ernst/THUNDERBIRD@2020071409_backup",
            "timestamp_created": 1594677600,
            "query_duration_seconds": 0.203,
            "exec_duration_seconds": 0.039,
            "error_message": ""
        }
    ],
    "summary": {
        "num_zvols": 4,
        "snapshots_skipped": 0,
        "snapshots_created": 4,
        "snapshots_failed": [],
        "overall_runtime_seconds": 1.325,
        "overall_queries_seconds": 0.812,
        "overall_creation_seconds": 0.513,
        "success": true
    }
}
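
For the monitoring use case, a consumer of a report shaped like the example could look roughly like this. It is only a sketch under the assumption that the proposed keys (`transactions`, `status`, `dataset_subname`, `error_message`, `summary`, `success`) end up in the output as written above; `check_report` is a hypothetical helper name.

```python
import json


def check_report(text):
    """Return (ok, problems) for a report shaped like the proposed example."""
    report = json.loads(text)
    # Collect one human-readable line per transaction that did not report "OK".
    problems = [
        f"{t.get('dataset_subname') or '<root>'}: {t.get('error_message', '')}"
        for t in report.get("transactions", [])
        if t.get("status") != "OK"
    ]
    # The run counts as successful only if the summary says so and nothing failed.
    ok = bool(report.get("summary", {}).get("success")) and not problems
    return ok, problems
```

A monitoring script could feed the captured output into `check_report` and exit non-zero when `ok` is false.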

s-m-e commented 3 years ago

Ok, then there is just one "drawback" to your idea. Should the command fail or be interrupted while running, one of two scenarios might occur: either the output is not valid JSON because part of it is missing, or there is no output at all because the output is only serialized into JSON after all transactions have completed. Either way, I'd prefer to output multiple independent blocks of JSON, e.g. one for the configuration and one per transaction, separated by new-line characters. The summary could be another JSON block at the end. For easier parsing, I'd suggest that all JSON blocks use a common pattern / schema, indicating what type of message they contain (configuration, transaction or summary). Does this idea make sense?
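
A rough sketch of what such a common pattern could look like, together with a reader that tolerates an interrupted run. The envelope keys `type` and `data`, the type names and both function names are assumptions for illustration, not an agreed schema.

```python
import json

# Hypothetical set of message types, following the three kinds named above.
MESSAGE_TYPES = ("config", "transaction", "summary")


def encode_message(kind, payload):
    """Wrap a payload in a common envelope and serialize it as one line."""
    if kind not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type: {kind}")
    return json.dumps({"type": kind, "data": payload})


def read_messages(lines):
    """Yield (type, data) for every complete line; stop at the first broken one."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            break  # truncated final line, e.g. after a crash or interrupt
        yield message["type"], message["data"]
```

With this layout, everything written before a failure stays parseable, and a missing `summary` message tells the consumer that the run did not finish.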

pleiszenburg / abgleich

Add JSON-Format output parameter #30