tfeldmann / organize

The file management automation tool.
http://organize.readthedocs.io
MIT License
2.23k stars, 128 forks

JSON output or verbose mode #228

Closed carpii closed 1 year ago

carpii commented 2 years ago

Is your feature request related to a problem? Please describe.

The output of organize is quite 'noisy' and also pretty printed.

This is great if you are just running a couple of small tasks, but quickly becomes difficult to read when you are running a full set of tasks and are only interested in the summary of tasks which actually did something.

Describe the solution you'd like

One option would be to hide any tasks (and their subsequent directory lines) unless those tasks actually ran and produced output. Maybe there could be a --quiet switch to control this.

Describe alternatives you've considered

Another idea (and my preferred one) would be to have an option to output everything as structured JSON. It would still pretty print to stdout by default; maybe that could just be a default outputter that consumes the JSON.

This way everyone could filter and present the output any way they want (i.e. filtering out tasks with no output, conditionally showing output from shell commands or echos, etc).

Not sure how much work this would involve, but admittedly it's not a trivial change.

I'd pledge a $300 bounty for the JSON idea to be implemented, either by the author or by anyone else (but only if the author agreed to the change).

tfeldmann commented 2 years ago

Sounds like a good idea and would be perfect for integration tests! I'll see what I can do. Do you have a JSON structure in mind?

carpii commented 2 years ago

Hiya, thanks for being receptive to the idea

I don't have a firm idea of the JSON structure in mind, but my initial thoughts are kinda similar to the indenting on the console output

I'm not familiar enough with all the features to know if this could cover everything, but perhaps something like this?

{
  "tasks": [{
      "name": "Copy PDF somewhere",
      "files": [{
        "filename": "doc.pdf",
        "location": "/home/user/Downloads",
        "actions": [{
            "type": "echo",
            "output": "testing"
          },
          {
            "type": "shell",
            "command": "~/bin/organize-pdf-ocr \"resolved_path_from_{{path}}\"",
            "exitcode": 0
          }
        ]
      }]
    },
    {
      "name": "Process OFX files",
      "files": [{
        etc
        etc
      }]
    }
  ]
}

Just a few ways I'd plan to use this:

Use the JSON to produce and format the output (filtering some of the echo commands out, but still having them present for troubleshooting if I run org in shell mode)

Suppress output of individual tasks, unless I detect a shell command had an error in which case output all of it

Run from cron or Node-RED, suppress output unless there was an error, in which case capture it (or use the output to display a desktop notification etc)
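As a rough illustration of the second use case, here is a minimal Python sketch that keeps only the tasks in which a shell action failed. It assumes the nested structure proposed above; the second task's file name, command, and exit code are made up for the example, and none of this reflects organize's actual output.

```python
import json

# Hypothetical report in the nested shape proposed above.
# The OFX entry (file name, command, exit code) is invented for illustration.
REPORT = """
{"tasks": [
  {"name": "Copy PDF somewhere", "files": [
    {"filename": "doc.pdf", "location": "/home/user/Downloads", "actions": [
      {"type": "echo", "output": "testing"},
      {"type": "shell", "command": "ocr doc.pdf", "exitcode": 0}
    ]}
  ]},
  {"name": "Process OFX files", "files": [
    {"filename": "bank.ofx", "location": "/home/user/Downloads", "actions": [
      {"type": "shell", "command": "import-ofx bank.ofx", "exitcode": 2}
    ]}
  ]}
]}
"""


def failed_tasks(report):
    """Return the tasks in which at least one shell action exited non-zero."""
    failed = []
    for task in report["tasks"]:
        # Flatten the actions of every file handled by this task.
        actions = [a for f in task["files"] for a in f["actions"]]
        if any(a["type"] == "shell" and a.get("exitcode", 0) != 0 for a in actions):
            failed.append(task)
    return failed


failed = failed_tasks(json.loads(REPORT))
for task in failed:
    # Print the whole task, including its echo output, for troubleshooting.
    print(json.dumps(task, indent=2))
```

A cron wrapper could stay silent when `failed` is empty and only print or notify otherwise.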

Thanks

tfeldmann commented 2 years ago

I thought about this and one problem I see is that the JSON output only becomes valid after the run is completed. I think it would be nice to still see intermediate output. This can be done by streaming a single valid JSON object per line (JSONL format).

For example:

{"type": "START", "version": "2.3.0", "config": "~/config.yaml", "working_dir": "."}
{"type": "ACTION", "rule": {"nr": 1, "name": "Copy PDF somewhere"}, "location": "/home/user/Downloads", "file": "doc.pdf", "action": "echo", "args": {"msg": "testing"}}
{"type": "ACTION", "rule": {"nr": 1, "name": "Copy PDF somewhere"}, "location": "/home/user/Downloads", "file": "doc.pdf", "action": "shell", "args": {"command": "~/bin/organize-pdf-ocr ...", "exitcode": 0, "output": "Complete."}}
{"type": "SUMMARY", "failed": 0, "success": 2, "total": 2}
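To show why one object per line helps with intermediate output, here is a small Python sketch that handles each event as soon as its line is read, without waiting for the run to finish. It assumes the event shape from the example lines above (which is only a proposal in this thread); `handle_events` is a hypothetical helper name.

```python
import io
import json

# The example lines from above, with the shell command elided as in the original.
SAMPLE = io.StringIO(
    '{"type": "START", "version": "2.3.0", "config": "~/config.yaml", "working_dir": "."}\n'
    '{"type": "ACTION", "rule": {"nr": 1, "name": "Copy PDF somewhere"}, '
    '"location": "/home/user/Downloads", "file": "doc.pdf", '
    '"action": "echo", "args": {"msg": "testing"}}\n'
    '{"type": "ACTION", "rule": {"nr": 1, "name": "Copy PDF somewhere"}, '
    '"location": "/home/user/Downloads", "file": "doc.pdf", "action": "shell", '
    '"args": {"command": "~/bin/organize-pdf-ocr ...", "exitcode": 0, "output": "Complete."}}\n'
    '{"type": "SUMMARY", "failed": 0, "success": 2, "total": 2}\n'
)


def handle_events(lines):
    """Parse one JSON object per line and collect actions with a non-zero exit code."""
    failures = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)  # each line is a complete, valid JSON document
        if event["type"] == "ACTION":
            if event["args"].get("exitcode", 0) != 0:
                failures.append(event)
        elif event["type"] == "SUMMARY":
            print(f"{event['success']}/{event['total']} succeeded, {event['failed']} failed")
    return failures


failures = handle_events(SAMPLE)
```

The same function works unchanged on `sys.stdin`, so it could sit at the end of a shell pipe.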

This has some advantages:

Disadvantages:

Thoughts?

carpii commented 2 years ago

I could work with this format. I hadn't considered the buffering problem to be honest.

Would nr be a unique rule id? I normally parse JSON by piping into the jq tool, but with a unique ID I think I can use jq --slurp to read JSONL

tfeldmann commented 2 years ago

I just tried it. jq works nicely with JSONL. If you want a single big JSON object after execution, --slurp works great.

I found some more info here: https://zxvf.org/post/jq-as-grep/

EDIT: Yes, nr would be the rule number (two rules can have the same name)
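For anyone not using jq, a rough Python equivalent of slurping JSONL and grouping by rule number might look like the sketch below. The sample events are invented, and they reuse the proposed `rule.nr` / `rule.name` fields from this thread to show why grouping by `nr` is safer than grouping by name.

```python
import json
from collections import defaultdict

# Invented events: two different rules that share the same name.
JSONL = """\
{"type": "ACTION", "rule": {"nr": 1, "name": "Sort documents"}, "action": "echo"}
{"type": "ACTION", "rule": {"nr": 2, "name": "Sort documents"}, "action": "echo"}
{"type": "ACTION", "rule": {"nr": 1, "name": "Sort documents"}, "action": "shell"}
{"type": "SUMMARY", "failed": 0, "success": 3, "total": 3}
"""


def slurp(text):
    """Read all JSONL lines into one list, similar in spirit to jq --slurp."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def actions_by_rule(events):
    """Group action names by rule number, since rule names are not unique."""
    grouped = defaultdict(list)
    for event in events:
        if event.get("type") == "ACTION":
            grouped[event["rule"]["nr"]].append(event["action"])
    return dict(grouped)


events = slurp(JSONL)
grouped = actions_by_rule(events)
print(grouped)
```

Grouping by `name` instead would merge both rules into a single bucket here.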

carpii commented 1 year ago

Hi, is this feature still being considered, or unlikely to be developed?

I'll close the ticket if not, and look into other solutions (maybe a fork or porting it to golang)

Thanks

tfeldmann commented 8 months ago

JSON output is now released with organize v3: organize run --format=JSONL
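A minimal sketch of consuming that output from a Python script, reading events while the process is still running. The exact fields organize v3 emits are not shown in this thread, so the code only parses each line as JSON; `stream_events` is a hypothetical helper, and a stand-in command replaces organize so the example runs anywhere.

```python
import json
import subprocess
import sys


def stream_events(cmd):
    """Run cmd and yield one parsed JSON object per non-empty line of its stdout."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            if line.strip():
                yield json.loads(line)


# Stand-in command so the sketch runs without organize installed.
# With organize v3 the command would be: ["organize", "run", "--format=JSONL"]
# The printed line follows the shape proposed earlier in this thread (an assumption).
FAKE = [sys.executable, "-c", "print('{\"type\": \"SUMMARY\", \"failed\": 0}')"]

events = list(stream_events(FAKE))
print(events)
```

Because `stream_events` is a generator, a caller can react to each event immediately instead of collecting them into a list first.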

carpii commented 8 months ago

Awesome, thanks for continuing to work on this. I had written it off, thinking it was maybe unfeasible

I'll give it a good testing late next week, and will still be happy to honor my bounty if it helps solve the problems I was having

Thanks

tfeldmann commented 8 months ago

You might like the latest version on the main branch :) There is a new command line option --format=errorsonly which does exactly what you described.

carpii commented 8 months ago

This all seems to be working very well. Thanks so much for updating organize

I'll most likely start consuming JSONL and just parse errors out of that, but the errorsonly format is still pretty handy

I'll send my PayPal bounty shortly, cheers :)