Open lread opened 3 months ago
I think having the important settings/overrides going to stdout along with the findings is fine for -o stdout
and -o stdout-simple
-- that's actually what we rely on at work.
For the other output formats, that's more problematic (and has in fact been broken for a long time, since I added a bunch of println
output that no one has complained about!).
I don't know the right choice here but option 3 will break our flow at work.
Maybe everything that is println
'd today should be log/info
'd if the output format is not stdout(-simple) then that wouldn't break my flow and you'd fix the other formats which were broken in v5?
So the existing printlns are important for you to redirect (or tee) for your use case. Are they all important? Here's the current println output I see after a local run on my dev box:
Read 119 dependency-check properties.
Merging additional properties:
# clj-watson.properties file
nvd.api.key=31a0****-****-****-****-********5002
The idea I snuck in at the end:
I suppose we could include a summary of settings as part of the findings report.
...was hinting that any important printlns could be moved (or also included if we also want to log them) in the report.
Maybe add a :settings-used
and/or :result-summary
.
And.. this new data would be spit out with the stdout
and stdout-simple
free format reports.
But... current structured edn
and json
reports are not extensible in this way, they immediately spit out a list/vector of findings without any parent key. So doing so would mean a breaking change.
And maybe this idea does not fit with sarif
(which I am not familiar with yet).
Right, that's why I just suggested toggling it from println
for free format to logging (info) for structured formats. A pragmatic fix for the structured output that doesn't break any existing stdout-based usage. And sets a precedent for any future println
output, such as displaying the number of findings etc (which also can't go in the structured formats).
I have no idea what people might be using those edn
, json
, or sarif
outputs for -- although no one has complained about them being broken by me adding that println
stuff. Maybe no one uses them?
I think if we introduce an output file option, there would be a good argument for leaving the println
stuff as-is in all formats and writing only the findings report to the file.
I was thinking that since the println outputs are important to you, they might be important to others, and if they are not moved to the findings report, they could not be easily saved when using structured formats.
But also, yes, I think your idea sounds like a reasonable compromise. And like you say, nobody has complained, so we might be putting effort into an unused feature.
We don't know if folks are using structure reports (we are guessing no because of the extra printlns), but we can assume that nobody was using JSON because it was not being output as JSON #117.
I agree and that strongly suggests no one is currently relying on JSON, EDN, or Sarif output. Which makes me inclined to punt on this until someone asks for it.
But my sense of the right thing to do for this is:
--output-file
option--output-file
--output-file
is provided, send the core report output to that file -- i.e., do not send the cosmetic println
stuff you and I have added in 5.x or 6.x to that file: those will continue to go to regular stdout
That would provide a clear separation of logging (stderr
), the tool's own chatty, informational stuff (stdout
), and the actual vulnerability report (file).
Thoughts?
The awkwardness of this nags at me a bit, so I am tempted to resolve it, as best we can, now. Lemme try to build on your idea.
How about this?
--output-file
is specified and does not include a recognized --output
format as its filename extension, fail fast. Otherwise, use the filename extension as the output format written for the output file.--output-file
and --output
format are specified, output to file in --output
format.--ouput-format
is specified report is sent to stdout (structured or not, we don't care).-f
is already taken, so we can't use it as the short form for --output-file
.
One more alternative:
--output
to also optionally recognize a filename. This doesn't let you output a report to a file with an arbitrary extension, but it also doesn't add an extra cmd line option and doesn't have us scratching our heads as to what its short form might be.
stdout
and stdout-simple
.
I could argue these were never good names. They could be deprecated (no longer documented but recognized) and replaced with txt
and txt-brief
(or something you prefer, but these seem ok).I like idea refinement 2 with unstructured format renames.
Just to check, since --output
defaults to stdout
, you're saying if --output-file
has an extension that isn't .json
, .edn
, .sarif
, or ... .txt
, then omitting --output
should be an error?
--output-file path/to/file.json
=> --output json
implied--output-file path/to/file.edn
=> --output edn
implied--output-file path/to/file.sarif
=> --output sarif
implied (is this even a valid file extension? what file format are Sarif reports normally?)--output-file path/to/file.txt
=> --output stdout
impliedI'd be happy with -O
for --output-file
as it sort of compliments/overrides -o
/ --output
maybe?
.txt-brief
is not a "valid" file extension so I don't think we should use it.
My 2nd refinement idea suggested no new --output-file
and enhancing --output
.
-o edn
- outputs edn to stdout (same as today)-o path/to/file.edn
- outputs edn to path/to/file.edn
-o txt
- outputs current stdout
format to stdout-o path/to/file.txt
- outputs current stdout
format to path/to/file.txt
-o blarg
- fails fast - unrecognized format-o path/to/file.blarg
- fails fast - unrecognized formatBut, happy to have holes poked in this. Is it too magic or non-obvious?
It's a bit tricky to work around this awkwardness we've inherited.
Maybe your idea of -O
, --output-file
is clearer (with inferring format from extension when o
, --output
is not specified).
Is .txt-brief
problematic in that it would not be recognized in IDEs and other software?
What about .brief.txt
?
If we go for separate --ouput
and --output-file
options, since --output
is so badly named, should we consider a rename to --format
? (While still keeping short form -o
because -f
is taken, and still recognizing --output
for compatibility).
Does .sarif
make sense as a file extension? SARIF is a specific JSON format (I finally went and looked it up), so the natural file extension would be .json
.
Given stdout-simple
and sarif
are both variant formats of other reports (stdout
and json
respectively), I don't see how we can have a single output option that covers everything, without some weird contortions.
So, I still prefer -o
for the output format and -O
/ --output-file
for the destination file path. We could make the long form --output-format
(and continue to support, but deprecate, --output
).
It seems .sarif
is the preferred filename extension:
https://docs.oasis-open.org/sarif/sarif/v2.1.0/errata01/os/sarif-v2.1.0-errata01-os-complete.html#_Toc141790671 but .json
is an ok 2nd choice.
I think your idea of renaming --output
to --output-format
is a good compromise.
I also agree that not inferring by filename extension is the safer choice. It is a tad more typing for the user, but the options are not intermingled.
A user might be surprised/annoyed that when they've specified --output-file foo.edn
that the default stdout
format is used, but we don't have to solve all problems in this one issue.
Inspired by DependencyCheck (and nvd-clojure, which writes DependencyCheck reports), always write all reports.
So a run would always write:
target/clj-watson-findings.sarif
target/clj-watson-findings.json
target/clj-watson-findings.edn
target/clj-watson-findings.txt
target/clj-watson-findings.brief.txt
Control over the output directory would be nice, but report filenames would be fixed.
We could then deprecate --output
entirely but would likely introduce --output-dir
.
Or... not deprecate --output
and leave it as is to avoid breaking workflows for folks.
Currently
Findings and other clj-watson messages are written to stdout. (And interesting logging from deps is written to stderr).
This seems fine for unformatted output types
stdout
andstdout-simple
.But...
Format types
json
,edn
andsarif
probably want to go to a file or be piped.Maybe that's why clj-watson was suppressing all log output before?
Options
Option 1: Do nothing
Maybe we are cold, maybe we don't care.
Option 2: Provide an argument to output findings to a file
A bit of a sad thing is that
-o --output
is being used for output type. This option is typically used to specify and output file.Option 3: Only write vulnerability findings to stdout
Write everything else to stderr.
Proposal
I guess option 3 as a minimum (but could also do option 2). But maybe you want to write important settings (see #79) and findings to a file?
I suppose we could include a summary of settings as part of the findings report.