Up until now (post-concurrency changes), we've been rendering results in a deterministic way by storing all of the file paths in a slice, sorting the slice, and then iterating through a map using the sorted keys. This prevents any output feedback while a scan is actually occurring and is especially apparent when scanning directories with many files.
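For reference, the sorted-key rendering described above looks roughly like the following. This is a minimal sketch, not the project's actual code: `sortedPaths` and the `map[string]string` report type are hypothetical stand-ins for the real report structures.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedPaths returns the keys of reports in lexical order.
// (Hypothetical helper; the real code stores richer report values.)
func sortedPaths(reports map[string]string) []string {
	paths := make([]string, 0, len(reports))
	for p := range reports {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	return paths
}

func main() {
	// Illustrative data only.
	reports := map[string]string{
		"/usr/bin/b": "report b",
		"/usr/bin/a": "report a",
	}
	// Nothing can be printed until every file has been scanned,
	// because the full key set is needed before sorting.
	for _, p := range sortedPaths(reports) {
		fmt.Printf("%s: %s\n", p, reports[p])
	}
}
```

The output order is stable, but the loop cannot start until the whole scan has finished, which is why no feedback appears while scanning.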
This PR re-adds real-time streaming of results. Output order is no longer deterministic (which we probably shouldn't care about in general), so I added explicit sorting to two of our tests to ensure that the result data is what we expect. An unintended improvement of this PR is that --err-first-hit and --err-first-miss now work like they're supposed to.
I verified that --err-first-hit and --err-first-miss work with this implementation:
$ for i in (seq 1 3); go run cmd/mal/mal.go --err-first-hit analyze /usr/bin/; end
🔎 Scanning "/usr/bin/"
👋 "/usr/bin/SafeEjectGPU": matched requested condition
🔎 Scanning "/usr/bin/"
👋 "/usr/bin/SafeEjectGPU": matched requested condition
🔎 Scanning "/usr/bin/"
👋 "/usr/bin/SafeEjectGPU": matched requested condition
$ for i in (seq 1 3); go run cmd/mal/mal.go --err-first-miss analyze /Library/Application\ Support/BTServer/; end
🔎 Scanning "/Library/Application Support/BTServer/"
👋 "/Library/Application Support/BTServer/pincode_defaults.db": matched requested condition
🔎 Scanning "/Library/Application Support/BTServer/"
👋 "/Library/Application Support/BTServer/pincode_defaults.db": matched requested condition
🔎 Scanning "/Library/Application Support/BTServer/"
👋 "/Library/Application Support/BTServer/pincode_defaults.db": matched requested condition
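To illustrate the general idea behind streaming results and stopping at the first match, here is a minimal sketch using a channel and a cancellable context. It is an assumption about the shape of the approach, not the code in this PR: `result`, `scan`, `firstHit`, and the `isHit` callback are all hypothetical names.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// result is a hypothetical stand-in for a per-file report.
type result struct {
	path string
	hit  bool
}

// scan starts one goroutine per path and streams each result on the
// returned channel as soon as it is ready, in no particular order.
func scan(ctx context.Context, paths []string, isHit func(string) bool) <-chan result {
	out := make(chan result)
	var wg sync.WaitGroup
	for _, p := range paths {
		wg.Add(1)
		go func(p string) {
			defer wg.Done()
			select {
			case out <- result{path: p, hit: isHit(p)}:
			case <-ctx.Done(): // consumer stopped early; drop the result
			}
		}(p)
	}
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

// firstHit consumes the stream and returns at the first matching path.
// The deferred cancel releases any goroutines still waiting to send.
func firstHit(paths []string, isHit func(string) bool) (string, bool) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	for r := range scan(ctx, paths, isHit) {
		if r.hit {
			return r.path, true
		}
	}
	return "", false
}

func main() {
	paths := []string{"/tmp/a", "/tmp/b", "/tmp/c"} // illustrative paths
	isHit := func(p string) bool { return p == "/tmp/b" }
	if p, ok := firstHit(paths, isHit); ok {
		fmt.Printf("%q: matched requested condition\n", p)
	}
}
```

Because results arrive as they complete, the consumer can react to the first hit immediately instead of waiting for a sorted list. When more than one path matches, which one is reported first depends on scheduling.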
I also fixed a data race, simplified how file reports are stored, and improved the output of --stats to make more sense. As part of the files map simplification, I added code to the JSON and YAML renderers to convert the sync.Map data to a format that works with the respective Marshal functions.
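The sync.Map conversion mentioned above is needed because sync.Map has no exported fields, so encoding/json has nothing to serialize when given one directly. A minimal sketch of the kind of conversion involved, assuming string keys (`toPlainMap` and `exampleFiles` are hypothetical names, not the functions added in this PR):

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// toPlainMap copies a sync.Map into an ordinary map so that Marshal
// functions can see its contents. Non-string keys are skipped.
func toPlainMap(m *sync.Map) map[string]any {
	out := make(map[string]any)
	m.Range(func(k, v any) bool {
		if key, ok := k.(string); ok {
			out[key] = v
		}
		return true // keep iterating
	})
	return out
}

// exampleFiles builds illustrative data; real values would be report structs.
func exampleFiles() *sync.Map {
	var files sync.Map
	files.Store("/usr/bin/b", "report b")
	files.Store("/usr/bin/a", "report a")
	return &files
}

func main() {
	b, err := json.Marshal(toPlainMap(exampleFiles()))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

A side effect worth noting: encoding/json writes map keys in sorted order, so the JSON output stays stable even though the scan itself is not deterministic.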
Converting this to a draft to refactor the refactor to use standard Golang constructs since we aren't concerned with output determinism. I also found a data race in diff.go which needs to be fixed.
Closes: #489