rust-secure-code / wg

Coordination repository for the Secure Code Working Group

Status/Report/Analytics Opt-In Automation #43

Open pinkforest opened 2 years ago

Why? - or - The Target Problem(s) Statement

  1. Knowing which transitive dependencies a binary was compiled with in the past is hard for both the crate maintainer and the user, unless one happened to use auditable. That makes it hard to determine after the fact what state those transitive dependencies were in.
  2. Everyone separately spends computing power running various cargo commands, e.g. geiger, whose reports typically do not change if the code and the dependencies stay the same.
  3. Any centralised solution that is supposed to help the ecosystem with visibility (e.g. geiger.rs) requires running everything in one central place.

What? - or - The Solution Proposal

I would essentially like to provide a tool, on an opt-in basis, that feeds the output or report from a given point in time, together with the crate versions involved, into a repository that we can share - similar to e.g. userbenchmarks or geekbench and the like.

This would include people running cargo geiger, benchmarks, tests and so on.

Usage could be like with a pipe - cargo geiger | piping-to-analytics-tool
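As a rough sketch of what the receiving end of that pipe might do, the function below reads everything from an input stream and wraps it, unmodified, in a record. All names here (`RawReport`, `capture_report`, the field names) are hypothetical and not part of any existing tool. A pipe alone does not tell the receiver which command produced the output, so this sketch assumes the command is passed in separately, e.g. as an argument.

```rust
use std::io::{self, Read};

/// One captured report. Hypothetical shape, not a defined schema.
#[derive(Debug, Clone, PartialEq)]
pub struct RawReport {
    /// The command that produced the output, supplied by the caller.
    pub command: String,
    /// Capture time as seconds since the Unix epoch, supplied by the caller.
    pub captured_at_unix: u64,
    /// The full output, kept verbatim so it can be re-parsed later.
    pub raw_output: String,
}

/// Reads the whole input stream and stores it untouched alongside metadata.
pub fn capture_report<R: Read>(
    command: &str,
    captured_at_unix: u64,
    mut input: R,
) -> io::Result<RawReport> {
    let mut raw_output = String::new();
    input.read_to_string(&mut raw_output)?;
    Ok(RawReport {
        command: command.to_string(),
        captured_at_unix,
        raw_output,
    })
}
```

In an actual binary the `input` would be `std::io::stdin().lock()`; taking any `Read` keeps the function easy to exercise with an in-memory byte slice.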

I already have a dynamic parser-analysis-stats tool that takes in dynamic templates, combines them with a repository of those, and generates more usable combined output, with statistics exposed via a web service.

Of course the reports are stored in full-text form so they can be re-parsed as the tooling evolves - whilst keeping the hooks still in the past.

Templates for the report parsing could live in a GitHub repository, where we look up the correct template by the command used.
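To illustrate the lookup-by-command idea, here is a minimal sketch that maps a command line to a template path. The `templates/<tool>-<subcommand>.tmpl` layout and the `template_path` function are assumptions made up for this example; the actual repository layout and template format are not specified here.

```rust
/// Maps a command line such as "cargo geiger --all-features" to a
/// hypothetical template path such as "templates/cargo-geiger.tmpl".
/// Returns None when the command line is empty.
pub fn template_path(command: &str) -> Option<String> {
    let mut words = command.split_whitespace();
    let tool = words.next()?;
    // Use the first word that is neither a flag ("-x", "--x") nor a
    // toolchain selector ("+nightly") as the subcommand.
    let sub = words.find(|w| !w.starts_with('-') && !w.starts_with('+'));
    let key = match sub {
        Some(s) => format!("{}-{}", tool, s),
        None => tool.to_string(),
    };
    Some(format!("templates/{}.tmpl", key))
}
```

Keying on the tool and subcommand only, and ignoring flags, means one template covers all invocations of a command. Whether flags that change the output format need their own templates is left open.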

Benefits

This would enable everyone to track which crate versions were pulled in with cargo build | cargo report and to create statistics around user builds, since cargo tends to use the greatest semver-compatible version allowed by the manifest.

This would help determine the point in history, by build timestamp, at which cargo was still picking up, say, an insecure transitive dependency affected by an advisory. That in turn could help risk assessment: determining when and whether the crate really is picking up insecure transitive dependencies.

It could also provide valuable data for the maintainers of the crate around build / test. This is outside the purview of the security-specific use, but either way it will increase visibility, which is always good I guess?
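The kind of question this data could answer can be sketched as follows: given a set of recorded builds, find the first and last build timestamps at which a given dependency was resolved to a version older than the first patched one. `BuildRecord` and `exposure_window` are hypothetical names, and the version handling is deliberately simplified to plain `(major, minor, patch)` tuples, ignoring pre-release tags and advisories with several patched ranges.

```rust
/// One dependency version observed in one recorded build (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct BuildRecord {
    pub built_at_unix: u64,
    pub dependency: String,
    /// Simplified (major, minor, patch) version.
    pub version: (u64, u64, u64),
}

/// Returns the earliest and latest build timestamps at which `dependency`
/// was resolved to a version below `first_patched`, or None if no recorded
/// build picked up an affected version.
pub fn exposure_window(
    records: &[BuildRecord],
    dependency: &str,
    first_patched: (u64, u64, u64),
) -> Option<(u64, u64)> {
    let mut times = records
        .iter()
        // Tuples compare lexicographically: major, then minor, then patch.
        .filter(|r| r.dependency == dependency && r.version < first_patched)
        .map(|r| r.built_at_unix);
    let first = times.next()?;
    Some(times.fold((first, first), |(lo, hi), t| (lo.min(t), hi.max(t))))
}
```

The window only reflects builds that were actually reported, so on an opt-in data set it is a lower bound on exposure rather than a complete picture.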

Privacy

The tool operates on a least-information basis and all capture is selective. However, the collection may inadvertently capture private information into the analytics store, which we address in the templates to reduce the chance of this happening.
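One example of the kind of scrubbing that could reduce such leaks is masking user names in home-directory paths before a report is stored. The sketch below is only an illustration under that assumption: `redact_home_dirs` is a hypothetical helper, it handles only Unix-style `/home/<user>` paths, and it is not a description of how the templates actually handle private data.

```rust
/// Replaces the user name in every "/home/<user>" path with "<redacted>".
/// Only Unix-style "/home/" prefixes are handled in this sketch.
pub fn redact_home_dirs(text: &str) -> String {
    const PREFIX: &str = "/home/";
    let mut out = String::with_capacity(text.len());
    let mut rest = text;
    while let Some(pos) = rest.find(PREFIX) {
        let after = pos + PREFIX.len();
        // Copy everything up to and including "/home/".
        out.push_str(&rest[..after]);
        let tail = &rest[after..];
        // The user name runs up to the next '/' or whitespace.
        let end = tail
            .find(|c: char| c == '/' || c.is_whitespace())
            .unwrap_or(tail.len());
        out.push_str("<redacted>");
        rest = &tail[end..];
    }
    out.push_str(rest);
    out
}
```

Paths are just one source of private data; host names, environment variables and private registry URLs would need their own handling.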

Any information exposed publicly is selectively calculated from given outputs.

Refs