openraven / magpie

A Cloud Security Posture Manager (CSPM) focused on security analysis for the modern cloud stack and on emerging threats such as cloud ransomware and supply chain attacks.
Apache License 2.0

JSON and CSV Output #201

Closed kickroot closed 3 years ago

kickroot commented 3 years ago

We need more than just a text report output in magpie-policy. CSV and JSON outputs are requested. This will be broken down into the following steps:

1) io.openraven.magpie.core.cspm.services.ReportService should be moved into the magpie-api project and renamed PolicyOutputPlugin.
2) The existing ReportServiceImpl should be renamed TextPolicyOutputPlugin.
3) New plugins for CSV and JSON should be created.
4) Implement a ServiceLoader as we have done for magpie-discovery, allowing jar file implementations to be dropped in the classpath and discovered at runtime.
5) Build configuration loading as done with the discovery plugins so users can enable/disable/configure policy output plugins.
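
As a rough illustration of steps 1, 2 and 4, here is a minimal sketch of what the plugin interface and its runtime discovery might look like. Only the names PolicyOutputPlugin and TextPolicyOutputPlugin come from the list above. The method names (id, render), the "text" identifier, and the use of plain strings in place of the real scan results are assumptions made to keep the example self-contained. They are not the actual magpie-api signatures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class PolicyOutputSketch {

  /** Hypothetical shape of the interface that would live in magpie-api. */
  public interface PolicyOutputPlugin {
    /** Identifier used to enable/disable the plugin in configuration (assumed). */
    String id();

    /** Renders violations; plain strings stand in for the real result type. */
    String render(List<String> violations);
  }

  /** Stand-in for the renamed ReportServiceImpl: one violation per line. */
  public static class TextPolicyOutputPlugin implements PolicyOutputPlugin {
    @Override
    public String id() {
      return "text"; // hypothetical identifier
    }

    @Override
    public String render(List<String> violations) {
      return String.join("\n", violations);
    }
  }

  /**
   * Discovers implementations via java.util.ServiceLoader. Providers are
   * found through META-INF/services entries in jars on the classpath, so
   * this returns an empty list when no provider file is present.
   */
  public static List<PolicyOutputPlugin> discover() {
    List<PolicyOutputPlugin> found = new ArrayList<>();
    for (PolicyOutputPlugin plugin : ServiceLoader.load(PolicyOutputPlugin.class)) {
      found.add(plugin);
    }
    return found;
  }

  public static void main(String[] args) {
    System.out.println("Discovered plugins: " + discover().size());
    PolicyOutputPlugin text = new TextPolicyOutputPlugin();
    System.out.println(text.render(List.of("violation A", "violation B")));
  }
}
```

For a jar to be picked up, it would need a provider-configuration file under META-INF/services named after the interface's binary name and listing the implementing class.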

belosh59 commented 3 years ago

With ScanResults we would also need to move PolicyContext, Violation, and IgnoredReason into magpie-api, which would change the abstract nature of the api package itself.

public class ScanResults {
  private List<PolicyContext> policies = List.of();
  private Map<PolicyContext, List<Violation>> violations = Map.of();
  private Map<PolicyContext, Map<Rule, IgnoredReason>> ignoredRules = Map.of();
  private int numOfViolations;
  // ...
}

To challenge the design proposed in the description: looking at how magpie-api is used, we could conclude that a layered design with plugins is applicable only to magpie-discovery. On the other hand magpie-policy

From that point of view there are 2 outcomes:

policies:
  root: ~/.magpie/policies
  repositories:
    - https://github.com/openraven/aws-cis-1.2.git
  analysis.output:
    stdout.enabled: true
    csv:
      source: ~/.magpie/analysis.csv
      enabled: true
    json:
      source: ~/.magpie/analysis.json
      enabled: false

belosh59 commented 3 years ago

Should DMAP also follow that kind of configured output approach?

belosh59 commented 3 years ago

JSON output example: out.txt

belosh59 commented 3 years ago

CSV output result:
Parsed CSV: https://docs.google.com/spreadsheets/d/1Abwtr-I9FbNXDfZWVzeaiXJUkFXJocWfO0Ue9R2O3DY/edit?usp=sharing
Plain CSV: https://drive.google.com/file/d/1hrw4ZCHSIo8E4XwaWECqXAtSt62WQGs0/view?usp=sharing

kickroot commented 3 years ago

Good work, but I think I should have added a few requirements to the CSV output. The above examples aren't easily parsable by CSV readers due to the various header sections. We need a strict column/row format that can be read by other tools.

Think something more like a database table, even if that means duplication of values across the various rows.
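
To make the "database table" idea concrete, here is a minimal sketch of a strict header-plus-rows writer. The column names and sample values are purely illustrative, not the actual magpie output. Quoting follows the usual RFC 4180 convention (fields containing commas, quotes, or line breaks are wrapped in double quotes, with embedded quotes doubled). Policy-level values are simply repeated on every row.

```java
import java.util.List;
import java.util.stream.Collectors;

public class FlatCsvSketch {

  /** Quotes a field only when it contains a comma, a quote, or a line break. */
  public static String escape(String value) {
    String v = value == null ? "" : value;
    if (v.contains(",") || v.contains("\"") || v.contains("\n") || v.contains("\r")) {
      return "\"" + v.replace("\"", "\"\"") + "\"";
    }
    return v;
  }

  /** Emits one header line followed by one line per row, all with the same column count. */
  public static String toCsv(List<String> header, List<List<String>> rows) {
    StringBuilder out = new StringBuilder();
    out.append(joinRow(header));
    for (List<String> row : rows) {
      if (row.size() != header.size()) {
        throw new IllegalArgumentException("Row has " + row.size()
            + " columns, expected " + header.size());
      }
      out.append(joinRow(row));
    }
    return out.toString();
  }

  private static String joinRow(List<String> fields) {
    return fields.stream().map(FlatCsvSketch::escape).collect(Collectors.joining(",")) + "\r\n";
  }

  public static void main(String[] args) {
    // Hypothetical columns and values; the policy name is duplicated per row on purpose.
    List<String> header = List.of("Policy", "Rule", "Resource", "Ignored Reason");
    List<List<String>> rows = List.of(
        List.of("Example policy", "Example rule 1", "resource-1", ""),
        List.of("Example policy", "Example rule 2", "resource-2", "Disabled, by user"));
    System.out.print(toCsv(header, rows));
  }
}
```

Because every line has the same columns and there are no section headers, the result can be loaded directly by spreadsheet tools or standard CSV parsers.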

belosh59 commented 3 years ago

@kickroot Prepared changes to CSV violations output:
Plain CSV: https://drive.google.com/file/d/1LWkyC7F2Crp8XhgVt_N6dcArxAaIZ4xa/view?usp=sharing
Parsed CSV: https://docs.google.com/spreadsheets/d/1Yx4vbrigxw5Yuf5gyIwRyG_hcxmXYm-_34oqcfbKWnA/edit?usp=sharing

So far it looks like this. We could add additional information if required:

Screenshot 2021-08-17 at 17 27 59

By the way, I changed the last column name to: Ignored Reason

kickroot commented 3 years ago

Looks much better! Can we get columns for the ruleId and policyId as well? It'll help with tool automation downstream from this.

belosh59 commented 3 years ago

Definitely