matanolabs / matano

Open source security data lake for threat hunting, detection & response, and cybersecurity analytics at petabyte scale on AWS
https://matano.dev
Apache License 2.0

enhancement: CSV file ingest support #38

Closed kai-ten closed 1 year ago

kai-ten commented 1 year ago

The real deal this time, tested with example enrichment data and works with the following sample config:

log_source.yml

name: "employee"

schema:
  fields:
    - name: "name"
      type: "string"
    - name: "emp_email"
      type: "string"
    - name: "department"
      type: "string"

transform: |
  .name = del(.json.name)
  .emp_email = del(.json.emp_email)
  .department = del(.json.department)

And sample CSV data:

name,emp_email,department
Jane Doe,jane@company.com,Executive
Joe Smith,joe@company.com,Finance

Converting the CSV to NDJSON/JSONL also works as expected.
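
For anyone who wants to try the NDJSON route, here is a minimal sketch of that conversion in Python, using only the standard library. It is not part of this PR: the function name `csv_to_ndjson` is hypothetical, and it assumes the CSV has a header row and that every value should stay a string.

```python
import csv
import io
import json


def csv_to_ndjson(csv_text: str) -> str:
    """Convert CSV text with a header row into NDJSON (one JSON object per line).

    Hypothetical helper for illustration only; all values are kept as strings.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return "\n".join(json.dumps(row) for row in reader)


# Same sample data as above.
sample = """name,emp_email,department
Jane Doe,jane@company.com,Executive
Joe Smith,joe@company.com,Finance
"""

ndjson = csv_to_ndjson(sample)
print(ndjson)
```

Each output line is a standalone JSON object keyed by the CSV header names, so the field names line up with the schema in log_source.yml.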

Any other thoughts / recommendations?

I'm considering adding this to the docs, or adding a CSV example with sample data to the /examples section of the code, but I want to be sure it goes somewhere you would agree with. Just let me know.

Samrose-Ahmed commented 1 year ago

LGTM! Awesome contribution.

Will merge.

We can definitely update the docs. We moved them to a separate repo at https://github.com/matanolabs/matano-website.