OTRF / Security-Datasets

Re-play Security Events
MIT License

Adding script to generate json indexes for remote use. #59

Open ianhelle opened 2 years ago

ianhelle commented 2 years ago

Contains a script (scripts/misc/create_json_index.py) that creates a consolidated index from the YAML metadata. This lets users pull the metadata from the repo in a single request. By default, the script creates an uncompressed JSON file plus zipped and gzipped versions. The indexes are created in ./data/.index. Also adding initial index files to ./data/.index.
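As a rough illustration of the three output formats (not the actual code in scripts/misc/create_json_index.py), here is a minimal sketch using only the standard library. It assumes the YAML metadata has already been parsed into a list of dicts, for example with PyYAML's yaml.safe_load. The file name dsets-index.json is taken from the grep pattern in the workflow sketch in this description; the .zip and .gz suffixes and both function names are hypothetical.

```python
import gzip
import json
import zipfile
from pathlib import Path

# Assumed base name; the real script may name its outputs differently.
INDEX_NAME = "dsets-index.json"


def write_indexes(records, out_dir):
    """Write one consolidated index as plain JSON, zipped and gzipped files."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Serialize once so all three files carry identical content.
    payload = json.dumps(records, indent=2, sort_keys=True).encode("utf-8")

    json_path = out / INDEX_NAME
    json_path.write_bytes(payload)

    zip_path = out / (INDEX_NAME + ".zip")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(INDEX_NAME, payload)

    gz_path = out / (INDEX_NAME + ".gz")
    with gzip.open(gz_path, "wb") as handle:
        handle.write(payload)

    return [json_path, zip_path, gz_path]


def read_gzipped_index(raw_bytes):
    """Decode the bytes of a downloaded gzipped index back into Python objects."""
    return json.loads(gzip.decompress(raw_bytes).decode("utf-8"))
```

A client would fetch one of these files in a single HTTP request and pass the response body to something like read_gzipped_index, instead of requesting each YAML file separately.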

I'm thinking that we could add a GitHub Action to build new index files, triggered by future PRs. This could auto-create a PR, but we'd likely need to add one or two custom actions, e.g. https://github.com/marketplace/actions/create-pull-request

Something like this (but this would not work with forks, since the workflow would not have permission to push to the remote):

on:
  pull_request:
    branches: [master]

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.10"]
    env:
      OUT_PATH: "./datasets/.index"
      IN_PATH: "./datasets"
    steps:
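      # Check out the PR branch (not the detached merge commit) so the push below has a branch to update
      - uses: actions/checkout@v3
        with:
          ref: ${{ github.head_ref }}
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      - name: Build indexes
        run: python -m scripts.misc.create_json_index --input-path ${{ env.IN_PATH }} --output-path ${{ env.OUT_PATH }} --formats all
      - name: Check if there are changes
        id: changes
        uses: UnicornGlobal/has-changes-action@v1.0.11
      - name: Add output files to current PR
        run: |
          # "|| true" keeps the step from failing when grep finds no changed index file
          index_updated=$( git status --short --untracked-files=no | grep "dsets-index\.json$" || true )
          if [ -n "$index_updated" ]
          then
            git config user.name Auto-update-index
            git config user.email "<>"
            git add ${{ env.OUT_PATH }}/*
            git commit -m "Security datasets auto-updated index files."
            git push
          fi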
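
The change check in the last step (git status piped through grep) could also be written as a small Python helper, which is easier to test outside a workflow run. This is only a sketch: the function name changed_index_files is hypothetical, and it assumes the short status format of two status characters, a space, then the path.

```python
import re


def changed_index_files(status_output, pattern=r"dsets-index\.json$"):
    """Return paths from `git status --short` output whose name matches pattern."""
    matcher = re.compile(pattern)
    changed = []
    for line in status_output.splitlines():
        # Short-format lines look like "XY <path>": drop the two status
        # characters and the separating space to get the path.
        path = line[3:].strip()
        if path and matcher.search(path):
            changed.append(path)
    return changed
```

Note that the default pattern, like the grep expression in the workflow, only matches a path ending in dsets-index.json, so a change that touches only a compressed variant with a different suffix would not trigger the commit.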