OTRF / Security-Datasets

Re-play Security Events

simple way to upload documents to ES #2

Closed: yugoslavskiy closed this issue 4 years ago

yugoslavskiy commented 5 years ago

Hello guys.

Together with @AverageS, we used a different way to upload the index to our Demo Dashboard. It would be great if you could add it to your docs.

Here it is:

  1. Go to the small_datasets directory:
cd ./small_datasets
  2. Untar all the archives in this directory:
find . -name '*.tar.gz' -exec tar -xzf {} \;
  3. Upload the resulting JSON files with the following Python script:
import json
import os

import elasticsearch

es_url = "http://<es_ip/domain>:<es_port>"
es_user = ""
es_pass = ""
index_name = ""
_doc_type = ""

es = elasticsearch.Elasticsearch([es_url], http_auth=(es_user, es_pass))

# Every dataset file is newline-delimited JSON: one event per line.
for filename in os.listdir():
    if not filename.endswith(".json"):
        continue
    with open(filename) as f:
        events = [json.loads(line) for line in f]
    # Index the events one request at a time.
    for event in events:
        res = es.index(index=index_name, doc_type=_doc_type, body=event)
        print(res["result"])
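
If the per-document requests are too slow on the bigger datasets, a minimal sketch of the same upload through elasticsearch-py's helpers.bulk (assuming the same es, index_name, and _doc_type variables as above) could look like this:

from elasticsearch import helpers

def generate_actions():
    # Stream one bulk action per JSON line instead of one HTTP request per document.
    for filename in os.listdir():
        if not filename.endswith(".json"):
            continue
        with open(filename) as f:
            for line in f:
                yield {
                    "_index": index_name,
                    "_type": _doc_type,
                    "_source": json.loads(line),
                }

ok, _ = helpers.bulk(es, generate_actions())
print("indexed %d documents" % ok)

helpers.bulk batches many documents into each request, trading per-document result output for far fewer round trips to the cluster.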
Cyb3rWard0g commented 5 years ago

Thank you very much for sharing, @yugoslavskiy! I will add it as another way to consume the data. I will add an "other suggestions" section to the wiki and make sure this goes there. Thank you for contributing 👍 Best of luck with your project too, guys! Have a nice weekend!

yugoslavskiy commented 5 years ago

Thank you, and good luck with your project too, guys!

BTW, I have one thing related to MITRE ATT&CK to discuss with you, @Cyb3rWard0g. I couldn't find your email, so I've followed you on LinkedIn instead. We will meet at x33fcon in Gdansk (and later in Brussels at the EU MITRE ATT&CK Workshop), and I just wanted to share some materials with you so we can discuss them offline next week.

So please message me back on LinkedIn when you are available. Thank you again!

See you!

duttad commented 5 years ago

mordor is an awesome project, and I like the idea of having a simple way to load the data into ES (@yugoslavskiy). An improvement I would like to see is sending each log (i.e. each line in the JSON file) to a different index based on the source_name field. I could probably do that by having a config file that maps a source_name to an ES index and making the Python script follow that config file when writing to ES; there is a rough sketch of that idea below.

I was also wondering whether logstash is more suitable for the job. Has anyone reading this issue thread tried using logstash to parse the data and dump it into ES, preferably into different indices per source_name? (Basically I am a logstash newbie looking for a code example :-) In our organization we also use logstash to send these logs to places other than ES, so eventually I will have to use logstash anyway. Thanks for any code snippet!
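
To make the config-file idea concrete, here is a rough sketch of what I have in mind in plain Python; the source_name values and index names in the mapping are made up, and it assumes the es client and _doc_type from the script above:

import json
import os

# Made-up source_name -> index mapping; in practice this would come from a config file.
index_map = {
    "Microsoft-Windows-Sysmon": "mordor-sysmon",
    "Microsoft-Windows-Security-Auditing": "mordor-security",
}
default_index = "mordor-other"

for filename in os.listdir():
    if not filename.endswith(".json"):
        continue
    with open(filename) as f:
        for line in f:
            event = json.loads(line)
            # Route each event to its own index, with a catch-all fallback.
            target = index_map.get(event.get("source_name"), default_index)
            es.index(index=target, doc_type=_doc_type, body=event)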

hxnoyd commented 4 years ago

@duttad I'm using the logstash json codec to load the mordor datasets into ES. The input configuration looks like this:

input {
  file {
    codec => json                  # each line in the dataset files is one JSON event
    sincedb_path => "/dev/null"    # don't persist read offsets, so files re-ingest on every run
    path => "/usr/share/logstash/datasets/credential_access/*.json"
    start_position => "beginning"  # read existing files from the top, not just new lines
    tags => [ "mordor", "small_dataset" ]
  }
}

It's advisable to keep some of the HELK scripts, especially the ones that upload the index templates. If you do, make sure you also use HELK's pipeline.