wagga40 / Zircolite

A standalone SIGMA-based detection tool for EVTX, Auditd and Sysmon for Linux logs

export to ES not reliable #55

Closed lcia-projects closed 1 year ago

lcia-projects commented 1 year ago

I've got a huge amount of event logs to process with Zircolite and put into Elasticsearch. It's an amazing tool, but the Elasticsearch connector/exporter seems to lock up on me more often than not.

This is the command I'm running: python3 zircolite.py --debug --evtx //Windows/system32/winevt --ruleset rules/rules_windows_generic.json --remote http://:9200 --index

There are no errors; it just hangs and stops submitting records. Any suggestions would be appreciated.

(really do appreciate the tool)

lcia-projects commented 1 year ago

It seems as though small event log folders work fine, but with large sets of event logs I lose connection with my ES server and things just lock up.

It also seems it is sending event logs one at a time. Have you thought about or looked into the bulk submission API call/bulk helper?

wagga40 commented 1 year ago

Hello, ES support has always been difficult and a little bit experimental; there are some warnings about that in the docs. Compared to Splunk I had a lot of problems, some of which are related to field mapping/typing. But to be honest I do not use ES a lot (if at all).

You are right, I do not use the bulk submission API; the reason is that the main goal was to stream detections. But I can look into it.
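
For reference, the helpers module in elasticsearch-py has a streaming_bulk function that consumes a generator lazily and submits in batches, so streaming detections and bulk submission are not mutually exclusive. A minimal sketch, not Zircolite's actual code; the host, index name and sample documents are placeholders:

    from elasticsearch import Elasticsearch, helpers

    # Placeholder generator standing in for a stream of Zircolite detections
    detections = ({"title": f"rule {i}", "rule_level": "high"} for i in range(5000))

    es = Elasticsearch(["http://localhost:9200"])  # placeholder host

    def as_actions(docs, index):
        # Wrap each document in an index action for the bulk helpers
        for doc in docs:
            yield {"_index": index, "_source": doc}

    # streaming_bulk submits in chunks while consuming the generator lazily
    for ok, result in helpers.streaming_bulk(es, as_actions(detections, "zircolite"),
                                             chunk_size=500, raise_on_error=False):
        if not ok:
            print("Failed to index:", result)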

lcia-projects commented 1 year ago

Thanks, I'll continue to play with it. ES is difficult, but it's free. I'll see if I can help a little. Thank you for your response.

wagga40 commented 1 year ago

Hello,

Did you try importing manually using the templating option of Zircolite?

  1. python3 zircolite_dev.py -e ./EVTX-ATTACK-SAMPLES/ -r rules/rules_windows_sysmon_full.json --template templates/exportForELK.tmpl --templateOutput elk.json
  2. Import the elk.json file into ELK with the "Upload a file" functionality.

It worked perfectly for me even with pretty big datasets (up to 1 GB; you need to change the max upload size in Kibana advanced settings).
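
If a single export is still over the Kibana upload limit, one workaround is to split the JSONL output into smaller files before uploading. A quick sketch, assuming the elk.json output from step 1; the chunk size and output file names are arbitrary:

    # Split a large JSONL export into ~100k-line chunks for Kibana's
    # "Upload a file" feature. Input/output names are illustrative.
    CHUNK = 100_000

    def flush(part, lines):
        # Write one chunk of lines to its own numbered file
        with open(f"elk_part{part}.json", "w") as dst:
            dst.writelines(lines)

    part, lines = 0, []
    with open("elk.json") as src:
        for line in src:
            lines.append(line)
            if len(lines) == CHUNK:
                flush(part, lines)
                part, lines = part + 1, []
    if lines:  # last partial chunk
        flush(part, lines)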

lcia-projects commented 1 year ago

Thank you for your help @wagga40, let me know if I'm being a pest.

One question: when I imported using the ES host/streaming arguments I got fields for tags, rule_level and sigma_file, and those fields were useful. Why is there a difference in output?

lcia-projects commented 1 year ago

This is the code I'm using to process the export and bulk submit to ES (you're welcome to judge my ugly Python code). It submits 1000 records at a time, then does a final submit with whatever is left.

import json
import os

from elasticsearch import Elasticsearch, helpers

    # Methods of a class that stores the ES host (self.ES_server)
    # and the target index (self.indexName).
    def processzircoliteELKJSON(self, filename, case, organization):
        jsonData = []
        print(filename, case, organization)

        if os.path.isfile(filename):
            print("Filename found:", filename)
            # The template output is JSONL: one JSON document per line
            with open(filename) as elkJSONFile:
                rawData = elkJSONFile.readlines()

            for item in rawData:
                jsonItem = json.loads(item)
                # Tag every detection with the case/organization it belongs to
                jsonItem['case'] = case
                jsonItem['organization'] = organization
                jsonData.append(jsonItem)

        print("Number of Submissions:", len(jsonData))
        self.bulkESSubmit(jsonData)

    def bulkESSubmit(self, allLogItems):
        es = Elasticsearch([self.ES_server])
        actions = []
        count = 0  # start at 0 so each batch holds exactly 1000 documents
        for item in allLogItems:
            actions.append(item)  # helpers.bulk() accepts plain dicts as index actions
            count += 1
            if count % 1000 == 0:
                print("Bulk Submit:", count)
                helpers.bulk(es, actions, index=self.indexName)
                actions.clear()

        if actions:  # submit the final partial batch, if any
            print("Last Submit:", count)
            helpers.bulk(es, actions, index=self.indexName)
            actions.clear()
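
For a dataset this large, elasticsearch-py also provides a parallel_bulk helper that does the chunking itself and submits batches across several threads, which removes the manual counting entirely. A sketch, not a drop-in replacement; the host and index are placeholders and jsonData stands in for the list built above:

    from collections import deque
    from elasticsearch import Elasticsearch, helpers

    # Placeholder documents standing in for the jsonData list built above
    jsonData = [{"title": "example rule", "rule_level": "high"}] * 10

    es = Elasticsearch(["http://localhost:9200"])  # placeholder host

    # parallel_bulk chunks the documents and indexes them across threads.
    # It returns a lazy generator, so it must be exhausted for any
    # requests to actually be sent.
    results = helpers.parallel_bulk(es, jsonData, index="zircolite",
                                    thread_count=4, chunk_size=1000)
    deque(results, maxlen=0)  # drain the generator, discarding per-item results
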
wagga40 commented 1 year ago

Sorry for the delay.

  • when I imported using the ES host/streaming arguments, I got fields for tags, rule_level and sigma_file; those fields were useful. Why is there a difference in output?

Because I did not add them to the template (exportForELK.tmpl), sorry 😅😅😅 You can modify the template (it is in Jinja format) yourself:

{% for elem in data %}{% for match in elem["matches"] %}{"title":{{ elem["title"]|tojson }},"id":{{ elem["id"]|tojson }},"rule_level":{{ elem["rule_level"]|tojson }},"tags":{{ elem["tags"]|tojson }},"sigmafile":{{ elem["sigmafile"]|tojson }},"description":{{ elem["description"]|tojson }},{% for key, value in match.items() %}"{{ key }}":{{ value|tojson }}{{ "," if not loop.last }}{% endfor %}}
{% endfor %}{% endfor %}
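
A modified template can be sanity-checked outside Zircolite by rendering it with Jinja2 against a small hand-made sample. In the sketch below, the sample fields mirror the shape of Zircolite's JSON output and the template path is illustrative:

    import json
    from jinja2 import Template

    # Tiny sample in the shape Zircolite passes to the template
    sample = [{
        "title": "Test rule", "id": "0000", "rule_level": "high",
        "tags": ["attack.t1059"], "sigmafile": "test.yml",
        "description": "demo",
        "matches": [{"Channel": "Security", "EventID": 4688}],
    }]

    with open("templates/exportForELK.tmpl") as f:
        rendered = Template(f.read()).render(data=sample)

    # Each non-empty line should be a standalone JSON document
    for line in rendered.splitlines():
        if line.strip():
            json.loads(line)  # raises ValueError if the template output is invalid
    print(rendered)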

I will update the repo accordingly.

lcia-projects commented 1 year ago

No problem at all, I really do appreciate this project and what you do. It's saved me an ENORMOUS amount of time with a very large case/data set, and helped me catch/notice some amazing things in event logs I wouldn't have found as easily on my own.

The template code you supplied above: is this the updated code needed to get those fields, or do I need to update this line with the fields I want?

thank you

wagga40 commented 1 year ago

Yes I’ve added the tags, rule_level and sigma_file fields.

lcia-projects commented 1 year ago

Processed 1.1 TB of event logs from ~160 servers over the last 3 days. With the approach above (template -> bulk submit to ES via Python) everything worked great. Thank you for your assistance.