omerbenamram / evtx

A Fast (and safe) parser for the Windows XML Event Log (EVTX) format
Apache License 2.0

# in JSON field name prevents import in GCP Bigquery #198

Closed. olamotte closed this issue 2 years ago

olamotte commented 2 years ago

Thanks for the effort to build this great tool. We're throwing forwarded log files at it and really appreciate the performance boost!

There's one minor step that required preprocessing for our use case: we are loading the data into Google's BigQuery, and the # in the JSON field names prevents the import.

Unfortunately I don't have a Rust build environment set up at the moment, but the responsible code seems to be here, affecting both #attributes and #text:


https://github.com/omerbenamram/evtx/blob/0950198ed6c0f2381b1fc0b79c1fa4d094f638f3/src/json_output.rs#L239

https://github.com/omerbenamram/evtx/blob/0950198ed6c0f2381b1fc0b79c1fa4d094f638f3/src/json_output.rs#L321

Is there a reason I'm missing for using a special character in these two field names? It's a rather minor issue and we can run sed as a workaround, but a fix would save us some steps.
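For readers hitting the same problem, the preprocessing step mentioned above could look roughly like the following. This is only a sketch, not part of evtx: the replacement names in RENAMES and the helper rename_keys are hypothetical, and it assumes the records have already been parsed from the tool's JSON output.

```python
import json

# Hypothetical replacement names (an assumption, not evtx behavior);
# choose whatever your BigQuery schema expects.
RENAMES = {"#attributes": "attributes", "#text": "text"}


def rename_keys(node):
    """Recursively rename '#attributes' and '#text' keys in parsed JSON."""
    if isinstance(node, dict):
        # Note: a renamed key would overwrite a sibling that is already
        # called "attributes" or "text"; adjust RENAMES if that can happen.
        return {RENAMES.get(key, key): rename_keys(value)
                for key, value in node.items()}
    if isinstance(node, list):
        return [rename_keys(item) for item in node]
    return node


# Made-up record in the shape described in this issue.
raw = '{"Provider": {"#attributes": {"Name": "Example"}}, "Data": {"#text": "alice"}}'
cleaned = rename_keys(json.loads(raw))
print(json.dumps(cleaned))
```

Unlike a plain sed substitution, this only touches keys, so a # that appears inside an event's values is left alone.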

Thanks in advance!

forensicmatt commented 2 years ago

Hey @olamotte, I highly recommend using the --separate-json-attributes flag. With it, all element attributes are stored under the <FIELD_NAME>_attributes field name, so there are no #attributes and #text field names.
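To illustrate the layout described here, the sketch below converts a record from the default shape into one where attributes sit under <FIELD_NAME>_attributes. It is not the tool's implementation: the helper separate_attributes is hypothetical, the sample record is made up, and how it treats elements that have attributes but no text is an assumption, so the real output may differ in details.

```python
def separate_attributes(node):
    """Move each '#attributes' object to a sibling '<name>_attributes' key."""
    if isinstance(node, list):
        return [separate_attributes(item) for item in node]
    if not isinstance(node, dict):
        return node

    out = {}
    for key, value in node.items():
        if isinstance(value, dict) and "#attributes" in value:
            out[f"{key}_attributes"] = value["#attributes"]
            rest = {k: v for k, v in value.items() if k != "#attributes"}
            if set(rest) == {"#text"}:
                # Only text remains: store it directly under the field name.
                out[key] = rest["#text"]
            elif rest:
                # Child elements remain: keep them, converted recursively.
                out[key] = separate_attributes(rest)
            # Assumption: an element with attributes only gets no plain key.
        else:
            out[key] = separate_attributes(value)
    return out


# Made-up record in the default shape.
record = {
    "Provider": {"#attributes": {"Name": "Example"}},
    "Data": {"#attributes": {"Name": "User"}, "#text": "alice"},
    "EventID": 4624,
}
separated = separate_attributes(record)
print(separated)
```

In practice the flag makes this conversion unnecessary; the sketch only shows why the resulting field names no longer contain #.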

olamotte commented 2 years ago

I had missed this - Thanks a lot!