rcoh / angle-grinder

Slice and dice logs on the command line
MIT License
3.5k stars 72 forks

Parsing embedded JSON #64

Open geekscrapy opened 5 years ago

geekscrapy commented 5 years ago

Hi,

I have the following JSON format:

{"dst_host": "159.65.224.130", "dst_port": 23, "honeycred": false, "local_time": "2019-01-23 11:57:11.834296", "logdata": {"PASSWORD": "1111", "USERNAME": "root"}, "logtype": 6001, "node_id": "opencanary-1", "src_host": "41.139.253.2", "src_port": 36653}

How would I get agrind to also parse the logdata field?

TIA!

rcoh commented 5 years ago

Nested JSON isn't super well supported right now, but try agrind '* | json | json from logdata':

echo '{"dst_host": "159.65.224.130", "dst_port": 23, "honeycred": false, "local_time": "2019-01-23 11:57:11.834296", "logdata": {"PASSWORD": "1111", "USERNAME": "root"}, "logtype": 6001, "node_id": "opencanary-1", "src_host": "41.139.253.2", "src_port": 36653}' \
  | agrind '* | json | json from logdata | fields PASSWORD, USERNAME'
[PASSWORD=1111]            [USERNAME=root]

You can use the fields operator to drop the other fields if you want.

geekscrapy commented 5 years ago

Awesome, thanks!

I appreciate it's not trivial; however, the main use case for me would be reading JSON-converted EVTX (Windows logs), where there may (or may not) be multiple levels of JSON... It would be killer if that were supported.

rcoh commented 5 years ago

Yeah, makes sense. I assume you don't know the keys ahead of time? What would the ideal workflow be for you? Something like:

* | json | count by logdata.user?

rcoh commented 5 years ago

Keeping the issue open to discuss longer term improvements

geekscrapy commented 5 years ago

Yeah, the keys relate to the event ID (and there are thousands of event IDs...).

Btw, you probably know this already, but the example JSON I gave earlier is a random log I had floating around; it's not a Windows EVTX. I could provide you one if you wanted (or you can just copy one from a Windows machine).

The original format of EVTX is binary XML, which is usually extracted using the Python library linked below. That XML is then typically converted to a Python dict, and then to JSON. So it's a bit of a pain, but if you could shortcut the conversion process, that'd be a massive win. There are very few tools that allow analysis of EVTX on Linux. But maybe there's a good reason for that and I'm missing it 😂

https://github.com/williballenthin/python-evtx/issues/47
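To make the conversion pipeline concrete, here is a minimal stdlib-only sketch of the XML-to-dict-to-JSON step described above. The record XML and its field names (EventID, TargetUserName, etc.) are invented sample data; in practice, the XML string would come from python-evtx rather than being inlined like this.

```python
import json
import xml.etree.ElementTree as ET

# Invented stand-in for one EVTX record; in practice this XML would
# be produced by python-evtx from the binary log file.
SAMPLE_RECORD_XML = """
<Event>
  <System>
    <EventID>4624</EventID>
    <Computer>WIN-HOST</Computer>
  </System>
  <EventData>
    <Data Name="TargetUserName">root</Data>
    <Data Name="LogonType">3</Data>
  </EventData>
</Event>
"""

def event_to_dict(xml_text):
    """Convert one EVTX-style XML record into a nested dict."""
    root = ET.fromstring(xml_text)
    system = root.find("System")
    data = root.find("EventData")
    return {
        "EventID": system.findtext("EventID"),
        "Computer": system.findtext("Computer"),
        # EventData holds <Data Name="...">value</Data> children,
        # which map naturally onto a nested dict.
        "EventData": {d.get("Name"): d.text for d in data.findall("Data")},
    }

# One JSON object per record, ready to pipe into a log tool.
print(json.dumps(event_to_dict(SAMPLE_RECORD_XML)))
```

The resulting JSON has the same nested shape as the logdata example earlier in the thread, which is exactly why flat JSON parsing alone doesn't cover it.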

rcoh commented 5 years ago

Yeah, I'm probably not going to add specific EVTX support... I could add support for redirecting the pipeline to another program, though.


geekscrapy commented 5 years ago

No, that's fine. I was more hoping for either nested XML or JSON support; we only really need one of those and it'll be covered.

Adding an option to redirect seems a little more complicated than it needs to be, I think. cat works fine 😀

rcoh commented 5 years ago

You might also try JQ to do whatever complex JSON munging you want to do before piping the result to angle-grinder
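If a jq dependency is undesirable, the same kind of pre-processing can be sketched in stdlib Python. This hypothetical helper lifts the keys of a nested object ("logdata" here, taken from the sample log earlier in the thread) up to the top level so a flat JSON parser sees them directly:

```python
import json

def lift(record, key):
    """Merge the nested object at `key` into the top level of `record`."""
    nested = record.pop(key, None) or {}
    record.update(nested)
    return record

# Same shape as the sample log from earlier in the thread.
raw = '{"dst_host": "159.65.224.130", "logdata": {"PASSWORD": "1111", "USERNAME": "root"}}'
print(json.dumps(lift(json.loads(raw), "logdata")))
# PASSWORD and USERNAME are now top-level keys alongside dst_host.
```

Applied line by line over a log file, the flattened output can then be piped straight into agrind '* | json'.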


rcoh commented 5 years ago

I think a good solution for this could be splatting JSON objects into one row per key-value pair, like https://github.com/tomnomnom/gron -- @geekscrapy I'm curious whether that would produce usable output for EVTX JSON
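For readers unfamiliar with gron, a rough Python sketch of that splatting idea: walk a parsed JSON value and emit one dotted-path/value line per leaf (the "json" path prefix follows gron's convention; the rest is an illustrative approximation, not gron's actual implementation).

```python
import json

def splat(value, path="json"):
    """Yield one 'dotted.path = value' line per leaf of a JSON value."""
    if isinstance(value, dict):
        for k, v in value.items():
            yield from splat(v, f"{path}.{k}")
    elif isinstance(value, list):
        for i, v in enumerate(value):
            yield from splat(v, f"{path}[{i}]")
    else:
        # Leaf: render the value as JSON so types stay unambiguous.
        yield f"{path} = {json.dumps(value)}"

record = {"logdata": {"PASSWORD": "1111", "USERNAME": "root"}, "logtype": 6001}
for line in splat(record):
    print(line)
```

Because each output line is self-describing, arbitrarily deep EVTX JSON would collapse into grep-able, agrind-parseable rows regardless of nesting depth.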

geekscrapy commented 5 years ago

Hey! This may work (disclaimer: I've not played with gron, but it sounds like it should work)

rcoh commented 5 years ago

With #73 adding a splat operator should be all that's required to get decent support for arbitrary nested structures.

Arch-vile commented 2 years ago

I tried running the example you gave earlier (dropped some fields here for clarity):

echo '{"dst_host": "159.65.224.130", "logdata": {"PASSWORD": "1111", "USERNAME": "root"} }' \
  | agrind '* | json | json from logdata | fields PASSWORD, USERNAME'

But I end up with: error: Expected string, found other

agrind --version
ag 0.18.0

rcoh commented 2 years ago

Ah, that was probably written before proper nested field support was added.

You can now refer to logdata.PASSWORD and logdata.USERNAME directly. If you want to restructure things, you can do it like this:

echo '{"dst_host": "159.65.224.130", "logdata": {"PASSWORD": "1111", "USERNAME": "root"} }' \
  | agrind '* | json | logdata.PASSWORD as password | logdata.USERNAME as username | fields username, password'
[password=1111]            [username=root]