elastic / elasticsearch

Free and Open, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

ESQL: Add a command to parse json #104934

Open not-napoleon opened 7 months ago

not-napoleon commented 7 months ago

Description

Something similar to grok that can operate on JSON strings stored as text (or possibly keyword) fields and extract values from them.

One challenge will be striking a balance between a feature powerful enough to be useful and not embedding an entire new query language within ES|QL. Tools like jq necessarily have quite complex query languages for manipulating JSON. I think for an MVP, we should focus on just extracting fields via path, similar to how grok works, and then encourage folks to do further manipulation of those fields within the ES|QL pipeline.
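To make "extracting fields via path" concrete, here is a minimal Python sketch. The dotted-path syntax and the `extract_path` helper are hypothetical, invented for this illustration; they are not proposed ES|QL syntax, and the sample document is made up.

```python
import json
from typing import Any, Optional


def extract_path(raw: str, path: str) -> Optional[Any]:
    """Return the value at a dotted path inside a JSON string, or None.

    Hypothetical helper: each path segment is an object key, or an
    integer index when the current node is a list.
    """
    try:
        node = json.loads(raw)
    except json.JSONDecodeError:
        return None  # not valid JSON: yield null rather than fail the row
    for segment in path.split("."):
        if isinstance(node, dict) and segment in node:
            node = node[segment]
        elif isinstance(node, list) and segment.isdigit() and int(segment) < len(node):
            node = node[int(segment)]
        else:
            return None  # missing key or out-of-range index
    return node


doc = '{"requestParameters": {"bucketName": "example-bucket", "tags": ["a", "b"]}}'
print(extract_path(doc, "requestParameters.bucketName"))  # example-bucket
print(extract_path(doc, "requestParameters.tags.1"))      # b
print(extract_path(doc, "requestParameters.missing"))     # None
```

A path-only extractor like this stays small because filtering, type conversion, and aggregation are left to later pipeline stages. One limitation of this particular sketch is that keys containing a literal dot cannot be addressed.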

elasticsearchmachine commented 7 months ago

Pinging @elastic/es-analytical-engine (Team:Analytics)

nik9000 commented 7 months ago

It might be nice to have some kind of "unnest" command to flip JSON objects into rows, kind of like mv_expand but for the whole JSON object. It'd be slow, but it could do some useful things.
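One possible reading of "unnest", sketched in Python under stated assumptions: each top-level key of the JSON object becomes its own output row, with the other columns repeated. The `unnest` function and the `key` / `value` column names are hypothetical and only illustrate the idea.

```python
import json
from typing import Any, Dict, Iterator


def unnest(row: Dict[str, Any], field: str) -> Iterator[Dict[str, Any]]:
    """Yield one output row per top-level key of the JSON object in row[field]."""
    parsed = json.loads(row[field])
    if not isinstance(parsed, dict):
        # Not an object (e.g. an array or scalar): pass the row through unchanged.
        yield dict(row)
        return
    rest = {k: v for k, v in row.items() if k != field}
    for key, value in parsed.items():
        # Repeat the remaining columns on every generated row.
        yield {**rest, "key": key, "value": value}


row = {"id": 1, "payload": '{"a": 1, "b": [2, 3]}'}
for out in unnest(row, "payload"):
    print(out)
# {'id': 1, 'key': 'a', 'value': 1}
# {'id': 1, 'key': 'b', 'value': [2, 3]}
```

Every input row has to be parsed and can fan out into many output rows, which is consistent with the expectation that such a command would be slow.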

terrancedejesus commented 3 months ago

@not-napoleon @nik9000

This would be an incredibly useful feature for threat hunting and detection in cloud data, which relies on API requests and responses that are mainly JSON. A good data source for test examples would be AWS CloudTrail. Happy to share some data or access to my cluster if needed, but looking at aws.cloudtrail.request_parameters and aws.cloudtrail.response_elements would be ideal.

AWS Example Data

```
{
  "@timestamp": "2024-04-17T19:45:08.000Z",
  "agent.ephemeral_id": "69b4fa20-756a-4d41-8325-7613b13a01b2",
  "agent.id": "f14d530d-b7f2-4dbd-b122-28582c2a767c",
  "agent.name": "ip-172-31-95-103",
  "agent.name.text": "ip-172-31-95-103",
  "agent.type": "filebeat",
  "agent.version": "8.13.2",
  "aws.cloudtrail.additional_eventdata": "{SignatureVersion=SigV4, CipherSuite=TLS_AES_128_GCM_SHA256, bytesTransferredIn=416, AuthenticationMethod=AuthHeader, x-amz-id-2=qXYxJfRNV8dnBC1+KWiKnnh1CpJ9hOMEOuJZAIPfwv/YCnBmzeMg7NJRoLMSpikv5+hd4Cuu85w=, bytesTransferredOut=0}",
  "aws.cloudtrail.additional_eventdata.text": "{SignatureVersion=SigV4, CipherSuite=TLS_AES_128_GCM_SHA256, bytesTransferredIn=416, AuthenticationMethod=AuthHeader, x-amz-id-2=qXYxJfRNV8dnBC1+KWiKnnh1CpJ9hOMEOuJZAIPfwv/YCnBmzeMg7NJRoLMSpikv5+hd4Cuu85w=, bytesTransferredOut=0}",
  "aws.cloudtrail.event_category": "Management",
  "aws.cloudtrail.event_type": "AwsApiCall",
  "aws.cloudtrail.event_version": "1.09",
  "aws.cloudtrail.management_event": "true",
  "aws.cloudtrail.read_only": false,
  "aws.cloudtrail.recipient_account_id": "REDACTED",
  "aws.cloudtrail.request_id": "ECXHFXX1WBEGVPZG",
  "aws.cloudtrail.request_parameters": "{bucketName=stratus-red-team-bdbp-hwcbgcokmh, bucketPolicy={Version=2012-10-17, Statement=[{Action=[s3:GetObject, s3:GetBucketLocation, s3:ListBucket], Resource=[arn:aws:s3:::stratus-red-team-bdbp-hwcbgcokmh/*, arn:aws:s3:::stratus-red-team-bdbp-hwcbgcokmh], Effect=Allow, Principal={AWS=arn:aws:iam::193672423079:root}}]}, Host=stratus-red-team-bdbp-hwcbgcokmh.s3.us-east-1.amazonaws.com}",
  "aws.cloudtrail.request_parameters.text": "{bucketName=stratus-red-team-bdbp-hwcbgcokmh, bucketPolicy={Version=2012-10-17, Statement=[{Action=[s3:GetObject, s3:GetBucketLocation, s3:ListBucket], Resource=[arn:aws:s3:::stratus-red-team-bdbp-hwcbgcokmh/*, arn:aws:s3:::stratus-red-team-bdbp-hwcbgcokmh], Effect=Allow, Principal={AWS=arn:aws:iam::193672423079:root}}]}, Host=stratus-red-team-bdbp-hwcbgcokmh.s3.us-east-1.amazonaws.com}",
  "aws.cloudtrail.user_identity.access_key_id": "AKIA47CRWDCFXZ3V7UXR",
  "aws.cloudtrail.user_identity.arn": "arn:aws:iam::REDACTED:user/stratus",
  "aws.cloudtrail.user_identity.invoked_by": null,
  "aws.cloudtrail.user_identity.session_context.creation_date": null,
  "aws.cloudtrail.user_identity.session_context.mfa_authenticated": null,
  "aws.cloudtrail.user_identity.type": "IAMUser",
  "aws.cloudtrail.vpc_endpoint_id": null,
  "aws.s3.bucket.arn": "arn:aws:s3:::asperitas-security-logs",
  "aws.s3.bucket.name": "asperitas-security-logs",
  "aws.s3.object.key": "AWSLogs/REDACTED/CloudTrail/us-east-1/2024/04/17/REDACTED_CloudTrail_us-east-1_20240417T1950Z_uj0fZWsNh4Kv6rUO.json.gz",
  "cloud.account.id": "REDACTED",
  "cloud.region": "us-east-1",
  "data_stream.dataset": "aws.cloudtrail",
  "data_stream.namespace": "default",
  "data_stream.type": "logs",
  "ecs.version": "8.0.0",
  "elastic_agent.id": "f14d530d-b7f2-4dbd-b122-28582c2a767c",
  "elastic_agent.snapshot": false,
  "elastic_agent.version": "8.13.2",
  "event.action": "PutBucketPolicy",
  "event.agent_id_status": "verified",
  "event.created": "2024-04-17T19:47:47.689Z",
  "event.dataset": "aws.cloudtrail",
  "event.id": "5354a722-c66e-482d-9358-5007be35cc1a",
  "event.ingested": "2024-04-17T19:47:53.000Z",
  "event.kind": "event",
  "event.module": "aws",
  "event.original": "{\"eventVersion\":\"1.09\",\"userIdentity\":{\"type\":\"IAMUser\",\"principalId\":\"AIDA47CRWDCFTQGUB5FBF\",\"arn\":\"arn:aws:iam::REDACTED:user/stratus\",\"accountId\":\"REDACTED\",\"accessKeyId\":\"AKIA47CRWDCFXZ3V7UXR\",\"userName\":\"stratus\"},\"eventTime\":\"2024-04-17T19:45:08Z\",\"eventSource\":\"s3.amazonaws.com\",\"eventName\":\"PutBucketPolicy\",\"awsRegion\":\"us-east-1\",\"sourceIPAddress\":\"REDACTED\",\"userAgent\":\"[stratus-red-team_97530674-fd26-4024-b2fa-36b9815e0dbc]\",\"requestParameters\":{\"bucketPolicy\":{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"AWS\":\"arn:aws:iam::193672423079:root\"},\"Action\":[\"s3:GetObject\",\"s3:GetBucketLocation\",\"s3:ListBucket\"],\"Resource\":[\"arn:aws:s3:::stratus-red-team-bdbp-hwcbgcokmh/*\",\"arn:aws:s3:::stratus-red-team-bdbp-hwcbgcokmh\"]}]},\"bucketName\":\"stratus-red-team-bdbp-hwcbgcokmh\",\"Host\":\"stratus-red-team-bdbp-hwcbgcokmh.s3.us-east-1.amazonaws.com\",\"policy\":\"\"},\"responseElements\":null,\"additionalEventData\":{\"SignatureVersion\":\"SigV4\",\"CipherSuite\":\"TLS_AES_128_GCM_SHA256\",\"bytesTransferredIn\":416,\"AuthenticationMethod\":\"AuthHeader\",\"x-amz-id-2\":\"qXYxJfRNV8dnBC1+KWiKnnh1CpJ9hOMEOuJZAIPfwv/YCnBmzeMg7NJRoLMSpikv5+hd4Cuu85w=\",\"bytesTransferredOut\":0},\"requestID\":\"ECXHFXX1WBEGVPZG\",\"eventID\":\"5354a722-c66e-482d-9358-5007be35cc1a\",\"readOnly\":false,\"resources\":[{\"accountId\":\"REDACTED\",\"type\":\"AWS::S3::Bucket\",\"ARN\":\"arn:aws:s3:::stratus-red-team-bdbp-hwcbgcokmh\"}],\"eventType\":\"AwsApiCall\",\"managementEvent\":true,\"recipientAccountId\":\"REDACTED\",\"eventCategory\":\"Management\",\"tlsDetails\":{\"tlsVersion\":\"TLSv1.3\",\"cipherSuite\":\"TLS_AES_128_GCM_SHA256\",\"clientProvidedHostHeader\":\"stratus-red-team-bdbp-hwcbgcokmh.s3.us-east-1.amazonaws.com\"}}",
  "event.outcome": "success",
  "event.provider": "s3.amazonaws.com",
  "event.type": "info",
  "input.type": "aws-s3",
  "log.file.path": "https://asperitas-security-logs.s3.us-east-1.amazonaws.com/AWSLogs/REDACTED/CloudTrail/us-east-1/2024/04/17/REDACTED_CloudTrail_us-east-1_20240417T1950Z_uj0fZWsNh4Kv6rUO.json.gz",
  "log.offset": 36793,
  "related.user": "stratus",
  "source.address": "REDACTED",
  "source.as.number": 12097,
  "source.as.organization.name": "MASSCOM",
  "source.as.organization.name.text": "MASSCOM",
  "source.geo.city_name": "Massillon",
  "source.geo.continent_name": "North America",
  "source.geo.country_iso_code": "US",
  "source.geo.country_name": "United States",
  "source.geo.location": "POINT (REDACTED, REDACTED)",
  "source.geo.region_iso_code": "US-OH",
  "source.geo.region_name": "Ohio",
  "source.ip": "REDACTED",
  "tags": ["aws-cloudtrail", "forwarded", "preserve_original_event"],
  "tls.cipher": "TLS_AES_128_GCM_SHA256",
  "tls.client.server_name": "stratus-red-team-bdbp-hwcbgcokmh.s3.us-east-1.amazonaws.com",
  "tls.version": "1.3",
  "tls.version_protocol": "tls",
  "user.id": "AIDA47CRWDCFTQGUB5FBF",
  "user.name": "stratus",
  "user.name.text": "stratus",
  "user_agent.device.name": "Other",
  "user_agent.name": "Other",
  "user_agent.original": "[stratus-red-team_97530674-fd26-4024-b2fa-36b9815e0dbc]",
  "user_agent.original.text": "[stratus-red-team_97530674-fd26-4024-b2fa-36b9815e0dbc]",
  "user_agent.version": null,
  "bucket_policy": " bucketPolicy={Version=2012-10-17, Statement=[{Action=[s3:GetObject, s3:GetBucketLocation, s3:ListBucket], Resource=[arn:aws:s3:::stratus-red-team-bdbp-hwcbgcokmh/*, arn:aws:s3:::stratus-red-team-bdbp-hwcbgcokmh], Effect=Allow, Principal={AWS=arn:aws:iam::193672423079:root}}]}, Host=stratus-red-team-bdbp-hwcbgcokmh.s3.us-east-1.amazonaws.com}",
  "version": " bucketPolicy={Version=2012-10-17",
  "statement": " Statement=[{Action=[s3:GetObject",
  "resource": " s3:GetBucketLocation",
  "arn": " s3:ListBucket]",
  "effect": " Resource=[arn:aws:s3:::stratus-red-team-bdbp-hwcbgcokmh/*",
  "principal": " arn:aws:s3:::stratus-red-team-bdbp-hwcbgcokmh]",
  "host": " Effect=Allow, Principal={AWS=arn:aws:iam::193672423079:root}}]}, Host=stratus-red-team-bdbp-hwcbgcokmh.s3.us-east-1.amazonaws.com}"
}
```

Dissect and Grok are great, but fall short when data is inconsistent.
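A small Python sketch of why delimiter-based extraction struggles here, compared with structure-aware parsing. The input is a simplified fragment modeled on `requestParameters` in `event.original` from the sample event; `split_on_delimiter` is a hypothetical stand-in for a dissect-style pattern that cuts on `", "` and is not how Dissect itself is implemented.

```python
import json
from typing import Any, Dict, List

# Simplified fragment modeled on requestParameters in event.original.
RAW = (
    '{"bucketName": "stratus-red-team-bdbp-hwcbgcokmh", '
    '"bucketPolicy": {"Statement": [{"Effect": "Allow", '
    '"Action": ["s3:GetObject", "s3:ListBucket"]}]}}'
)


def split_on_delimiter(raw: str) -> List[str]:
    """Naive delimiter split: cut on every ', ' regardless of nesting."""
    return raw.strip("{}").split(", ")


def parse_structured(raw: str) -> Dict[str, Any]:
    """Structure-aware parse: nesting and arrays are preserved."""
    return json.loads(raw)


pieces = split_on_delimiter(RAW)
print(len(pieces))   # 4 fragments, although the object has only 2 top-level keys
print(pieces[3])     # "s3:ListBucket"]}]  (an array element torn out of its list)

parsed = parse_structured(RAW)
print(list(parsed))  # ['bucketName', 'bucketPolicy']
print(parsed["bucketPolicy"]["Statement"][0]["Action"])
# ['s3:GetObject', 's3:ListBucket']
```

The naive split produces the same kind of misaligned fragments as the `version`, `statement`, and `resource` fields at the end of the sample event, because the number of delimiters varies with the number of actions and resources in each policy.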

Screenshot 2024-05-29 at 5 19 39 PM

cc @tinnytintin10 @imays11

tinnytintin10 commented 3 months ago

cc @eyalkraft @tehilashn