matanolabs / matano

Open source security data lake for threat hunting, detection & response, and cybersecurity analytics at petabyte scale on AWS
https://matano.dev
Apache License 2.0

Transform error with client.geo.location #24

Closed gdrapp closed 1 year ago

gdrapp commented 1 year ago

I'm creating a log source for Okta logs and am struggling to transform log data to the ECS fields client.geo.location.lat and client.geo.location.lon. With the VRL below, I consistently get the error "USER_ERROR: Failed at FindUnionVariant, likely schema issue." in the transformer Lambda. I have pretty much every other Okta log field working.

Looking at the ECS schema JSON, both lat and lon are defined as floats, so this should work.

Relevant VRL transform:

.client.geo.location.lat = to_float(del(.json.client.geographicalContext.geolocation.lat)) ?? null
.client.geo.location.lon = to_float(del(.json.client.geographicalContext.geolocation.lon)) ?? null

Relevant log data:

{
    "json": {
        "client": {
            "geographicalContext": {
                "city": "Ashburn",
                "country": "United States",
                "geolocation": {
                    "lat": 39.0469,
                    "lon": -77.4903
                },
                "postalCode": "20149",
                "state": "Virginia"
            }
        }
    }
}

Any assistance identifying the issue or bug would be appreciated.

Thanks.

shaeqahmed commented 1 year ago

Hey Greg, thanks for opening an issue. I am looking into this; I'll try to recreate the issue locally and see where the schema mismatches.

shaeqahmed commented 1 year ago

This error is caused by a bug in the way Matano handles the float (float32) data type. In VRL a float is a double (float64), so a value whose type is defined as float (float32) is automatically upcast during the transformation step, and that causes a schema mismatch.

I have a fix locally that I have verified against your example above, and I will push it out tomorrow. In the meantime, you should be able to unblock yourself by typing these fields as double instead of float. Thanks
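
For readers unfamiliar with the two widths, the difference between float32 and float64 can be illustrated outside Matano. This is a minimal Python sketch using only the standard library; it does not reproduce Matano's, VRL's, or Avro's actual code, and `to_float32` is a hypothetical helper introduced just for the illustration. It uses the coordinates from the sample event above.

```python
import struct


def to_float32(value: float) -> float:
    """Round a Python float (float64) to the nearest float32, then widen it back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


# Latitude and longitude from the sample Okta event.
lat, lon = 39.0469, -77.4903

# A float32 occupies 4 bytes and a float64 (double) occupies 8.
float32_size = struct.calcsize("<f")
float64_size = struct.calcsize("<d")

# Narrowing to float32 changes the stored value slightly, so the two widths
# are not interchangeable when a schema checks the type strictly.
lat_changed = to_float32(lat) != lat
lon_changed = to_float32(lon) != lon

print(float32_size, float64_size)
print(lat_changed, lon_changed)
```

Because the two types have different sizes and precision, a value produced as a double does not automatically satisfy a schema field declared as a 32-bit float, which is consistent with the workaround of declaring the fields as double.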

gdrapp commented 1 year ago

Appreciate you taking a look. I’m on vacation this week but will pull down your fix when I return and give it a whirl. Thanks!

shaeqahmed commented 1 year ago

https://github.com/apache/avro/commit/6eb72341912ac3858f56531e457de77213eafd02

My fix has been merged into the official Avro upstream as well. Deleting and redeploying the DPMainStack using the latest release should fix any problems with float/double. Let me know if this resolves your issue, thanks!

gdrapp commented 1 year ago

Just attempted to test using today's (December 12th) nightly build but the DPMainStack won't deploy. Looks like this is some sort of bug because I was able to deploy on an older version. Tried deleting both the stacks and starting fresh but it didn't help, still getting same error. Not sure how to further debug this?

[100%] fail: The XML you provided was not well-formed or did not validate against our published schema

 ❌  DPMainStack (MatanoDPMainStack) failed: Error: Failed to publish one or more assets. See the error messages above for more information.
    at publishAssets (/snapshot/node_modules/aws-cdk/lib/util/asset-publishing.ts:60:11)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at CloudFormationDeployments.publishStackAssets (/snapshot/node_modules/aws-cdk/lib/api/cloudformation-deployments.ts:572:7)
    at CloudFormationDeployments.deployStack (/snapshot/node_modules/aws-cdk/lib/api/cloudformation-deployments.ts:419:7)
    at deployStack2 (/snapshot/node_modules/aws-cdk/lib/cdk-toolkit.ts:265:24)
    at /snapshot/node_modules/aws-cdk/lib/deploy.ts:39:11
    at run (/snapshot/node_modules/p-queue/dist/index.js:163:29)

 ❌ Deployment failed: Error: Stack Deployments Failed: Error: Failed to publish one or more assets. See the error messages above for more information.
    at deployStacks (/snapshot/node_modules/aws-cdk/lib/deploy.ts:61:11)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at CdkToolkit.deploy (/snapshot/node_modules/aws-cdk/lib/cdk-toolkit.ts:339:7)
    at initCommandLine (/snapshot/node_modules/aws-cdk/lib/cli.ts:374:12)

Stack Deployments Failed: Error: Failed to publish one or more assets. See the error messages above for more information.

shaeqahmed commented 1 year ago

Hm, thanks for the report. This seems to be caused by an obscure compatibility issue between a specific Node 16 version and AWS CDK: https://github.com/aws/aws-cdk/issues/19287. We bundle Node as part of the matano CLI using vercel/pkg, so I'm going to look into recreating this issue and pinning Node to a specific minor version as a potential fix. Will update here.

gdrapp commented 1 year ago

I just found a post on SO that links to the same aws-cdk issue. They agreed rolling back the node version seems to fix it.

I also looked back at my debug deploy logs and I do see a multipart upload that has two null ETags, so definitely seems similar. I'll keep an eye out for a new Matano nightly. Thanks!

Samrose-Ahmed commented 1 year ago

Can you run matano and copy the version it's outputting? It should look like:

VERSION
    matano/0.0.0 linux-x64 node-v16.16.0

Also, are you using macOS or Linux? Thx.

gdrapp commented 1 year ago

matano/0.0.0 darwin-x64 node-v16.16.0

macOS 12.6.1

Samrose-Ahmed commented 1 year ago

I wasn't able to reproduce the error. I tried it on both Linux and macOS and was able to deploy successfully.

I've gone ahead and updated the embedded Node version to 18.5.0. Could you try again and see if it works?

gdrapp commented 1 year ago

Not seeing any nightlies available for download right now. Can you check the build?

Samrose-Ahmed commented 1 year ago

I've retried the build; the nightlies are generated now.

gdrapp commented 1 year ago

Same issue using matano/0.0.0 darwin-x64 node-v18.5.0. Can we try downgrading node to an earlier 16.x version?

Samrose-Ahmed commented 1 year ago

Sure, let me try to publish with 16.3.0; the linked issue mentions that changing to that version fixed their issue.

Samrose-Ahmed commented 1 year ago

I wasn't able to publish a binary with v16.3.0 for now, but I've published a release that uses v14.18.1, which the linked post mentioned was confirmed to work. Can you try again and let me know?

gdrapp commented 1 year ago

The stack deploy debug output no longer showed null ETags for the multipart uploads, which is good, but it still failed to deploy the first two times I tried, with this error:

⠹ Deploying Matano...[91%] fail: One or more of the specified parts could not be found.  The part may not have been uploaded, or the specified entity tag may not match the part's entity tag.

The third time I ran it, the deployment was successful.

It seems like it's having trouble with the multipart uploads, but if you run it a few times CDK eventually gets everything uploaded and it's happy. In September/October I was experimenting with Matano and didn't have these issues, so it's strange that this just started happening with later builds (I didn't have time to play with it in November).
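
The "run it a few times" workaround can be scripted. This is only a sketch of a generic retry loop in Python, not something Matano or CDK provides: `retry` and `run_command` are hypothetical helpers, and the flaky deployment is simulated so the example runs on its own.

```python
import subprocess
import time
from typing import Callable, List


def retry(action: Callable[[], bool], attempts: int = 3, delay_seconds: float = 0.0) -> int:
    """Run action until it returns True; return the 1-based attempt that succeeded."""
    for attempt in range(1, attempts + 1):
        if action():
            return attempt
        if attempt < attempts:
            time.sleep(delay_seconds)
    raise RuntimeError(f"action failed after {attempts} attempts")


def run_command(command: List[str]) -> bool:
    """Hypothetical wrapper: True if the given command exits with status 0."""
    return subprocess.run(command).returncode == 0


# Simulated flaky deployment: fails twice, then succeeds, as in the report above.
outcomes = iter([False, False, True])
succeeded_on = retry(lambda: next(outcomes), attempts=3)
print(succeeded_on)
```

In practice, `action` could be a lambda that passes your usual deploy command to `run_command`. Retrying only masks the multipart upload failures; it does not address their cause.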

Matano version - matano/0.0.0 darwin-x64 node-v14.18.1

Someone in the AWS CDK issue linked earlier mentioned they were running 16.7 and didn't see any issues, so it might be worth trying that version if 16.3 is giving you problems.

Samrose-Ahmed commented 1 year ago

Good to see it works. Sure, I'll check 16.7.

Btw, are you on a stable internet connection? I would often get the new error you posted when I was on a bad connection, which makes sense since it's preventing a corrupt upload, especially for the larger assets.

gdrapp commented 1 year ago

Yeah, my internet is stable. I’ll continue to troubleshoot the CDK issue on my own and open another issue if necessary. I’ll close this issue because I was able to get the Okta float data through the Matano data pipeline, so I think we’re good.

I’d be happy to contribute my Okta work to the Matano project if you’re interested in making it a managed source, just let me know.

Thanks for your help and responsiveness solving this!

shaeqahmed commented 1 year ago

Definitely would appreciate the contribution of Okta as a managed log source. Feel free to create a PR, and thank you!