rakutentech / kafka-firehose-nozzle

Forward logs from the Cloud Foundry Firehose to Apache Kafka
MIT License

kafka logMessage messages appear encrypted #27

Open webbbret opened 5 years ago

webbbret commented 5 years ago

When I run cf nozzle --debug, I can see logMessage messages. In Kafdrop, however, the same messages look like this example:

{ "origin": "rep", "eventType": 5, "timestamp": 1546905002070602200, "deployment": "cf", "job": "diego_cell", "index": "63eee8ae-6215-48de-93eb-e70d6f43b622", "ip": "100.12.84.43", "tags": { "source_id": "9486d561-550c-46d0-910f-4f9ba01b3e2d" }, "logMessage": { "message": "V0FSTnxbMjAxOS0wMS0wNyAyMzo1MDowMl18by5hLmsuYy5jLmkuQ29uc3VtZXJDb29yZGluYXRvcnxbY29udGV4dD1kZWZhdWx0XSBbdGhyZWFkPXBvb2wtMS10aHJlYWQtMV18IFtDb25zdW1lciBjbGllbnRJZD1jb25zdW1lci01LCBncm91cElkPXN1YnNjcmlwdGlvbl0gQXN5bmNocm9ub3VzIGF1dG8tY29tbWl0IG9mIG9mZnNldHMge2hhbGxtYXJrX2FsZXJ0X291dC0wPU9mZnNldEFuZE1ldGFkYXRhe29mZnNldD0wLCBtZXRhZGF0YT0nJ319IGZhaWxlZDogQ29tbWl0IGNhbm5vdCBiZSBjb21wbGV0ZWQgc2luY2UgdGhlIGdyb3VwIGhhcyBhbHJlYWR5IHJlYmFsYW5jZWQgYW5kIGFzc2lnbmVkIHRoZSBwYXJ0aXRpb25zIHRvIGFub3RoZXIgbWVtYmVyLiBUaGlzIG1lYW5zIHRoYXQgdGhlIHRpbWUgYmV0d2VlbiBzdWJzZXF1ZW50IGNhbGxzIHRvIHBvbGwoKSB3YXMgbG9uZ2VyIHRoYW4gdGhlIGNvbmZpZ3VyZWQgbWF4LnBvbGwuaW50ZXJ2YWwubXMsIHdoaWNoIHR5cGljYWxseSBpbXBsaWVzIHRoYXQgdGhlIHBvbGwgbG9vcCBpcyBzcGVuZGluZyB0b28gbXVjaCB0aW1lIG1lc3NhZ2UgcHJvY2Vzc2luZy4gWW91IGNhbiBhZGRyZXNzIHRoaXMgZWl0aGVyIGJ5IGluY3JlYXNpbmcgdGhlIHNlc3Npb24gdGltZW91dCBvciBieSByZWR1Y2luZyB0aGUgbWF4aW11bSBzaXplIG9mIGJhdGNoZXMgcmV0dXJuZWQgaW4gcG9sbCgpIHdpdGggbWF4LnBvbGwucmVjb3Jkcy4=", "message_type": 1, "timestamp": 1546905002070602200, "app_id": "9486d561-550c-46d0-910f-4f9ba01b3e2d", "source_type": "APP/PROC/WEB", "source_instance": "0" } }

Is the nozzle encrypting the message by any chance? Want to rule it out as the culprit. Thanks in advance.
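One quick way to tell encryption from plain encoding is to try base64-decoding the "message" value. The sketch below is only an illustration: decodeLogMessage is a hypothetical helper name, and the sample is just the first 36 characters of the payload above, not the full string.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeLogMessage is a hypothetical helper (not part of the nozzle).
// It tries to decode a "message" value as standard base64 and returns
// the resulting bytes as text.
func decodeLogMessage(encoded string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return "", fmt.Errorf("not valid standard base64: %w", err)
	}
	return string(raw), nil
}

func main() {
	// Assumption: only the first 36 characters of the "message" value
	// from the example above, shortened for readability.
	const sample = "V0FSTnxbMjAxOS0wMS0wNyAyMzo1MDowMl18"

	text, err := decodeLogMessage(sample)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(text) // prints: WARN|[2019-01-07 23:50:02]|
}
```

If the decoded text is readable, the value is encoded rather than encrypted.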

webbbret commented 5 years ago

It turns out that the nozzle, specifically the mailru package, was base64-encoding the data. I went into writer.go in vendor/github.com/mailru/jwriter and commented out the following two lines of code:

//dst := make([]byte, base64.StdEncoding.EncodedLen(len(data)))
//base64.StdEncoding.Encode(dst, data)

I also had to comment out the import of encoding/base64 at the top of the file. After redeploying, the logMessage messages were no longer base64 encoded.
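For context, base64 output is the usual result when a Go byte slice is serialized to JSON. The sketch below uses the standard library's encoding/json rather than easyjson to show the same effect, and it assumes the message field is a []byte in the nozzle's event type, which the output above suggests but this thread does not confirm. The type and function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// bytesMessage and stringMessage are hypothetical types used only to
// contrast how a []byte field and a string field are serialized.
type bytesMessage struct {
	Message []byte `json:"message"`
}

type stringMessage struct {
	Message string `json:"message"`
}

// marshalBoth serializes the same text once as []byte and once as string.
func marshalBoth(text string) (asBytes, asString string, err error) {
	b, err := json.Marshal(bytesMessage{Message: []byte(text)})
	if err != nil {
		return "", "", err
	}
	s, err := json.Marshal(stringMessage{Message: text})
	if err != nil {
		return "", "", err
	}
	return string(b), string(s), nil
}

func main() {
	asBytes, asString, err := marshalBoth("WARN")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(asBytes)  // prints: {"message":"V0FSTg=="}
	fmt.Println(asString) // prints: {"message":"WARN"}
}
```

Base64 keeps the JSON valid for arbitrary bytes, which is why removing the encoding step without escaping the data can produce broken output.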

I would like to propose that someone make this a config.toml setting that turns base64 encoding on and off, if possible. Maybe this is an easyjson request. Many thanks in advance.

kelbyloden commented 4 years ago

I ran into this as well. I wanted to set up filters in Kafka based on fields within the log message, but because the message was base64 encoded I couldn't, so I also had to disable base64 encoding. Doing that caused other problems, however: the data could include characters that are invalid inside a JSON string, such as non-printable characters or a double quote. To get around this, I had to write my own method to cleanse the data (which I can share if desired). So beware if you take this approach.