Closed henrikjohansen closed 5 years ago
One possible syntax is to have one function able to either produce the hashes (for the parser step) or match (for the search step) depending on the parameters supplied.
To replace the ssn field with the hashed value of the same field in a parser:
sha(field=ssn, salt="salt1", as="ssn")
// The same but using the shorthand for as:
ssn := sha(field=ssn, salt="salt1")
To match the hash of the text "12345678" against the value of the ssn field of the event in a search:
sha(field="ssn", input="12345678", salt="salt1")
// The same but using the shorthand for field and unnamed field:
ssn =~ sha("12345678", salt="salt1")
// The same but using the shorthand for the unnamed field and searching @rawstring:
... | sha("12345678", salt="salt1") | ...
input and as are mutually exclusive:
If as is provided, the function calculates the hash of a field and stores it in a field.
If input is provided, the function calculates the hash of the input and filters events on that.
The salt parameter names a system-wide salt string that is included in the hashes to make them harder to brute force. The salt itself is a random string kept secret by the system.
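As a rough illustration of the two modes described above, here is a minimal Python sketch. It is not Humio's implementation: the names `SALTS`, `salted_sha256`, `sha_rewrite` and `sha_match` are hypothetical, and prefixing the salt to the value before hashing is an assumption, since the thread does not say how the salt is mixed in.

```python
import hashlib

# Hypothetical stand-in for the system-wide secret salts, keyed by salt name.
SALTS = {"salt1": b"example-secret-salt"}


def salted_sha256(value: str, salt_name: str) -> str:
    """Hex SHA-256 of the named salt followed by the value (assumed scheme)."""
    return hashlib.sha256(SALTS[salt_name] + value.encode("utf-8")).hexdigest()


def sha_rewrite(event: dict, field: str, salt: str, as_field: str) -> dict:
    """Parser mode ("as" given): store the hash of `field` in `as_field`."""
    out = dict(event)  # copy so the caller's event is not mutated
    out[as_field] = salted_sha256(event[field], salt)
    return out


def sha_match(event: dict, field: str, input_text: str, salt: str) -> bool:
    """Search mode ("input" given): true if `field` holds the hash of the input."""
    return event.get(field) == salted_sha256(input_text, salt)


# Parser step replaces the plain value; search step matches on the hash.
stored = sha_rewrite({"ssn": "12345678"}, field="ssn", salt="salt1", as_field="ssn")
found = sha_match(stored, field="ssn", input_text="12345678", salt="salt1")
```

The search side never needs the original value from the event, only the search text and the same named salt.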
Other options: algorithm to select sha256, sha512 or others; encoding/format to select hex, base64, or others, or to output only the leading k bits.
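The options listed above could behave roughly as in the following Python sketch. The function name `encode_hash` and its parameter names are hypothetical, and so is the choice to zero the unused low bits of the last byte when truncating to the leading k bits.

```python
import base64
import hashlib
from typing import Optional


def encode_hash(value: bytes, algorithm: str = "sha256",
                encoding: str = "hex",
                leading_bits: Optional[int] = None) -> str:
    """Hash `value`, optionally keep only the leading bits, then encode it."""
    digest = hashlib.new(algorithm, value).digest()
    if leading_bits is not None:
        full_bytes, extra_bits = divmod(leading_bits, 8)
        truncated = bytearray(digest[:full_bytes + (1 if extra_bits else 0)])
        if extra_bits:
            # Keep only the top `extra_bits` bits of the final byte.
            truncated[-1] &= (0xFF << (8 - extra_bits)) & 0xFF
        digest = bytes(truncated)
    if encoding == "hex":
        return digest.hex()
    if encoding == "base64":
        return base64.b64encode(digest).decode("ascii")
    raise ValueError("unsupported encoding: " + encoding)
```

Truncating to a few leading bits makes the stored value shorter at the cost of more collisions between different inputs.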
:point_up: is excellent. One small proposal would be to allow for salts to either be specified by the operator or randomly generated by Humio.
This would allow for easier integration with other tools and also allow for generation of hashes in other parts of the ingest pipeline or perhaps even directly on the source systems.
Likewise, auto-generated salts should be stored in global for backup purposes.
I plan to have salts auto-generated in the first version of this: Humio will make up a 256-bit secure random bitstring and store it in global when you refer to an unknown salt name in a parser context.
Later we can make it possible to store a specific bitstring into global as a salt for use when hashing happens in the source systems. But along with that comes the need to be able to format in whatever format the external hashers might use, where the first version only needs to know the format Humio uses by default.
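The salt handling described in this comment can be sketched in Python as follows. The dictionary `global_salts` is only a stand-in for Humio's global store, and both function names are hypothetical.

```python
import secrets

# Hypothetical stand-in for the "global" store: salt name -> salt bytes.
global_salts = {}


def salt_for_parser(name: str) -> bytes:
    """Return the named salt, creating a 256-bit random one on first use."""
    if name not in global_salts:
        global_salts[name] = secrets.token_bytes(32)  # 32 bytes = 256 bits
    return global_salts[name]


def store_operator_salt(name: str, salt: bytes) -> None:
    """Possible later extension: store an operator-chosen salt under a name."""
    global_salts[name] = salt
```

Once a salt exists under a name, every later reference returns the same bytes, which keeps hashes written at different times comparable.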
Sounds great ... looking forward to 1.6.2 :innocent:
:rofl:
It's part of the upcoming 1.6.2, using the hashRewrite and hashMatch functions.
I would love to have the ability to generate SHA256 hashes by combining the content of a field plus a static salt (the salt should be kept as secret as possible and thus should be pulled from Humio's config).
This would enable us to replace all SSNs or other sensitive / GDPR-related material (names, emails, IP addresses, etc.) with hashes and thus avoid storing them in plain text while keeping the ability to search for a specific one. Many analyses would still work using this methodology, like groupby(), even without hashing a string before searching.
In parsers one could generate a hash and either overwrite the source field, or generate a new field and drop the original one.
A query could look like this:
ssn = hash("1234561234", salt="ssn")
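To illustrate why grouping keeps working on hashed values, here is a small Python sketch. The salt value, the salt-prefix scheme and the helper name `hash_field` are assumptions made for the example. Equal inputs hashed with the same salt give equal hashes, so counts per group are unchanged.

```python
import hashlib
from collections import Counter

SALT = b"example-secret-salt"  # hypothetical; a real salt would be kept secret


def hash_field(value: str) -> str:
    """Hex SHA-256 of the salt followed by the value (assumed scheme)."""
    return hashlib.sha256(SALT + value.encode("utf-8")).hexdigest()


events = [{"ssn": "1234561234"}, {"ssn": "9876549876"}, {"ssn": "1234561234"}]

# Parser step: overwrite the source field with its hash.
hashed_events = [{"ssn": hash_field(e["ssn"])} for e in events]

# Grouping on the hashed field yields the same group sizes as on plain text.
counts = Counter(e["ssn"] for e in hashed_events)
```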
Discussed with Morten Grouleff.