Marusyk / grok.net

.NET implementation of the grok 📝
MIT License

Base64 content detection #58

Closed (nickproud closed this 1 year ago)

nickproud commented 2 years ago

Hi,

I'd like to add the ability to grok Base64 strings from text. I would add a pattern that detects Base64 to grok-patterns, then run a validator over the matches to confirm that each one really is Base64-encoded, discarding any match for which it returns false. Something like this:

public static bool IsBase64String(string base64)
{
    // Decoded bytes never outnumber the input characters, so this buffer is large enough.
    Span<byte> buffer = new byte[base64.Length];
    return Convert.TryFromBase64String(base64, buffer, out _);
}

As per contributing guidelines, I'm raising an issue for discussion and if approved, I'll put a PR together. Thanks :)
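The validation step described above can be sketched without any Grok dependency. This is a minimal illustration, not the proposed implementation: the `Base64Filter` class, the `FindBase64` method, and the candidate regex are hypothetical names and choices made for this example only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class Base64Filter
{
    // Hypothetical candidate pattern (an assumption for this sketch):
    // a run of Base64 alphabet characters followed by up to two '=' padding characters.
    private static readonly Regex Candidate = new Regex(@"[A-Za-z0-9+/]+={0,2}");

    // The validator from the comment above: true when the text decodes as Base64.
    public static bool IsBase64String(string base64)
    {
        Span<byte> buffer = new byte[base64.Length];
        return Convert.TryFromBase64String(base64, buffer, out _);
    }

    // Collects every candidate match and keeps only those the validator accepts.
    public static List<string> FindBase64(string text) =>
        Candidate.Matches(text)
                 .Cast<Match>()
                 .Select(m => m.Value)
                 .Where(IsBase64String)
                 .ToList();
}
```

For the input "Basic YWRtaW46cGEkJHdvcmQ=", the candidate "Basic" is rejected because its length is not a multiple of four, and only "YWRtaW46cGEkJHdvcmQ=" is kept. The validator only confirms that a string decodes, so an ordinary four-letter word such as "user" would also pass.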

Marusyk commented 1 year ago

Hello @nickproud,

To detect Base64 strings you can add a custom pattern and use it like:

var custom = new Dictionary<string, string>
{
    {"BASE64", "(?=(.{4})*$)[A-Za-z0-9+/]*={0,2}$"}
};

var grok = new Grok("Basic %{BASE64:credentials}", custom);
GrokResult grokResult = grok.Parse("Basic YWRtaW46cGEkJHdvcmQ=");

Console.WriteLine($"Does my text contain base64 string: {grokResult.Any()}");

foreach (GrokItem item in grokResult)
{
    Console.WriteLine($"{item.Key} : {item.Value}");
}

Output

Does my text contain base64 string: True
credentials : YWRtaW46cGEkJHdvcmQ=

Does it meet your needs?

Grok is designed to work with regular expressions. If you need to extend it, add a custom regex pattern. We're not going to add more methods to Grok.cs.
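Since the custom pattern is an ordinary regular expression, its behaviour can be checked with `System.Text.RegularExpressions` alone. The sketch below assumes the pattern behaves the same way when used directly as a .NET regex; the `Base64PatternCheck` class and its method names are hypothetical and are not part of Grok.Net.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class Base64PatternCheck
{
    // The BASE64 pattern from the comment above, used here as a plain .NET regex.
    // The lookahead requires the rest of the input to be a multiple of 4 characters long.
    public const string Base64Pattern = "(?=(.{4})*$)[A-Za-z0-9+/]*={0,2}$";

    // Returns the first substring accepted by the pattern.
    // The pattern also accepts the empty string at the end of the input,
    // so an empty result means no real candidate was found.
    public static string FirstCandidate(string text) =>
        Regex.Match(text, Base64Pattern).Value;

    // Decodes a Base64 string to UTF-8 text; throws FormatException on invalid input.
    public static string Decode(string base64) =>
        Encoding.UTF8.GetString(Convert.FromBase64String(base64));
}
```

With the input from the example, `FirstCandidate("Basic YWRtaW46cGEkJHdvcmQ=")` returns "YWRtaW46cGEkJHdvcmQ=", which `Decode` turns into "admin:pa$$word". The pattern checks only the alphabet and the length, so `FirstCandidate("hello")` returns "ello".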

nickproud commented 1 year ago

I was interested in adding it as a native pattern in Grok, so that you could just pass in 'BASE64' as the pattern name, but your solution is great. Thanks. 👍

Marusyk commented 1 year ago

Oh sure, go ahead. You can add it here: https://github.com/Marusyk/grok.net/blob/daa1c0b52b664aac8c9e4928a336737464a349ad/src/Grok.Net/grok-patterns#L1-L10