serilog / serilog-expressions

An embeddable mini-language for filtering, enriching, and formatting Serilog events, ideal for use with JSON or XML configuration.
Apache License 2.0
193 stars 17 forks source link

Expose ExpressionTemplate members #94

Closed netcorefan1 closed 1 year ago

netcorefan1 commented 1 year ago

Is your feature request related to a problem? Please describe. I'm trying to replace the regular template with an ExpressionTemplate in this library, but I'm stuck in the token extraction. All the required members seems to be inside CompiledTemplateBlock, but they are all marked as private and therefore not accessible. May be there are some other ways to achieve the goal, but I have not been able to find any.

Describe the solution you'd like Make the members public so that libraries can easily access them and be able to customize rendering in custom sinks.

Describe alternatives you've considered I tried with new MessageTemplateParser().Parse("expressionTemplate") (there is no ExpressionTemplateParser equivalent) and although tokens are parsed, they also contains expressions detected as Text. Trying to convert these into regular property tokens fails miserably because it is nearly impossible to handle dynamic data without the original CompileXxxType.

Additional context Add any other context or screenshots about the feature request here.

nblumhardt commented 1 year ago

Thanks for dropping us a line! It's not currently a goal of this library to open up this level of detail, since doing so makes it hard for the implementation to evolve/improve over time (without breaking consumers).

Depending on how far you would like to go, though, taking a dependency on Superpower and copying the tokenizer and parser from this codebase (not a huge amount of code) would allow you to parse the templates, and manipulate the resulting ASTs 🤔 - unsure how far that would get you towards your goal, but may be enough to push it forwards.

Hope this helps, Nick

netcorefan1 commented 1 year ago

Many thanks for your support. I'm trying to see if I can at least prepare a working point, but unfortunately I got stuck right from the beginning.

Depending on how far you would like to go, though, taking a dependency on Superpower and copying the tokenizer and parser from this codebase (not a huge amount of code) would allow you to parse the templates, and manipulate the resulting ASTs 🤔 - unsure how far that would get you towards your goal, but may be enough to push it forwards.

I tried by copying the tokenizer and parser from the codebase (TokenListParser2, Tokenizer1 etc), but such files have so many dependencies. I continued the copy operation to fix missing references until I realized I would have been better off to directly reference the nuget package. Is there any particular reason for which you suggested to copy part of the source code rather than directly reference the whole library?

From what I saw, TokenizerBuilder is unable to extract tokens on its own and the whole logic (used by SE) must be implemented from scratch. I'm afraid I can't just expect to make the extraction work by simply using some identifiers. Static data can be combined with dynamic data. For example a CompiledConditional can contain several combination of elements (literal text, formatted expression etc). To make this work I need to be able to distinguish those too. I tried starting from TemplateTokenParsers, but I got hundreds of errors which I could only fix with a full clone.

As of now, the only think I can think of is using a prefixed template. Although SuperPower can help to make things easier, the extraction code will be strictly bound to the prefixed template and this means that any change done to the template also needs changes to the source code. This is not what I was hoping for.

Before your comment I also played with reflection. Although I managed to get _elements as an array the big problem comes with casting.

It's not currently a goal of this library to open up this level of detail, since doing so makes it hard for the implementation to evolve/improve over time (without breaking consumers).

Can you explain exactly in what way making some members public can break consumers? I found this a several limitation and the main reason for which this, this and many other sinks does not support expressions. I have probably misunderstood your suggested solution and I would really appreciate if you could give me more details (may be with some sample code), it would be of great benefit for people that want to support SE in their sinks. If I will ever manage to get this working, something tells me that I will also have to face with breaking changes and they will be much worse. From my point of view, some members should be exposed, at least just the primitive types that allows the user to detect the different sections and allow in this way manipulation inside the Emit event.

nblumhardt commented 1 year ago

Hi!

Types like TokenListParser, Tokenizer, TokenizerBuilder, and the others in Serilog.Expressions.ParserConstruction are from https://github.com/datalust/superpower - so you should be able to take a dependency on that package, and you'll then only need a few types from this one (the Ast and Parsing namespaces from under the Expressions and Templates namespaces, I think).

The concern around making more types public is that once a significant portion of the library is public, its implementation can no longer be improved/evolved without the risk breaking other libraries that depend on those public types.

I'm not completely against adding new public surface area - it's just that it's something of a last resort :-) ... Another way to approach this might be to think through what the overall goal is, i.e., is the functionality you're looking at kind of theming? And if so, could you do what you need by interacting with the theming infrastructure already provided by the sink? Any other angles we could think about?

Cheers! Nick

netcorefan1 commented 1 year ago

Thanks Nick for your help. I really appreciate.

Types like TokenListParser, Tokenizer, TokenizerBuilder, and the others in Serilog.Expressions.ParserConstruction are from https://github.com/datalust/superpower - so you should be able to take a dependency on that package, and you'll then only need a few types from this one (the Ast and Parsing namespaces from under the Expressions and Templates namespaces, I think).

With a combination of SuperPower I've been able to run the following with a relative small number of copied files:

public MySinkCtor() {
    var tp = new TemplateParser();
    tp.TryParse("[{@t:HH:mm:ss} {@l:u3}{#if SourceContext is not null} ({SourceContext}){#end}] {@m}\n{@x}", out var parsed, out string? error);
}

class TemplateParser
{
    readonly TemplateTokenizer _tokenizer = new();
    //readonly TemplateTokenParsers _templateTokenParsers = new();

    public bool TryParse(string template, [MaybeNullWhen(false)] out Template parsed, [MaybeNullWhen(true)] out  string   error) {
        if (template == null) throw new ArgumentNullException(nameof(template));

        var tokenList = _tokenizer.TryTokenize(template);
        if (!tokenList.HasValue)
        {
            error = tokenList.ToString();
            parsed = null;
            return false;
        }

    // TokenListParserResult<ExpressionToken, Template> result = _templateTokenParsers.TryParse(tokenList.Value);
    // if (!result.HasValue) {
    // error  = result.ToString();
    // parsed = null;
    // return false;
    // }
    // parsed = result.Value;
    error  = null;
    parsed = null;
    return true;
    }
}

Unfortunately this only allows me to get the identifiers used by SE, but I'm far away from determining how the expression tokens will look. _tokenizer is probably not even needed. It's _templateTokenParsers the section that I need. I tried to go further, by uncommenting the rest of code and although at the beginning seemed easy I quickly ended up with a semi clone of the repo with full of compilation errors. There is probably lot of unneeded stuff and types, but trying to clean is extremely difficult given the numerous dependencies each file has. I will try again to focus on the Ast and Parsing namespaces from under the Expressions and Templates namespaces, but If I remember well this is how I have done. These namespace seems to be not enough for _templateTokenParsers, but may be I have not been enough aggressive on cleaning. I'm afraid that the more code I extract and put there the more are chances of breaks when the API changes.

The concern around making more types public is that once a significant portion of the library is public, its implementation can no longer be improved/evolved without the risk breaking other libraries that depend on those public types.

Yes, a significant portion could complicate things dramatically. Do you think that exposing a simple wrapper that return primitive types could be a good compromise? The sample template above returns eight CompiledTemplate elements. The Theme section could be omitted (at least in my case I believe is unneeded). I didn't had chance to play with this (and I could be perfectly wrong), but basically, in order to customize output correctly in the emit event, the resulting text and the expression type it belongs too could be enough. May be (and I could be wrong again), a linq expression that extract such data could work.

... Another way to approach this might be to think through what the overall goal is, i.e., is the functionality you're looking at kind of theming? And if so, could you do what you need by interacting with the theming infrastructure already provided by the sink? Any other angles we could think about?

Yes! You got that right! I want to let do the rendering to Spectre.Console. If you look at the link I posted at the beginning, each object is converted to IRenderable interface, combined together and then passed to AnsiConsole.Write(RenderableCollection). Each object of the library implements this interface and this allows to interact with objects. For example I can insert the full row of the emitted log (or part of it) inside Tables, panels, grids etc. To convert an object to IRenderable I think that I first need the expression type and then its related output sections. Not sure on how much this could be feasible, but to me seemed easy and this has been the main reason that encouraged me to start until I had to discover that such data is inaccessible.

UPDATE

Your suggestion worked! The trick was to include the entire Expressions and Templates trees in order to get the project compiled and then remove files one by one. I was doing the inverse and this was the wrong approach. Without any change to the source code, SuperPower assembly reference and the already exposed members from SE package, I have been able to successfully create the parser with 34 source files. Mostly comes from the Expressions.Ast namespace. Not sure, but if I modify the source code, may be I can remove unneeded stuff and make the parser simpler, but let me see what you think about this. My main concern is API changes. How much you would consider safe going through this route? At least now I have something to work on, but let me also wait for your news since it seems you have some idea in mind for theming.

nblumhardt commented 1 year ago

Awesome! Glad to hear you made good progress! :-)

I guess on the path that you're currently on, once you've broken up a template like:

{@t}: {@m}

into @t (expression), : (text), and @m (expression), you should be able to use Serilog.Expressions exported "compile"/"eval" functions to get at the resulting values based on any given log event. You might actually be able to do this using just the tokenizer, if you're careful to look out for nested templates 🤔

For the theme based approach, I think I'm off in the wrong direction/at a dead end. I think you'll get better results and end up with more maintainable code if you split the template into the pieces you're interested in, and then use the existing Serilog.Expressions functionality to actually execute pieces of the template as needed.

I'm personally open to considering API additions to make this scenario easier, but because the range of situations we're currently considering is pretty narrow (Spectre integration specifically), and it's not 100% clear yet how that scenario will come out, I think the best approach would be to get something working (with duplicated code if necessary), and see how it all hangs together in real-world use. Once that's had some time to bake out in the wild, we'd have more information to base a discussion about public APIs on.

Hope this makes sense!

If you're working in a public repo/fork anywhere, please do let me know, I'd be interested to follow along if so :-) - Cheers!

netcorefan1 commented 1 year ago

Awesome! Glad to hear you made good progress! :-)

Many thanks @nblumhardt ! Without your support I would have never been able to start.

I guess on the path that you're currently on, once you've broken up a template like:

{@t}: {@m}

into @t (expression), : (text), and @m (expression), you should be able to use Serilog.Expressions exported "compile"/"eval" functions to get at the resulting values based on any given log event. You might actually be able to do this using just the tokenizer, if you're careful to look out for nested templates 🤔

This is a great suggestion, but for now I think it's better to work on the same SE source code without any change in order to make things simpler. Than, once I come up with something working we can think to such optimizations.

In the meantime there are a couple of things that are slowing down my progress and may be you can give me the right tips.

  1. I'm new in the Spectre world, but from what I learned, styles can target specific fraction of text segments by using a markup language or the whole text segment by using a Style class. The first one requires more attention. The API identifies section to style by looking for the presence of an opening and closing tag like for example [red]text to style[/]. It also support nested styles like [red]text [green bold]to[/] style[/] which will result in text and style printed in red and to in green bold. To detect markups I use their MarkupTokenizer.cs and this allows me to correctly style what ever section of text I want as long as it supports IFormatProvider by simply auto escaping markup tokens through '. For example, the following {@t:[red]HH[/]:mm:ss} prints the hour section of the timestamp in red. Now, by looking at SE features I see that arrays, functions and other stuff also uses [] and I'm wondering if this can cause parsing failures. Do you have some recommendations? Is there some particular escaping mechanism in SE that I should be aware of?

  2. For the theme based approach, I think I'm off in the wrong direction/at a dead end. I think you'll get better results and end up with more maintainable code if you split the template into the pieces you're interested in, and then use the existing Serilog.Expressions functionality to actually execute pieces of the template as needed.

    I'm a bit concerned about performance. Each time a section of text is styled a new Markup instance is created. Since SE already supports basic styling, I'm wondering if could be a good idea to split rendering and let SE to do the basic styling and to spectre all the advanced stuff. Although I believe that it will not make big difference (after all, each time data changes, a string needs to be recreated in both the cases) I would like to know what you are your thougths.

I'm personally open to considering API additions to make this scenario easier, but because the range of situations we're currently considering is pretty narrow (Spectre integration specifically), and it's not 100% clear yet how that scenario will come out, I think the best approach would be to get something working (with duplicated code if necessary), and see how it all hangs together in real-world use. Once that's had some time to bake out in the wild, we'd have more information to base a discussion about public APIs on. If you're working in a public repo/fork anywhere, please do let me know, I'd be interested to follow along if so :-) - Cheers!

As of now I don't have any idea on how I will end up. It's still soon, but I will send you a sample project as soon as I come up with something usable. As of now, 80% of the code is just fragments of useless testing code here and there, comments etc, so I wanted to save others from seeing my garbage online. Let see how and if will evolve.

Edit

Development is going to become pretty complicated, but I'm doing progresses anyway. Right now I'm working on User-defined functions (and I suppose the same code should work for builtin functions too) and I would like to ask you one thing on the matter. In order to get data, I use the following code:

var expr = SerilogExpression.Compile(fe.Expression.ToString(), nameResolver: NameResolver);
var data  = expr(logEvent);

The ToString() method in abstract class Expression says: // Used only as an enabler for testing and debugging. Should I use a different method?

netcorefan1 commented 1 year ago

Sorry for stressing you again. I made progresses and I'm not very far to completion, but now I'm stuck with an issue with the Repetition template that is driving me crazy.

var values = new[] { "val1", "val2", };
var logger = Logger = new LoggerConfiguration()
                      .Enrich.WithProperty("MyProp", values, true)
                      .WriteTo.Console(new ExpressionTemplate("{#each s in MyProp}=> {s}{#delimit} {#end}")
                      .CreateLogger();

The template has a Body property which I suppose is always a TemplateBlock containing the elements. In the above case it contains a LiteralText and an AmbientNameExpression where the property name is s. I'm unable to get the value of s because SerilogExpression.Compile("s") always returns null. As of now, the only solution I found is to manually process the Enumerable CompiledExpression and combine all the elements of the Body template block. It seems to be more of an hack which also causes so many new allocations. Is s supposed to return a value or is just a temporary dummy property?

nblumhardt commented 1 year ago

Hi! That AmbientNameExpression should be resolved into a ParameterExpression before it's compiled/run; I'm a bit hazy on where that happens but short on time to dig in - just posting this in case it's enough of a clue :-) Cheers!

netcorefan1 commented 1 year ago

Many thanks, friend! This tip was the only thing I needed. Just today I hard coded the SequenceValue and DictionaryValue and although immediately popped up bugs with just a few changes to the expression template, what got me really stuck was the StructuredValue part. This one is much more involved. I hope that this tip can get me one step forward (especially on the StructuredValue side). Unlike the other templates, the repeater is the hardest one where I lost most of the time. If I can complete it, I believe that all of the rest should be fast and straightforward. Hope to send you a sample project very soon. Will try to disturb you the less as I can, unless is something serious that I can't manage to handle myself. Thanks again

netcorefan1 commented 1 year ago

Mhhh.... I'm probably doing it in the wrong way:

class AmbientNameExpressionRenderer {
    CompiledExpression  ce1;
    CompiledExpression  ce2;
    internal AmbientNameExpressionRenderer(AmbientNameExpression ane, string? format)
    {
        ce1 = SerilogExpression.Compile(new ParameterExpression("s").ToString());
        ce2 = SerilogExpression.Compile("s");
    }

   public IEnumerable<IRenderable> Render(LogEvent logEvent) {
       var val = ce1(logEvent); 
       val  = ce2(logEvent); 
    }
}

Syntax error (line 1, column 1): unexpected $ when trying to compile ParameterExpression (ce1). It only compiles if I pass the straight property name (ce2), but then, when I invoke, I get null. AmbientNameExpressionRenderer ctor runs inside my sink ctor while Render is executed when I call Log.information(....).

netcorefan1 commented 1 year ago

Nope. Unless you give me more details on how I should use the ParameterExpression I'm afraid I will not be able to handle this myself. In a working debug session, it is not even instantiated. This seems to be used in lambda expressions and after struggling my head for days for not being able to make it work, I'm now wondering if you misunderstood my sample code which is the same of the home page (=> is not a lambda expression, it's just literal text).

The sample above prints => val1 => val2. I need to access that data in a structural way. The raw api doesn't recognize s, the whole expression or even part of it. I also tried playing with the Compiled expressions and although I can see how each expression is evaluated, I am unable to access the data. There's no method that allows to returns the structured data (may be this could be the occasion for an enhancement request?).

I'm afraid that if accessing the data in the other templates is so problematic, I will have to decide if abort the project or get it done with limited support.