dotnet / spark

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
https://dot.net/spark
MIT License
2.02k stars 312 forks source link

[FEATURE REQUEST]: Allow injection of custom LoggerService #473

Open thrixton opened 4 years ago

thrixton commented 4 years ago

It's not possible to redirect all logging to the applications own logging pipeline currently as both LoggerServiceFactory and ILoggerService are internal.

I would like to inject my own logging service (either seq or Application Insights depending on environment) and have all logs consolidated.

It seems log4net is suppoerted via config file however I have other correlation information that I need to add which I cannot see how to accomplish.

Is this possible or is there another way to achieve this?

imback82 commented 4 years ago

We could expose LoggerServiceFactory and ILoggerService as public, but let me double-confirm. Meanwhile, you can use https://github.com/aelij/IgnoresAccessChecksToGenerator to access internal classes if you are blocked.

thrixton commented 4 years ago

@imback82 I exposed this via https://github.com/aelij/IgnoresAccessChecksToGenerator as per your suggestion, and then called the static LoggerServiceFactory.SetLoggerService method, passing my custom logger, but no logs are generated. The problem appears to be that some (maybe all) logger instances are created before my application is initialized and I set the logger. Do you have any suggestions as to the best way to override logging?

imback82 commented 4 years ago

Do you set it before executing any Spark related code? The following works for me.

        public void Run(string[] args)
        {
            LoggerServiceFactory.SetLoggerService(MyLogger.s_instance);

            SparkSession spark = SparkSession
                .Builder()
                .AppName(".NET Spark SQL basic example")
                .Config("spark.some.config.option", "some-value")
                .GetOrCreate();

            spark.Stop();
        }
imback82 commented 4 years ago

By the way, this applies only to the driver side log, not the worker. So, #542 may not be a good long term solution. Integrating with log4net or something similar will be good so that you can get a consistent experience both on the driver and worker side.

thrixton commented 4 years ago

@imback82 , yes, this was a user error sorry. Race condition between creating the session vs setting the logger, oops :) Thanks

Re #542, you're right, so i'll have a look at that option. Would it be possible to replace log4net with a more modern and maintained logging framework (serilog maybe)? Log4net appears to be "dormant". I currently use Serilog pushing either to Seq or application insights which works well.

imback82 commented 4 years ago

Would it be possible to replace log4net with a more modern and maintained logging framework (serilog maybe)?

serilog should be good. Do you want to work on this? If so, I can break down work items for you. Thanks!

thrixton commented 4 years ago

Happy to implement this. If you break down the work items i'll work through them.

imback82 commented 4 years ago

Cool. Here is what I am thinking:

1) Implement ILoggerService with serilog. 2) Update SparkEnvironment to set a logger based on an environment variable. Note that the work side should also pick up this environment variable (check CreateEnvVarsForPythonFunction) and create the same logger. 3) Document how to utilize serilog under docs. 4) Possibly make this as a default logger instead of ConsoleLoggerService.

You can think of each bullet as a separate PR. Thanks in advance!

suhsteve commented 3 years ago

Is there any progress on this work ?

I think a good use case: Microsoft.Spark library + notebooks + dotnet repl. It'd be good to redirect logger messages elsewhere in order to not pollute the notebook cell outputs.

ie image

thrixton commented 3 years ago

No progress as yet sorry, I've had to redirect my efforts in the (not so) short term.
I'll see if I can find some time over the next couple of weeks.