criteo / babar

Profiler for large-scale distributed java applications (Spark, Scalding, MapReduce, Hive,...) on YARN.
Apache License 2.0

Get babar log without using Yarn #23

Closed gargomeiste closed 5 years ago

gargomeiste commented 5 years ago

Hey Criteo team, I was wondering whether it is possible to get babar.log without using YARN. I already tried creating a RollingFileAppender in a log4j file to store the output from the com.criteo.babar package:

log4j.logger.com.criteo.babar=myConsoleAppender, Profiler
log4j.additivity.com.criteo.babar=false

This solution didn't work, so I might be missing something.

Thank you for the help and the cool library you developed.

BenoitHanotte commented 5 years ago

Hello @antoinegargot! You should be able to use Babar without YARN. We use YARN for 2 things that you can do another way:

  1. Each container logs to a file in its local YARN log directory.
  2. When aggregating the results with babar-processor, we parse the log file that YARN has aggregated from all the containers' YARN log files.

For 1., if you are not running on YARN, your babar.log file should be written locally on each node in ./logs. You should also be able to specify the log directory in the agent parameters (but I haven't tested it) with:

-javaagent:/path/to/babar-agent.jar=dir=/my/logs/path,...

(see https://github.com/criteo/babar/blob/master/babar-agent/src/main/java/com/criteo/babar/agent/reporter/LogReporter.java#L67)

For 2., if you have multiple log files, you'll need to concatenate them into one large text file before aggregating them with babar-processor. Their order does not matter. If you have only one file, you can process it directly.
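As a rough sketch of that concatenation step (this is not part of Babar), something like the following could merge the per-node files. The class name `ConcatBabarLogs` and the file paths in the usage line are made up for illustration, and the code assumes each input file already ends with a newline so that lines from different files do not run together.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Babar: merges several babar.log files into one.
public class ConcatBabarLogs {

    /** Copies every input file into `merged`, in the order given (the order does not matter). */
    static void concat(List<Path> inputs, Path merged) throws IOException {
        try (OutputStream out = Files.newOutputStream(merged,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE)) {
            for (Path input : inputs) {
                // Streams the file's bytes without loading it fully into memory.
                Files.copy(input, out);
            }
        }
    }

    // Usage: java ConcatBabarLogs merged.log node1/babar.log node2/babar.log ...
    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: ConcatBabarLogs <merged> <input>...");
            return;
        }
        List<Path> inputs = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            inputs.add(Paths.get(args[i]));
        }
        concat(inputs, Paths.get(args[0]));
    }
}
```

The merged file can then be given to babar-processor in the same way as a single log file.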

For log4j, we couldn't use it because it doesn't seem possible to log inside a JVM shutdown hook and make sure that the log4j shutdown hook that flushes to file will deterministically be called after Babar's hook that logs. We need to log at that point so that the last stack traces of the JVM on shutdown are correctly captured and reported.
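To illustrate the idea (this is only a sketch, not Babar's actual code), a shutdown hook can write and flush its own file so that it does not depend on any other hook running later. The JVM starts registered shutdown hooks in an unspecified order, which is why relying on a separate flushing hook is fragile. The class name `ShutdownHookLogSketch`, the `writeLine` helper, and the `logs/example.log` path are all hypothetical.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch, not Babar's implementation.
public class ShutdownHookLogSketch {

    /** Appends one line to `file`; closing the writer flushes it before this method returns. */
    static void writeLine(Path file, String line) throws IOException {
        try (Writer writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            writer.write(line);
            writer.write(System.lineSeparator());
        }
    }

    public static void main(String[] args) {
        Path log = Paths.get("logs", "example.log"); // hypothetical location

        // The hook does its own I/O and flush, so the data is on its way to disk
        // regardless of when (or whether) a logging framework's hook runs.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                Files.createDirectories(log.getParent());
                writeLine(log, "last sample taken during JVM shutdown");
            } catch (IOException e) {
                e.printStackTrace();
            }
        }));

        System.out.println("working, then exiting normally");
    }
}
```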

Let me know if you have any issues running it this way :)

Best, Benoit

gargomeiste commented 5 years ago

Hey @BenoitHanotte,

Thanks for the useful tips, I didn't find this getLogDir when I went through the code. I will try this solution and will let you know how it goes.

Thank you again for the help and the great work you're doing.

Best, Antoine