snuspl / cruise

Cruise: A Distributed Machine Learning Framework with Automatic System Configuration
Apache License 2.0
26 stars 2 forks source link

[CAY-1262] Prepend job Id to logs from job master in JobServer #1265

Closed wynot12 closed 6 years ago

wynot12 commented 6 years ago

Resolves #1262

This PR introduces JobLogger, which appends job id to log message to logs, and changes all master-side classes to use this JobLogger.

yunseong commented 6 years ago

Thanks a lot for the PR! I have a few high-level questions: 1) In which files are the jobs' logs stored? 2) Is the JobLogger thread-safe, which I believe Logger frameworks internally support? 3) Could you provide some examples of the log messages?

wynot12 commented 6 years ago
  1. The folder of driver's log, since master runs on driver. image

  2. Yes, definitely.

  3. driver/NMF-1.log image

wynot12 commented 6 years ago

Note that job logs are also written to driver.stderr

wynot12 commented 6 years ago

@yunseong Actually I think you don't have to understand two imported methods, but I'll describe it to you with slack call if you want.

Please let me know if you want :)

yunseong commented 6 years ago

Thanks for the answer. As you said, I don't think we need to understand the implementation, but the comments at least look very awkward and I think we can simply remove them not to confuse readers like me ;) I'll remove the comments and merge this PR. Thanks.

wynot12 commented 6 years ago

!rebuild

wynot12 commented 6 years ago

It looks that tests generate a log file to an improper place, which requires copyright in all files. I'll fix it.

wynot12 commented 6 years ago

The problem was by unit-test (WorkerStateManagerTest) which makes a log (dolphin.log) under dolphin/async/.

I fixed the problem by enabling log separation only when running an actual job with jobserver mode.

@yunseong Please take a look!

wynot12 commented 6 years ago

After experiments in YARN-environment, we've found that log files are not generated as our expectation. Maybe we're missing something in configuration, but we decide not to struggle with it and just use a script to separate all-in-one log file using with the appended job Id.

wynot12 commented 6 years ago

@yunseong I've updated the PR including the title and the description. Please take a look :)

yunseong commented 6 years ago

LGTM. Thanks for the quick fix! :)