ropensci / drake

An R-focused pipeline toolkit for reproducibility and high-performance computing
https://docs.ropensci.org/drake
GNU General Public License v3.0
1.34k stars 128 forks source link

Lines from log_make mixed up #1331

Closed matthiasgomolka closed 3 years ago

matthiasgomolka commented 3 years ago

Prework

Description

First, sorry for not providing a reprex. But this issue occurs at random (to my understanding) so I can't create a reprex. Here'e the issue:

After I finally understood that log_make in drake_config() needs an existing file, I really like this feature. The logs are great as I can easily find warnings and errors even if the plan runs in parallel. Now I want to write a small function which reads the log file and extracts the lines containing warnings and errors. There are two issues which make this not super straight forward:

  1. If error or warning messages contain \n, these messages are split across severeal lines in the log. This is not a big problem since I can append all lines with an incorrect number of | to the previous line. The risk of missing some message is low.

    However, it would be great if line endings of the log file would be different from those in the logged warning and error messages. Not sure if this is possible, though.

  2. This is the real issue: Sometimes - to my understanding at random - some lines in the log file get mixed up like this:

    W999V33T | 1360 | 2020-10-08 16:33:26.256068 +0200 GMT | validation_results_6940da9f | build on an hpc worker
    
    99V33T | 1360 | 2020-10-08 16:33:26.256068 +0200 GMT | validation_results_6940da9f | unload dt_cleanW999V33T | 1360 | 2020-10-08 16:33:26.274055 +0200 GMT |  | load dt_clean_9e112ab4
    W999V33T | 1360 | 2020-10-08 16:33:26.274055 +0200 GMT |  | load cf_9e5f06d1

    There is always a blank line, and the next line contains two lines but the first few characters are missing.

    Since the log stems from an r_make() run in parallel, I assume this somehow occurs when different workers try to write to the log simultaneously. But that's just a guess.

Reproducible example

As this issue occurs at random, I could not create a reprex. Here's at least my drake_config():

drake_config(
  plan,
  parallelism = "future",
  jobs = 8L,
  keep_going = TRUE,
  memory_strategy = "autoclean",
  caching = "worker",
  log_build_times = FALSE,
  log_make = "path/to/log"
)

Expected result

The lines in the log should not be mixed up.

wlandau commented 3 years ago

Both should be fixed now in https://github.com/ropensci/drake/commit/6faed005635c682ee6544f7e16ac09987d435bd3.