NeoGeographyToolkit / StereoPipeline

The NASA Ames Stereo Pipeline is a suite of automated geodesy & stereogrammetry tools designed for processing planetary imagery captured from orbiting and landed robotic explorers on other planets.
Apache License 2.0

parallel_stereo (and stereo_*) functions write to munged log files #332

Closed ladoramkershner closed 3 years ago

ladoramkershner commented 3 years ago

Describe the bug

Successive logging attempts from stereo_pprc, stereo_corr, stereo_fltr, and stereo_tri write to uniquely munged file names. Any logging attempts from stereo_rfne write to uniquely munged file names.

To Reproduce

Steps to reproduce the behavior:

  1. Run parallel_stereo on a cluster, or pipe its stdout into a file.
  2. Open the output file with less and search for 'Writing log info'.

Expected behavior

I would expect all relevant logging information to be written to the same file name.

Error Logs, Terminal Captures, Screenshots

I am running these jobs on a cluster, and when I search the output log for ‘Writing log info’, I see the following patterns.

stereo_pprc log writing behavior: it only writes to the log once:

  results_ba/B18_016575_1978_XN_17N282W__B17_016219_1978_XN_17N282W_ba-log-stereo_pprc-04-03-1227-6313.txt

This looks normal!

stereo_corr log writing behavior: the first log write looks normal:

  results_ba/B18_016575_1978_XN_17N282W__B17_016219_1978_XN_17N282W_ba-log-stereo_corr-04-03-1228-7259.txt

Then it starts writing out file names like these:

  results_ba/B18_016575_1978_XN_17N282W__B17_016219_1978_XN_17N282W_ba-2048_0_2048_2048/2048_0_2048_2048-log-stereo_corr-04-03-1228-9992.txt
  results_ba/B18_016575_1978_XN_17N282W__B17_016219_1978_XN_17N282W_ba-0_0_2048_2048/0_0_2048_2048-log-stereo_corr-04-03-1228-9991.txt
  results_ba/B18_016575_1978_XN_17N282W__B17_016219_1978_XN_17N282W_ba-4096_2048_575_2048/4096_2048_575_2048-log-stereo_corr-04-03-1228-10046.txt

stereo_rfne log writing behavior: it starts out writing to a munged log file name:

  results_ba/B18_016575_1978_XN_17N282W__B17_016219_1978_XN_17N282W_ba-0_0_2048_2048/0_0_2048_2048-log-stereo_rfne-04-03-1229-12041.txt

Then it continues munging successive log writes:

  results_ba/B18_016575_1978_XN_17N282W__B17_016219_1978_XN_17N282W_ba-4096_0_575_2048/4096_0_575_2048-log-stereo_rfne-04-03-1229-11969.txt
  results_ba/B18_016575_1978_XN_17N282W__B17_016219_1978_XN_17N282W_ba-2048_0_2048_2048/2048_0_2048_2048-log-stereo_rfne-04-03-1229-11973.txt
  results_ba/B18_016575_1978_XN_17N282W__B17_016219_1978_XN_17N282W_ba-0_2048_2048_2048/0_2048_2048_2048-log-stereo_rfne-04-03-1229-11970.txt
  results_ba/B18_016575_1978_XN_17N282W__B17_016219_1978_XN_17N282W_ba-2048_2048_2048_2048/2048_2048_2048_2048-log-stereo_rfne-04-03-1230-12557.txt
  results_ba/B18_016575_1978_XN_17N282W__B17_016219_1978_XN_17N282W_ba-4096_10240_575_830/4096_10240_575_830-log-stereo_corr-04-03-1228-10293.txt
  results_ba/B18_016575_1978_XN_17N282W__B17_016219_1978_XN_17N282W_ba-2048_8192_2048_2048/2048_8192_2048_2048-log-stereo_corr-04-03-1228-10307.txt

stereo_fltr log writing behavior: it only writes to the log once:

  results_ba/B18_016575_1978_XN_17N282W__B17_016219_1978_XN_17N282W_ba-log-stereo_fltr-04-03-1241-16252.txt

Looks normal!

stereo_tri log writing behavior: the first log write is normal:

  results_ba/B18_016575_1978_XN_17N282W__B17_016219_1978_XN_17N282W_ba-log-stereo_tri-04-03-1243-16956.txt

Then:

  results_ba/B18_016575_1978_XN_17N282W__B17_016219_1978_XN_17N282W_ba-0_0_2048_2048/0_0_2048_2048-log-stereo_tri-04-03-1243-19649.txt
  results_ba/B18_016575_1978_XN_17N282W__B17_016219_1978_XN_17N282W_ba-2048_0_2048_2048/2048_0_2048_2048-log-stereo_tri-04-03-1243-19650.txt
  results_ba/B18_016575_1978_XN_17N282W__B17_016219_1978_XN_17N282W_ba-4096_2048_575_2048/4096_2048_575_2048-log-stereo_tri-04-03-1243-19893.txt
  results_ba/B18_016575_1978_XN_17N282W__B17_016219_1978_XN_17N282W_ba-2048_6144_2048_2048/2048_6144_2048_2048-log-stereo_tri-04-03-1243-20000.txt

In the munged file names there is an extra string, along the lines of "-2048_6144_2048_2048/2048_6144_2048_2048", between the output prefix and the "log-[func]-[datetime]" part. Many of these added strings seem to be made up of powers of two, but not all. Each successive log write goes to its own unique log file name.


oleg-alexandrov commented 3 years ago

The logic here is that each log file has the process id, tile bounds, and process name. It is a little ugly but does it cause any issues?
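
For anyone who wants to sort through these per-tile logs, here is a minimal Python sketch that splits a log path into the parts described above. The pattern is inferred only from the paths quoted in this issue, not from the Stereo Pipeline source, and the names `LOG_NAME`, `LogName`, and `parse_log_name` are hypothetical. The sketch treats the tile string as four plain integers and the middle field as an opaque timestamp, since the thread does not spell out their exact meaning.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Pattern inferred from the log paths quoted in this issue (an assumption,
# not taken from the Stereo Pipeline source):
#   <prefix>[-<tile>/<tile>]-log-<tool>-<timestamp>-<pid>.txt
LOG_NAME = re.compile(
    r"^(?P<prefix>.+?)"
    r"(?:-(?P<tile>\d+_\d+_\d+_\d+)/(?P=tile))?"   # present only for per-tile logs
    r"-log-(?P<tool>[A-Za-z_]+)"
    r"-(?P<stamp>\d{2}-\d{2}-\d{4})"
    r"-(?P<pid>\d+)\.txt$"
)


class LogName(NamedTuple):
    prefix: str
    tile: Optional[Tuple[int, ...]]  # four integers from the tile string, or None
    tool: str
    stamp: str                       # kept as text; its format is not interpreted here
    pid: int


def parse_log_name(path: str) -> Optional[LogName]:
    """Split a log path into its parts, or return None if it does not match."""
    m = LOG_NAME.match(path)
    if m is None:
        return None
    tile = m.group("tile")
    bounds = tuple(int(n) for n in tile.split("_")) if tile else None
    return LogName(m.group("prefix"), bounds, m.group("tool"),
                   m.group("stamp"), int(m.group("pid")))


if __name__ == "__main__":
    # "out/run" is a made-up output prefix used only for illustration.
    print(parse_log_name(
        "out/run-0_0_2048_2048/0_0_2048_2048-log-stereo_tri-04-03-1243-19649.txt"))
    print(parse_log_name("out/run-log-stereo_fltr-04-03-1241-16252.txt"))
```

Grouping the parsed results by `tool` and `tile` is one way to find the log for a specific tile when a step fails.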

oleg-alexandrov commented 3 years ago

The stereo_fltr and stereo_pprc tools write only one log file because those steps are not parallelized.

ladoramkershner commented 3 years ago

@oleg-alexandrov you are totally correct and now I feel awful using the word 'munged'. This is entirely an operator error. Thank you for your quick response!

oleg-alexandrov commented 3 years ago

Well, munged it is. :) Those log files help though when something goes wrong in processing.

ladoramkershner commented 3 years ago

More "strategically specific" naming 😆