Open bradbell opened 9 months ago
I re-ran the call to csv.predict for the case above and the error did not reproduce. Specifically, I got:
Predict: n_predict = 7, n_spawn = 3
Begin: 04:34:28: predict 1_Earth.both
Begin: 04:34:28: predict 103_Latin_Am_Caribbean.female
Begin: 04:34:28: predict 103_Latin_Am_Caribbean.male
Begin: 04:34:28: predict 124_Central_Latin_Am.female
End: 05:33:06: predict 103_Latin_Am_Caribbean.female 1/7
Begin: 05:33:06: predict 124_Central_Latin_Am.male
End: 05:33:07: predict 103_Latin_Am_Caribbean.male 2/7
Begin: 05:33:07: predict 130_Mexico.female
End: 05:33:22: predict 1_Earth.both 3/7
Begin: 05:33:22: predict 130_Mexico.male
End: 05:33:28: predict 124_Central_Latin_Am.female 4/7
End: 06:19:57: predict 124_Central_Latin_Am.male 5/7
End: 06:21:07: predict 130_Mexico.female 6/7
End: 06:21:11: predict 130_Mexico.male 7/7
remove: diabetes_fpg_cv20_5cv_asymp_N_pre_1_Earth.both shared memory
The following error reporting and recovery were added to at_cascade to reduce the effect of this problem and to help figure out when and why it happens:
https://github.com/bradbell/at_cascade/commit/8a42a42aebd47e0b3592f365e2f5aa44acd8cc7e
This changes the assert above to
print( f'csv.predict: Cannot find {file_name}' )
If you see this message, please report it below. In this case, the predictions for the corresponding job will be missing.
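For readers who want the general idea of replacing a hard assert with a report-and-continue check, here is a minimal sketch. It is not the at_cascade code from the commit; the helper name existing_files and its argument are hypothetical, and only the message text is taken from the change quoted above.

```python
import os

def existing_files(file_names):
    """Hypothetical helper: return the files that exist and report the rest.

    A missing file is reported and skipped instead of stopping the run,
    so the predictions for that job are simply absent from the result.
    """
    found = []
    for file_name in file_names:
        if os.path.isfile(file_name):
            found.append(file_name)
        else:
            # Same message text as the print statement quoted above.
            print(f'csv.predict: Cannot find {file_name}')
    return found
```

With a check like this, the caller would combine only the files in the returned list, which is why a job that fails to write its output leaves a gap in the predictions rather than a crash.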
at_cascade-2024.1.30: It appears that sometimes one of the prediction jobs does not create its output files, and pre_user_csv crashes when it tries to use those files. I have seen this twice and decided to report it the second time.
In the case below, there is a begin for 124_Central_Latin_Am.female but there is no end notice for this job.
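If it helps when reporting, a begin without a matching end can be spotted mechanically. The following is a rough sketch that assumes the log lines have the "Begin: HH:MM:SS: predict job" and "End: HH:MM:SS: predict job k/n" form shown in this issue; the function name unfinished_jobs is hypothetical and is not part of at_cascade.

```python
import re

# Matches lines such as
#   Begin: 04:34:28: predict 1_Earth.both
#   End: 05:33:06: predict 103_Latin_Am_Caribbean.female 1/7
LINE_RE = re.compile(r'^(Begin|End): \d\d:\d\d:\d\d: predict (\S+)')

def unfinished_jobs(log_text):
    """Return job names that have a Begin line but no End line, in log order."""
    begun = []
    ended = set()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            # Skip lines that are not Begin or End notices.
            continue
        kind, job = match.groups()
        if kind == 'Begin':
            if job not in begun:
                begun.append(job)
        else:
            ended.add(job)
    return [job for job in begun if job not in ended]
```

Passing the text of a captured log to unfinished_jobs returns the jobs that started but never reported an end; an empty list means every job that began also finished.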