Open rjuenemann opened 11 months ago
Good hypothesis! Indeed, fw_queue creates the task ScriptTask_compression_raw_data depending only on the completion of InitRawData and later adds a link for it to also depend on the InitValidationData task. This task runs bzip2. In a local manual run, bzip2 replaced the 15MB rawData.cPickle with a 3.5MB rawData.cPickle.bz2.
(lpad get_fws has an option --display_format {all,more,less,ids,count,reservations}. Picking more or all would probably show the task dependency links, which would verify this expectation.)
So (going by the variables in fw_queue rather than the task names) fw_parca_analysis should be another "parent" (dependency, prerequisite) of the fw_raw_data_compression task.
The code is almost there.
^^^ This makes fw_parca_analysis a parent of fw_sim_data_1_compression and fw_validation_data_compression, that is, don't run those two compression tasks until fw_parca_analysis completes.
Just add fw_raw_data_compression as another arg to add_links().
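To make the intended dependency graph concrete, here is a minimal standalone sketch. It does not use FireWorks or the real fw_queue code: the links table, the add_links(parent, *children) signature, and the parents_of helper are assumptions made up for illustration, and plain strings stand in for the Firework objects.

```python
from collections import defaultdict

# Hypothetical stand-in for the workflow's link table: parent -> list of children.
links = defaultdict(list)


def add_links(parent, *children):
    """Record that each child must wait for parent (assumed signature)."""
    for child in children:
        if child not in links[parent]:
            links[parent].append(child)


def parents_of(task):
    """Return the sorted names of all tasks that must finish before task runs."""
    return sorted(p for p, kids in links.items() if task in kids)


# Existing links described above: raw-data compression waits only for the
# two init tasks, so it can run before the Parca analysis reads rawData.cPickle.
add_links("fw_init_raw_data", "fw_raw_data_compression")
add_links("fw_init_validation_data", "fw_raw_data_compression")

# Proposed fix: pass fw_raw_data_compression as one more child of
# fw_parca_analysis, next to the two compression tasks already linked.
add_links(
    "fw_parca_analysis",
    "fw_sim_data_1_compression",
    "fw_validation_data_compression",
    "fw_raw_data_compression",
)

print(parents_of("fw_raw_data_compression"))
```

With the extra argument, fw_parca_analysis shows up among the parents of fw_raw_data_compression, so the compression cannot start until the analysis has completed.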
If this symptom is currently reproducible (the compression task could happen to run late enough sometimes to avoid the symptom), it's a good time to test the fix.
This raises other questions:
wcm.py would need the same additional dependency, but on checking the code, it doesn't add any compression tasks.
I ran the following on Sherlock on the ng-trl-eff-shift-variant-only branch in preparation for PR #1415. I noticed the AnalysisParcaTask fizzled:
with the error
@ggsun and I suspect the issue is that the Parca output was compressed before the AnalysisParcaTask could run, so the task could not find the file it needed. Indeed, rawData.cPickle.bz2 is found in out/kb, but rawData.cPickle is not.
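The suspected failure mode can be reproduced in isolation. The sketch below is not the project's code: compress_in_place is a hypothetical helper that mimics the default behavior of the bzip2 command line tool (write a .bz2 file, then delete the original), and the file contents are placeholder bytes in a temporary directory.

```python
import bz2
import os
import tempfile


def compress_in_place(path):
    """Mimic the bzip2 CLI default: write path + '.bz2', then delete the original."""
    with open(path, "rb") as src, bz2.open(path + ".bz2", "wb") as dst:
        dst.write(src.read())
    os.remove(path)


# Placeholder file standing in for out/kb/rawData.cPickle.
workdir = tempfile.mkdtemp()
raw = os.path.join(workdir, "rawData.cPickle")
with open(raw, "wb") as f:
    f.write(b"placeholder bytes, not real Parca output" * 100)

# The compression task runs first...
compress_in_place(raw)

original_exists = os.path.exists(raw)
compressed_exists = os.path.exists(raw + ".bz2")

# ...so a later task that opens the uncompressed file fails.
try:
    open(raw, "rb").close()
    open_failed = False
except FileNotFoundError:
    open_failed = True

print(original_exists, compressed_exists, open_failed)
```

Once the compression step has run, only the .bz2 file remains, which matches what was observed in out/kb.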