Assoc Aggregate: Now With Good Scattering!

aofarrel commented 2 years ago

This implements option 2 from here: https://github.com/DataBiosphere/analysis_pipeline_WDL/pull/57#issuecomment-951353842

Not only does this bring the WDL closer to the CWL, it also fixes the bug mentioned in the comment above. Most of the final RData outputs still do not pass against the truth files, but at least the number of output files is now correct.

One problem this does have is that it must localize all files from all segments in each chromosome. This might become problematic as the number of chromosomes increases. I'm not sure of a way around it...

Another problem: https://github.com/DataBiosphere/analysis_pipeline_WDL/issues/59

aofarrel commented 2 years ago

Group should be grouping per chromosome; that is its purpose. Try using the read_tsv() function instead of read_lines().
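To illustrate why the choice of reader matters, here is a minimal Python sketch, not the actual task code. It assumes a hypothetical file naming scheme (`<prefix>_chr<N>_seg<M>.RData`) and that the grouping step writes one tab-separated row per chromosome. The helper names are made up for this example. The two parse functions only imitate the shapes that WDL's read_lines() (`Array[String]`) and read_tsv() (`Array[Array[String]]`) return.

```python
import re
from collections import defaultdict

# Hypothetical naming scheme, assumed for illustration only:
# "<prefix>_chr<chromosome>_seg<segment>.RData"
CHR_PATTERN = re.compile(r"_chr([0-9]+|X|Y)_seg[0-9]+\.RData$")


def group_by_chromosome(paths):
    """Group segment files by chromosome, keeping first-seen chromosome order."""
    groups = defaultdict(list)
    for path in paths:
        match = CHR_PATTERN.search(path)
        if match is None:
            raise ValueError(f"cannot find chromosome in file name: {path}")
        groups[match.group(1)].append(path)
    return list(groups.values())


def to_tsv(groups):
    """Write one row per chromosome, with that chromosome's files tab-separated."""
    return "\n".join("\t".join(group) for group in groups) + "\n"


def parse_like_read_lines(text):
    """Imitates the shape of read_lines(): one string per row, tabs left inside."""
    return text.splitlines()


def parse_like_read_tsv(text):
    """Imitates the shape of read_tsv(): one list of fields per row."""
    return [line.split("\t") for line in text.splitlines()]


if __name__ == "__main__":
    files = ["a_chr1_seg1.RData", "a_chr2_seg3.RData", "a_chr1_seg2.RData"]
    tsv = to_tsv(group_by_chromosome(files))
    print(parse_like_read_lines(tsv))  # flat list, grouping hidden inside each string
    print(parse_like_read_tsv(tsv))    # nested list, one inner list per chromosome
```

With the nested shape, a downstream scatter can iterate over chromosomes and receive each chromosome's files as its own array, instead of a single string that still needs splitting.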

aofarrel commented 2 years ago

If read_tsv() fails, we could try avoiding the issue by merging this with the subsequent task: https://github.com/DataBiosphere/analysis_pipeline_WDL/tree/assoc-agg-debugging-merge-tasks

Assoc Aggregate: Now With Good Scattering! #58