iRNA-COSI / APAeval

Community effort to evaluate computational methods for the detection and quantification of poly(A) sites and estimating their differential usage across RNA-seq samples
MIT License

Bug: GETUTR file not found error in postprocessing step when using huge sample file #439

Closed: faricazjj closed this 2 years ago

faricazjj commented 2 years ago

Fixes #438

Background

When GETUTR is run with a very large sample file, the postprocessing step fails with a file not found error from DaPars. An example of the error message is below:

[Screenshot: file not found error message, 2022-10-13 1:33 PM]

However, this issue never occurs with our test files, which are small.

Problem

After investigating the workflow, the cause appears to be that getutr_process.nf publishes the GETUTR output file to a location in the output directory, and the postprocessing step then reads that published copy. When the output is large, publishing takes longer, but the workflow has already moved on to the next process, which is the postprocessing step. The file is therefore not found because it has not yet finished being written to the output directory.

Solution

One way to solve this is to avoid reading from the published file, since publishing can take some time to complete. Instead, the postprocessing step can read the file from the output channel of the previous process.

This solution works, and the workflow finishes without error:

[Screenshot: workflow run finishing without error, 2022-10-14 2:38 PM]
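
For readers unfamiliar with this kind of race, the failing and the fixed pattern can be mimicked outside Nextflow. The sketch below is a toy Python simulation, not code from this PR or from Nextflow itself. The file name `getutr_output.bed` and all function names are hypothetical, and a `threading.Event` stands in for the slow background copy of a large file. It only illustrates why reading the published copy can fail while reading the path handed over directly by the producing step does not.

```python
import shutil
import tempfile
import threading
from pathlib import Path
from typing import Tuple


def run_producer(work_dir: Path) -> Path:
    """Stand-in for the GETUTR step: writes its result into its own work directory."""
    result = work_dir / "getutr_output.bed"  # hypothetical file name
    result.write_text("chr1\t100\t200\n")
    return result


def publish_async(result: Path, out_dir: Path, release: threading.Event) -> threading.Thread:
    """Copy the result into the output directory in a background thread.

    The copy waits on `release`, which simulates a large file that is
    still being published when the next step starts.
    """
    def _copy() -> None:
        release.wait()
        shutil.copy(result, out_dir / result.name)

    worker = threading.Thread(target=_copy)
    worker.start()
    return worker


def postprocess(path: Path) -> int:
    """Stand-in for the postprocessing step: counts the lines of its input."""
    return len(path.read_text().splitlines())


def demo() -> Tuple[bool, int]:
    with tempfile.TemporaryDirectory() as tmp:
        work_dir = Path(tmp) / "work"
        out_dir = Path(tmp) / "results"
        work_dir.mkdir()
        out_dir.mkdir()

        result = run_producer(work_dir)
        release = threading.Event()
        worker = publish_async(result, out_dir, release)

        # Failing pattern: read the published copy before publishing is done.
        try:
            postprocess(out_dir / result.name)
            published_read_failed = False
        except FileNotFoundError:
            published_read_failed = True

        # Fixed pattern: read the path passed on directly by the producer.
        n_lines = postprocess(result)

        # Let the simulated publishing finish before cleaning up.
        release.set()
        worker.join()
        return published_read_failed, n_lines


if __name__ == "__main__":
    print(demo())
```

Under these assumptions, `demo()` returns `(True, 1)`: the read from the output directory fails because the copy has not happened yet, while the read from the producer's own path succeeds. With small files the copy usually finishes before the next step starts, which would explain why the test data never triggered the error.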

Checklist