dnanexus-archive / viral-ngs

metagenomics applet: upload as it runs? #28

Closed dpark01 closed 8 years ago

dpark01 commented 8 years ago

Currently, the metagenomics applet iterates through the input files, produces reports, and uploads all the results to S3/DNAnexus at the end. Is it possible to have it upload the outputs of each sample during the iteration loop, so that results can be seen closer to real time?

yifei-men commented 8 years ago

Hi Danny,

The typical abstraction/architecture of DNAnexus jobs makes this a little bit harder than expected 😦

The process of uploading an output goes through something like:

[Linux container where commands are run] >> dx upload >> [workspace container] >> dx cp >> [project folder]

We can trigger the dx upload as we go, but the dx cp is gated: it is only triggered when the whole job completes (and is handled by our background manager). So this doesn't really achieve what you're asking for in terms of seeing results closer to real time, because in the UI the files won't show up as links while they're still in the workspace container.

It'll look something like this:

[Screenshot: 2016-07-06, 3:38:05 PM]
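For reference, the "dx upload as we go" half of the flow above could be sketched roughly like this. This is only an illustration, not the applet's actual code: `upload_as_you_go`, `dx_upload_cmd`, `run_cli`, and the report file names are hypothetical, and it assumes the `dx` CLI is on the PATH inside the job. Each upload lands in the workspace container, so the files still would not appear in the project folder until the job completes.

```python
import subprocess
from typing import Callable, Dict, Iterable, List


def dx_upload_cmd(local_path: str) -> List[str]:
    """Build the CLI call; `--brief` makes `dx upload` print only the new file ID."""
    return ["dx", "upload", "--brief", local_path]


def run_cli(cmd: List[str]) -> str:
    """Run a command and return its stripped stdout (here, the file ID)."""
    return subprocess.check_output(cmd, text=True).strip()


def upload_as_you_go(
    reports: Iterable[str],
    run: Callable[[List[str]], str] = run_cli,
) -> Dict[str, str]:
    """Upload each per-sample report as soon as it exists.

    Hypothetical helper: `reports` stands in for the files produced inside
    the applet's per-sample loop. Returns a mapping of local path -> file ID,
    which the job would still declare as outputs at the very end.
    """
    file_ids: Dict[str, str] = {}
    for report in reports:
        # Uploaded now, but only into the workspace container; the copy
        # into the project folder is still gated on job completion.
        file_ids[report] = run(dx_upload_cmd(report))
    return file_ids
```

The `run` parameter is there only so the loop can be exercised without a DNAnexus session; in a real job the default `run_cli` would be used.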

Is the duration of the entire metagenomics step of the workflow long enough that it's getting annoying to wait for it to complete?

@mlin Is this accurate from your understanding / do you have any other suggestions?

dpark01 commented 8 years ago

Hi @yifei-men

Yeah, I wasn't sure if what I was asking for was even easily feasible. If it was, I figured why not, but given what you're describing, I don't think it's worth the effort. The total runtime is fine. This is just something I was thinking about as folks were waiting for a particular run of interest with bated breath, and I was wondering if it was an easy tweak.

I'm going to close this since it's not terribly important and, if it became important, there might be better ways to do it.