Closed effigies closed 7 years ago
Could you explain a bit more what role the reportnode
would play?
Please keep in mind that with the current architecture we need one datasink per reportlet, because we want as many reportlets in a partial report (when something fails). If we connect multiple reportles to one data sink and one of the interfaces creating reportlets fails none of them will be saved and all of them will be missin in the final report.
I might be missing something. Why is a datasink needed, as opposed to an interface that grabs the reportlets from their original locations in the working directory?
But to be (hopefully) a little clearer, the reportnode would collect and give named ports for each of the reportlets that a workflow generates. If the reportlet is undefined, we can skip over it, as we do now. The only thing that needs a sink here is the final report.
If this sounds like too much of a pain, NBD. I just find the reportlet derivatives to be very cluttering when I'm trying to understand a new workflow (or part of a workflow).
On Wed, Apr 19, 2017 at 11:13 AM, Chris Markiewicz <notifications@github.com
wrote:
I might be missing something. Why is a datasink needed, as opposed to an interface that grabs the reportlets from their original locations in the working directory?
We could do that, but this means we will heavily depend on the structure of the working directory not changing over time.
But to be (hopefully) a little clearer, the reportnode would collect and give named ports for each of the reportlets that a workflow generates. If the reportlet is undefined, we can skip over it, as we do now. The only thing that needs a sink here is the final report.
Ok - get it.
If this sounds like too much of a pain, NBD. I just find the reportlet derivatives to be very cluttering when I'm trying to understand a new workflow (or part of a workflow).
You mean the reportlet datasinks?
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/poldracklab/fmriprep/issues/456#issuecomment-295374016, or mute the thread https://github.com/notifications/unsubscribe-auth/AAOkp7OC0nktFL7mroRaqfJYP8cbplzCks5rxk7MgaJpZM4NB0Eo .
We could do that, but this means we will heavily depend on the structure of the working directory not changing over time.
Will it? Since the full reportlet path is what's passed to the report-constructor, it seems like this should be fairly robust.
You mean the reportlet datasinks?
Yeah.
If I understand this idea correctly it will suffer from the same problem as a single datasink.
Basically if you connect reportles from many nodes to a single report-constructor node and one of the reportlet building node fails the report-constructor will never be run (since not all of its dependencies run successfully) and the report will not be generated at all.
Yes, that's true. Hmm. Guess I didn't think this through after all.
I do agree that having reportlet datasinks inside workflows is a bit confusing and not very neat.
There might be a hacky way of doing this where, instead of a report-generator interface, we could have a workflow that's just a bunch of independent datasinks, so at least we could manage some separation. But that might be more work than it's worth...
So I've done a little fiddling in the anatomical workflows to see how much of a pain it would be to separate derivatives/reports from processing logic.
A few thoughts:
reportnode
is no more useful than an inputnode
on a report workflow, so scratch that idea entirely.outputnode
is cleaner, when available. It also gives an opportunity to keep names consistent; e.g. every time you see the binary mask in T1 space, it has the same name.outputnode
, even if we're not planning on hooking them up to other workflows?Finally, I've only done this on the anatomical workflow. If it seems worthwhile, I can do it for the rest (though not until after #458).
Rebased this to use the new visualizations:
Compare:
Indeed this does clean up things substantially!
I this issue still an issue? @effigies
By giving each workflow an
inputnode
,outputnode
andreportnode
, we can reduce the clutter within each workflow where sinking/derivatives are put on visual parity with processing nodes.Reportlets can then be passed to an interface/workflow that constructs the full report, isolated from the processing/reportlet construction.
Similarly, it would be nice (though possibly not feasible) to reduce output sinks to at most one per sub-workflow, and ideally fewer, grabbing outputs from
sub_wf.outputnode
.