Closed reelmatt closed 4 years ago
One of the ideas I had behind the intermediate classes like IONode
and ManipulationNode
was it could be useful to store information common to any node of that type (e.g. 'files' for IONodes, and ¯_(ツ)_/¯ for ManipulationNodes). ReadCsv
would then treat that 'file' as input and WriteCsv
would treat it as output, but both could be read/accessed similarly. And this could avoid having to specify 'input_file' or 'output_file' when configuring the Node; you could just have 'file' and then the concrete Node implementation would figure it out as long as some value is provided.
Likewise, this could probably help when creating nodes too. I think we could avoid checking the OPTION_TYPES
completely à la
for field, info in json_data.get("option_types", dict()).items():
if info["type"] == "file" or info["name"] == "Filename":
...
and instead just check DEFAULT_OPTIONS
if we knew any file/filepath is stored in a generic "file" attribute.
json_data["options"]["file"]
This latest commit—51c4ee5
— does change file handling a bit more than the initial PR which moved FileSystemStorage
solely to pyworkflow/workflow.py
. The API is now not used, an instead settings.MEDIA_ROOT
is passed in to the Workflow constructor and regular Python os
calls are used.
I think this better reflects the discussion from #43, but is obviously a bigger change with potentially big implications. If this deviates too far from what others are thinking, we can always revert to before entirely, or c664947
from earlier in this PR.
This PR at least partially addresses some of the bugs @reddigari summarized in #43 dealing with file naming/access.
~Workflow still relies on Django's
FileSystemStorage
API, but is now passed in to the constructor rather than being defined in multiple places:pyworkflow/node.py
and workflow/nodeviews.py
. Most file operations are still handled in theWorkflow
class, but thecreate_node
method can now do the following:~when a user uploads a file during node configuration, or when a download is triggered from
WriteCsv
. If a user enterstest.csv
for the output file, when theupdate
endpoint is triggered, this is converted to/tmp/test.csv
using the Workflow's file system.This helps solve the "giant security problem" of downloads, if we switch the endpoint to POST with the Node's info. It's still hard-coded for the
WriteCsvNode
because it looks up the Node'spath_or_buf
option where the output filename is stored, but this does prevent arbitrary downloads.~The
store_node_data
andretrieve_node_data
static methods in Workflow haven't been updated but I think (?) these can be done easily if this approach seems good.~