Closed reddigari closed 4 years ago
I also agree with Diego's idea of having the CLI be a client of the Django app with files handled by the FileSystemStorage
. But I also liked your idea Samir of moving the file system into the Workflow constructor since this can condense the number of places we have to deal with the fs
object: define it once in the Workflow and then use the Workflow for all file operations/lookup.
I'm still reading through the PR and I don't have a fully-formed opinion yet, but I actually meant moving FileSystemStorage
entirely OUT of pyworkflow rather than entirely in. It's doesn't appear to be designed to work outside of a configured Django app (obviously not a problem because we are using pyworkflow inside a Django app), but it means pyworkflow will never stand alone. Again, that's not necessarily an issue for us, but it means we:
These aren't deal breakers, and if it's the quickest way to meet our requirements, I'm totally down with it. But if we're looking to keep the Django concerns in the Django app, is there any downside to just passing a prefix path into the Workflow object rather than passing or creating a FileSystemStorage
instance?
That does make more sense and I think is compatible with the latest PR (after some more changes).
In #47, a Workflow is created roughly like Workflow(file_system=FileSystemStorage())
. Then, Workflow methods use this by self._file_system.open()
.
As you mention, passing a prefix path is an option (that I don't see any downsides with). For the Django use case we could do this simply by Workflow(file_system=fs.location)
or even Workflow(file_system=settings.MEDIA_ROOT)
to avoid FileSystemStorage
altogether. For a CLI/other use case we could do something like Workflow(file_system='/root_dir_here')
. Then, Workflow methods would make standard Python I/O calls something like
file_path = os.path.join(self._file_system, file_to_open)
with open(file_path, 'r') as f:
...
with some polishing of course. I like that a lot better actually.
Just catching up here - sounds like y'all have a good direction. If we ended up needing more than this file_system
, an idea could be to factor out into a Context
class instance that gets passed around. Probably too heavy for just this case, but just thinking out loud.
Realized #42 was not the right place for this discussion, but this is a follow up on the comment I left there:
When we do deal with the CLI --- we could inject some filesystem abstraction into the
pyworkflow.Workflow
constructor. From the web app it would be Django'sFileSystemStorage
object, and for the CLI, it could be some custom wrapper of the user's filesystem (mounted in the Docker container? idk.) Then all files can be referred to by their with basenames, andpyworkflow
can look them up in whatever filesystem it was given.However, I'd say this complication is a point in favor of Diego's idea of having the CLI be a client of the Django app rather than of
pyworkflow
directly --- then everything can be done through the Django file storage API (although how the user will "upload" their data files when running in CLI mode is going to be tricky).