artefactual-sdps / enduro

A tool to support ingest and automation in digital preservation workflows
https://enduro.readthedocs.io/
Apache License 2.0

Problem: processing workflow has separate code paths for filesystem and minio watchers #867

Open djjuhasz opened 4 months ago

djjuhasz commented 4 months ago

Describe the problem

The processing workflow and bundle activity use conditional branches to handle the differences between the filesystem watcher and the minio watcher. This branching logic is hard to follow, duplicates some code, and assumes that only a filesystem watcher can provide a "blob" that is a directory.

Possible solutions

Each watcher could take responsibility for copying its blob data to a local "processing" directory. The difference would then live in the individual watcher implementations instead of in branching logic in the workflow (see "Avoiding conditionals by obeying the Liskov Substitution Principle" - https://sandimetz.com/99bottles).

Additional context

djjuhasz commented 4 months ago

This is complicated by the fact that the current watcher.Service.Download() method takes an io.Writer parameter as the destination for the byte stream. An io.Writer works for a single file, which is what we expect from a minioWatcher, but it doesn't work for a filesystem directory, which we might receive from a filesystemWatcher.