Open mikkonie opened 6 months ago
In addition to the file_size check it might be worthwhile to have different behaviour based on file type: i.e. only apply the check to specific file types (like fastq.gz / bam); or only raise WARN but not DENY for certain file types.
However, in practice the file size threshold will likely already exclude most cases where duplicate files are uploaded intentionally rather than in error (like log files), so this might not be a necessary addition.
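For illustration, a rough sketch of how a per-file-type limit could sit on top of the size check. Everything here is hypothetical and not part of the spec: the `SUFFIX_MAX_ACTION` mapping, the `cap_action` helper and the `OK`/`WARN`/`DENY` action strings are assumptions. File types not listed are assumed to be exempt from the check.

```python
# Hypothetical mapping: file name suffix -> strictest action allowed for it.
SUFFIX_MAX_ACTION = {
    '.fastq.gz': 'DENY',
    '.bam': 'DENY',
    '.log': 'WARN',  # Duplicate logs only ever raise a warning
}

# Actions ordered from least to most strict.
ACTION_ORDER = ['OK', 'WARN', 'DENY']


def cap_action(file_name, action, default='OK'):
    """Limit the action proposed by the size check according to file type.

    Assumption: unlisted file types fall back to `default`, i.e. the
    duplicate check is effectively not applied to them.
    """
    for suffix, max_action in SUFFIX_MAX_ACTION.items():
        if file_name.endswith(suffix):
            # Pick whichever of the two actions is less strict
            return min(action, max_action, key=ACTION_ORDER.index)
    return min(action, default, key=ACTION_ORDER.index)
```

With this mapping, a duplicate `run.log` that would otherwise be denied is downgraded to a warning, while a duplicate `sample.bam` is still denied.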
Requested by @holtgrewe. A user may accidentally upload to a landing zone a file that already exists in the project's sample repository, but under a different file name and/or in a different collection. Currently this is not checked, so duplicates can be uploaded. This should be preventable, at least for large files.
This task requires #1846 to be done first.
Spec (To Be Refined)

- […] `LANDINGZONES_DUPLICATE_SIZE_DENY` […]
- […] `LANDINGZONES_DUPLICATE_SIZE_WARN` […]
- […] `duplicate_size_deny` […] `PROJECT` […] `BOOLEAN` […] `True` […] `settings.LANDINGZONES_DUPLICATE_SIZE_DENY` […]
- […] `duplicate_size_warn` […] `0` […] `0`, `WARN` cannot be larger than `DENY`
- […] `allow_duplicates` boolean field to the `LandingZone` model
- […] `allow_duplicates` parameter for `landing_zone_move` taskflow
- […] `landing_zone_move` taskflow
- […] `allow_duplicates==False`, skip all of the following)
- […] `VALIDATING` phase, check for dupes

Tasks

- […] `LandingZone.allow_duplicates` field and taskflow parameter
- […] `landing_zone_move` flow
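
A minimal sketch of how the size thresholds in the spec above might be evaluated, to make the intended behaviour concrete. This is not the SODAR implementation. The names `DupeResult`, `validate_thresholds` and `check_duplicate` are hypothetical, and several points are assumptions because the spec does not state them: duplicates are detected by comparing checksums against the sample repository, thresholds are in bytes, a threshold of `0` disables that level, and the check is skipped when duplicates are explicitly allowed for the zone.

```python
from enum import Enum


class DupeResult(Enum):
    OK = 'OK'
    WARN = 'WARN'
    DENY = 'DENY'


def validate_thresholds(size_warn, size_deny):
    """Enforce that WARN cannot be larger than DENY.

    Assumption: 0 means "disabled", so the rule only applies when
    both thresholds are set.
    """
    if size_warn and size_deny and size_warn > size_deny:
        raise ValueError('WARN threshold cannot be larger than DENY')


def check_duplicate(
    size, checksum, repo_checksums, size_warn, size_deny,
    allow_duplicates=False,
):
    """Classify one landing zone file during validation.

    :param size: File size in bytes (assumed unit)
    :param checksum: Checksum of the landing zone file
    :param repo_checksums: Set of checksums already in the sample repository
    :param size_warn: WARN threshold, 0 to disable (assumption)
    :param size_deny: DENY threshold, 0 to disable (assumption)
    :param allow_duplicates: Skip the check entirely if True (assumption)
    """
    if allow_duplicates or checksum not in repo_checksums:
        return DupeResult.OK
    if size_deny and size >= size_deny:
        return DupeResult.DENY
    if size_warn and size >= size_warn:
        return DupeResult.WARN
    return DupeResult.OK  # Duplicate, but below both thresholds
```

Checking `DENY` before `WARN` means a file above both thresholds is always denied, which is why `WARN` being larger than `DENY` would make the warning level unreachable.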