Is your feature request related to a problem? Please describe.
When an algorithm defines more than 1 file input type, there is no convenient means to determine which file automatically downloaded to the input directory corresponds to which file input. This is because a user can supply any URL as a file input, and thus the name of the downloaded file is not known in advance. Thus, when more than 1 file is placed in the input directory, there's no way to know which file is for which file input parameter.
Describe the solution you'd like
For each file input, automatically provide the full path of the downloaded file as an additional positional argument to the algorithm's run command.
Further, to make this a non-breaking change, add such arguments to the end of the values supplied for positional arguments, and in the same order that the file inputs are defined.
Effectively, handle file inputs as if they were defined as positional inputs after the explicitly defined positional inputs.
For example, assume an algorithm defines 2 file inputs and 3 positional inputs, in the following order (although the file and positional sections could be reversed without any impact on the result), excluding details:
Currently, DPS will call the run command defined in the algo config file with 3 positional arguments, one for each of the positional inputs defined above, like so:
RUN_SCRIPT 'pos1 value' 'pos2 value' 'pos3 value'
This leaves the run script to figure out which of the 2 downloaded files corresponds to which of the 2 file inputs. The run script has no way of knowing in advance what URLs the user supplied as inputs, and thus no way of knowing the corresponding filenames in the input directory.
I propose that DPS simply tack on the absolute paths of the downloaded files as additional arguments to the run script, like so:
RUN_SCRIPT 'pos1 value' 'pos2 value' 'pos3 value' 'abs path1' 'abs path2'
where 'abs path1' and 'abs path2' are the absolute paths of the files downloaded for the url1 and url2 file inputs, respectively.
(In conjunction, allow file inputs to be optional, just like positional inputs can be. I believe that currently you must supply a URL for every file input, but a file input should be allowed to be empty, just like a positional one can be.)
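The proposed ordering can be sketched in Python as follows. This is only an illustration of the idea, not DPS code: `build_run_args` is a hypothetical helper, the example paths are made up, and passing an empty string for an omitted optional file input is an assumption (the proposal does not say how an empty file input should appear on the command line).

```python
import os
from typing import List, Optional


def build_run_args(
    positional_values: List[str],
    file_paths: List[Optional[str]],
) -> List[str]:
    """Return the run-command arguments: positional values first,
    then one argument per file input, in definition order."""
    args = list(positional_values)
    for path in file_paths:
        # Assumption: an omitted optional file input becomes an empty
        # string, so later file inputs keep their positions.
        args.append(os.path.abspath(path) if path else "")
    return args


# Hypothetical values: 3 positional inputs, then the 2 downloaded files.
args = build_run_args(
    ["pos1 value", "pos2 value", "pos3 value"],
    ["/input/a.tif", "/input/b.tif"],
)
```

The first three entries of `args` are unchanged from what the run script receives today; the file paths only ever appear after them.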
This way, the run script can look for the absolute paths of file inputs at the end of the other positional inputs.
By placing them after the "true" positional inputs, existing scripts will not break, because they will still see the positional inputs in the same positions.
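From the run script's side, reading the appended paths could look roughly like the sketch below. It assumes the script knows how many positional inputs its algo config defines (3 in the example above); `split_args` and the sample values are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_args(
    argv: Sequence[str], num_positional: int
) -> Tuple[List[str], List[str]]:
    """Split run-script arguments into the explicitly defined positional
    values and the file paths appended after them."""
    positional = list(argv[:num_positional])
    file_paths = list(argv[num_positional:])
    return positional, file_paths


# Hypothetical invocation with 3 positional inputs and 2 file inputs.
positional, file_paths = split_args(
    ["pos1 value", "pos2 value", "pos3 value", "/input/a.tif", "/input/b.tif"],
    3,
)
```

A script written before the change reads only the first three arguments and is unaffected; when no paths are appended, `file_paths` is simply empty.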
Describe alternatives you've considered
The alternative is to define duplicate inputs for each file input: one is the original file input, and the duplicate is a positional input that provides the name of the downloaded file. This not only requires redundant input entries in the algo config, but also relies on users supplying both a URL for the file input and a matching filename for the corresponding positional input, which is annoying, confusing, and error-prone.
For example, the inputs shown above have to be modified like the following in order to get the desired behavior:
Here, filename1 and filename2 are the filenames in the input directory corresponding to the files downloaded from url1 and url2, respectively.
Unfortunately, this means a user must now supply 7 inputs instead of only 5, and the last 2 inputs (of the 7) must match the filenames at the end of the 2 file inputs (urls), which is the annoying and error-prone bit.
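To make the error-prone part concrete, the sketch below shows the consistency check that this workaround silently depends on. It assumes, as described above, that the downloaded file is named after the last path segment of its URL; the helper names and example URLs are hypothetical.

```python
import posixpath
from typing import List, Sequence, Tuple
from urllib.parse import urlparse


def filename_from_url(url: str) -> str:
    """Return the last path segment of a URL, ignoring any query string."""
    return posixpath.basename(urlparse(url).path)


def mismatched_inputs(
    urls: Sequence[str], filenames: Sequence[str]
) -> List[Tuple[str, str]]:
    """Return (url, filename) pairs where the user-supplied filename
    does not match the name implied by the URL."""
    return [
        (url, name)
        for url, name in zip(urls, filenames)
        if filename_from_url(url) != name
    ]


# Hypothetical user input: the second filename was typed incorrectly.
problems = mismatched_inputs(
    ["https://example.com/data/a.tif", "https://example.com/data/b.tif"],
    ["a.tif", "c.tif"],
)
```

With the proposed change, no such check is needed, because the user never has to type the filenames at all.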
Additional context
None