Open bernt-matthias opened 4 years ago
True. This would be much better but this also needs to be reflected in the CTD for other wrappers. We do not yet support its "output prefix capabilities" but it is probably not too much work (it is basically just another tag).
There already seem to be other places where directories are represented as simple strings. A quick grep in the CTD files gave me this list:
MzMLSplitter -out
PepNovoAdapter -dir
IDRipper -out
MascotAdapter -mascot_directory temp_data_directory
InspectAdapter -inspect_directory temp_data_directory
PrecursorIonSelector -tmp_dir
IDFileConverter -in
OpenSwathFileSplitter -outputDirectory
Maybe adding a simple string option would be a good first step.
For the case of IDRipper there are -out_path and -out:
-out <file> The path to this file is used as the output directory. (valid formats: 'idXML')
-out_path <file> Directory for the output files after ripping according to 'file_origin'. If 'out_path' is set, 'out' is ignored.
When specifying -out_path DIR, the files are located in the current working dir. Only with -out_path DIR/FILE (where FILE does not even need to exist) are the generated files located in DIR.
DTAExtractor also seems to be an example.
So how would this PREFIX help Galaxy? Would you assume all files with this prefix in the target directory were generated by the tool?
Exactly. The prefix would be an existing directory in Galaxy. Galaxy has means to take all files from a directory (optionally matching a regexp).
Galaxy does not know how many outputs there are or how they are named, so the easiest approach seems to be a prefix. An alternative would be to implement some additional logic in the Galaxy tools, but this logic would need to be specific to each tool, which seems difficult to automate.
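To make the collection step concrete, here is a minimal Python sketch of gathering every file in a job directory whose name matches a regular expression. It only illustrates the idea and is not Galaxy's actual discovery code; the function name `collect_outputs` and the example pattern are assumptions.

```python
import re
from pathlib import Path


def collect_outputs(directory, pattern=r".*"):
    """Return sorted names of regular files in `directory` whose name fully matches `pattern`."""
    regex = re.compile(pattern)
    return sorted(
        entry.name
        for entry in Path(directory).iterdir()
        # Skip subdirectories; only plain files count as tool outputs here.
        if entry.is_file() and regex.fullmatch(entry.name)
    )
```

Called as `collect_outputs(job_dir, r".*\.idXML")`, this would pick up however many idXML files the tool happened to write, without knowing their names in advance.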
For cases like IDRipper -out_path or MzMLSplitter -out, the new parameter type simplifies the conversion.
Ok, but then the safest thing to do is to create a new temp directory, have the tool create all its output files in there, and just grab all the files. The prefix solution is not very safe, since there might be other files in the same directory from previous runs or who knows ...
This is exactly what Galaxy does, since each job (i.e. every single call to an OpenMS binary) has its own working dir.
I guess the parameter could implement a check if there is a file matching PREFIX.*
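Such a check could look roughly like the following Python sketch, reading PREFIX.* as "PREFIX followed by anything". This is an illustration of the logic only, not OpenMS code; both function names are hypothetical.

```python
import glob


def existing_matches(prefix):
    """Return paths that already start with `prefix` (literal prefix, then any suffix)."""
    # glob.escape keeps characters such as '[' in the prefix from acting as wildcards.
    return sorted(glob.glob(glob.escape(prefix) + "*"))


def check_prefix_unused(prefix):
    """Raise if files from a previous run would be mixed up with new outputs."""
    leftovers = existing_matches(prefix)
    if leftovers:
        raise FileExistsError(f"files matching {prefix}* already exist: {leftovers}")
```

Running the check before the tool writes anything means every file that matches the prefix afterwards can be attributed to the current run.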
I would hope the prefix can include subfolders. If not, please add this feature. E.g. if the prefix is tmp/foo it gathers all files matching a certain mimetype (if specified): i.e. $pwd/tmp/foo*.ext
The prefix can be anything. But so far the folders need to exist already (similar to normal output files which could also refer to filenames in subfolders).
The code currently checks if the prefix is writable:
Is there already some code in OpenMS to list files in a glob-like manner (e.g. PREFIX*.ext)? Or are there any suggestions on how to implement this?
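For reference, the intended matching behaviour can be sketched in a few lines of Python. This is not OpenMS code and says nothing about which C++ facility would be used there; the function name `list_outputs` and its `ext` parameter are assumptions made for the example.

```python
import glob
import os


def list_outputs(prefix, ext=""):
    """List files matching PREFIX*EXT; the prefix may contain subfolders, e.g. 'tmp/foo'."""
    # Escape both parts so only the '*' in the middle is treated as a wildcard.
    pattern = glob.escape(prefix) + "*" + glob.escape(ext)
    return sorted(path for path in glob.glob(pattern) if os.path.isfile(path))
```

With the prefix `tmp/foo` and the extension `.ext`, this corresponds to the `$pwd/tmp/foo*.ext` pattern mentioned earlier in the thread, evaluated relative to the current working directory.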
For consensusXML input the number of outputs depends on the contents of the input.
In order to automate this step, it would be great to just specify a prefix for the outputs.
For instance, one could have an optional -out FILE argument that needs to be specified if the input is not consensusXML, and an -out_prefix PREFIX argument otherwise.
Background: For the CTD -> Galaxy tool, the automatic conversion of the consensusXML input case seems impossible. I'm wondering if there is a workaround.
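The proposed rule can be illustrated with a small Python `argparse` sketch. It is only a model of the argument logic, not the tool's real (C++) parameter handling: the program name is a placeholder, `-in` is assumed as the input flag, and detecting consensusXML by file extension is a simplifying assumption.

```python
import argparse


def parse_args(argv):
    """Parse the -in/-out/-out_prefix combination proposed above (hypothetical model)."""
    parser = argparse.ArgumentParser(prog="tool")  # placeholder program name
    parser.add_argument("-in", dest="in_file", required=True)
    parser.add_argument("-out", dest="out")
    parser.add_argument("-out_prefix", dest="out_prefix")
    args = parser.parse_args(argv)

    # Assumption: the input type is recognised by its file extension.
    is_consensus = args.in_file.lower().endswith(".consensusxml")
    if is_consensus and not args.out_prefix:
        parser.error("-out_prefix is required for consensusXML input")
    if not is_consensus and not args.out:
        parser.error("-out is required unless the input is consensusXML")
    return args
```

For example, `parse_args(["-in", "a.consensusXML", "-out_prefix", "res/"])` is accepted, while passing only `-out` together with a consensusXML input exits with an error.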