WorkflowConversion / CTDConverter

Series of python scripts to convert CTD files into other formats such as Galaxy, CWL
MIT License

produces incomplete output tag #20

Closed mwalzer closed 7 years ago

mwalzer commented 7 years ago

The converter produces

<outputs>
    <data name="param_out" format="data"/>
</outputs>

I think there are two issues with that:

What I presume would be better:

<outputs>
    <data name="param_out" type="data" format="idXML"/>
</outputs>

from what is available in the ctd:

<ITEM name="out" value="" type="output-file" description="Output file" required="true" advanced="false" supported_formats="*.idXML" />

mwalzer commented 7 years ago

Actually, since Galaxy already has a lot of MS formats in its file-types listing, but all in lowercase, a .toLower() might be good:

<outputs>
    <data name="param_out" type="data" format="idxml"/>
</outputs>

bgruening commented 7 years ago

@mwalzer the output does not have a type in Galaxy. The format should really be fixed. We once discussed a file that maps the different filetype annotations, or using EDAM for it.

chahuistle commented 7 years ago

It seems to me that the function create_output_node (line 1269 in generator.py) would be the one to modify. This is where generator.py tries to obtain the right value for the format attribute. I am just worried about the type attribute in data elements, as @bgruening commented.
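To make the intended behavior concrete, here is a minimal sketch of deriving the format attribute from the CTD's supported_formats. It is not the actual create_output_node code: the helper names (first_supported_format, sketch_output_node) are hypothetical, and it assumes supported_formats is a comma-separated list of patterns like "*.idXML". It writes no type attribute, per @bgruening's note.

```python
import xml.etree.ElementTree as ET


def first_supported_format(supported_formats):
    """Return the first extension found in a supported_formats string, or None.

    Assumption: entries are comma-separated glob patterns such as "*.idXML".
    """
    for pattern in supported_formats.split(","):
        ext = pattern.strip().lstrip("*").lstrip(".")
        if ext:
            return ext
    return None


def sketch_output_node(item):
    """Build a Galaxy <data> element for a CTD output ITEM (hypothetical helper)."""
    ext = first_supported_format(item.get("supported_formats", ""))
    node = ET.Element("data")
    node.set("name", "param_" + item.get("name"))
    # Fall back to the generic "data" format only when the CTD gives no hint.
    node.set("format", ext if ext else "data")
    return node


item = ET.fromstring(
    '<ITEM name="out" value="" type="output-file" description="Output file" '
    'required="true" advanced="false" supported_formats="*.idXML" />'
)
node = sketch_output_node(item)
print(ET.tostring(node, encoding="unicode"))
```

The format is kept exactly as written in the CTD here; any case normalization would be a separate step.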

mwalzer commented 7 years ago

Yes, @bgruening is right: according to the Galaxy docs there is no type attribute. It should then look something like:

<outputs>
    <data name="param_out" format="idxml"/>
</outputs>

chahuistle commented 7 years ago

Right now, formats are kept as provided. A temporary workaround would be to modify your CTDs so their format is "idxml" and not "idXML".

I guess an extra parameter can be added "make formats all upper-/lowercase", and depending on that, formats will be left alone (as it is now), uppercased or lowercased.
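A rough sketch of what such a switch could look like, assuming a three-valued option. The function name and the mode values ("keep", "lower", "upper") are made up for illustration and are not existing converter options.

```python
def normalize_format_case(fmt, mode="keep"):
    """Apply the requested case policy to a format name (hypothetical helper).

    mode: "keep" leaves the format as provided (the current behavior),
          "lower" lowercases it, "upper" uppercases it.
    """
    if mode == "keep":
        return fmt
    if mode == "lower":
        return fmt.lower()
    if mode == "upper":
        return fmt.upper()
    raise ValueError("unknown case mode: %r" % (mode,))


print(normalize_format_case("idXML"))
print(normalize_format_case("idXML", mode="lower"))
print(normalize_format_case("idXML", mode="upper"))
```

Defaulting to "keep" would leave existing conversions unchanged unless the option is set explicitly.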

mwalzer commented 7 years ago

No, I think right now, by default, 'data' is written as the format. Only if you supply a formats_file that contains this specific format does the type get written (as mapped in the formats_file). Example CTD file attached.
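In other words, the behavior described here amounts to a lookup with a generic default. A small sketch, assuming the formats_file has already been loaded into a plain dict (its real on-disk layout is not shown in this thread, and resolve_galaxy_format is a hypothetical name):

```python
def resolve_galaxy_format(ctd_format, format_map, default="data"):
    """Return the mapped Galaxy format, or the generic default if unmapped.

    format_map stands in for the contents of a formats_file (assumption).
    """
    return format_map.get(ctd_format, default)


# Without a formats_file, nothing is mapped and "data" is written.
print(resolve_galaxy_format("idXML", {}))

# With a mapping for this specific format, the mapped value is written.
print(resolve_galaxy_format("idXML", {"idXML": "idxml"}))
```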

chahuistle commented 7 years ago

Can you provide an input file and the command line to replicate this on my end?

chahuistle commented 7 years ago

Well, anyway, a missing output format when the input contains supported_formats is a bug, and it should be fixed.