Open hechth opened 2 years ago
@hechth you are talking about that tool? https://github.com/bgruening/galaxytools/blob/master/chemicaltoolbox/openbabel/ob_convert.xml
Which input format are you using?
@bgruening Indeed!
I'm using a plain list, i.e. what is called the inchi format.
Some example data is attached. inchi.zip
Can you try adding an additional column (https://usegalaxy.eu/root?tool_id=toolshed.g2.bx.psu.edu/repos/devteam/add_value/addValue/1.0.0) to the inchi file? Is that preserved by openbabel?
I tried using 2 columns separated with a comma, but that didn't change anything in the specific history (https://umsa.cerit-sc.cz/u/hechth/h/compound-convert-test).
Try adding a new column separated by a tab, using the tool from above.
Nope - tried adding a column manually, using tabs, commas, the Galaxy tool, but always the same - no index in the output and invalid data gets dropped silently.
Maybe @simonbray has an idea? This tool simply uses openbabel, so if openbabel cannot deal with this, I think we are out of luck here.
Can you use a different file format? I think inchi is in general not a good choice for the input.
With smiles or sdf you can specify the index in the molecule name/title.
I explicitly want the inchi, since I want to compute smiles from inchi.
I also don't get why indexing is possible with SMILES and not with inchi? They're both just texts ...
What I meant is that SMILES has a name/title/label to which you can append an index.
I think as @bgruening said we are limited by the underlying software. Maybe you can use a Galaxy workaround like this? https://usegalaxy.eu/u/sbray/h/inchi-index
In this scenario the join works because the inchi doesn't change. But if we actually change specific parts of it, the input and output are no longer identical, so the workaround no longer works.
If I come up with a solution, should I just PR it here? Otherwise, I think I could solve our specific needs with a targeted tool.
Thank you very much for your support and for looking into this!
Yes, PRs are always welcome, thanks!
The compound conversion tool, which is part of the chemical toolbox, doesn't handle indices etc. for the files it processes and silently drops lines that are invalid. This makes working with larger files problematic, as the output can no longer be associated with the inputs. Is there a way to add indices to the files to indicate which output belongs to which input, or is the only option to run collections and have one identifier per job?
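For illustration, the kind of targeted tool discussed in this thread could look roughly like the sketch below. It is not how the existing Galaxy tool works. The function name `convert_with_index`, the `FAILED` marker and the `convert` callback are all hypothetical; the callback stands in for whatever performs the actual InChI to SMILES conversion. The point is only that each input line keeps its line number and that invalid entries are reported instead of being dropped.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple

FAILED = "FAILED"  # hypothetical marker for entries that could not be converted


def convert_with_index(
    lines: Iterable[str],
    convert: Callable[[str], Optional[str]],
) -> Iterator[Tuple[int, str, str]]:
    """Yield (line number, input, output) for every non-blank input line.

    `convert` is an assumed callback that returns the converted string,
    or returns None / raises an exception for invalid input.
    """
    for index, raw in enumerate(lines, start=1):
        record = raw.strip()
        if not record:
            continue  # blank lines are skipped, but line numbers stay aligned
        try:
            result = convert(record)
        except Exception:
            result = None
        # Invalid entries are kept in the output with an explicit marker.
        yield index, record, result if result else FAILED


# Assumption, not tested here: with Open Babel's Python bindings installed,
# the callback could be something like
#   from openbabel import pybel
#   convert = lambda s: pybel.readstring("inchi", s).write("smi").strip()
```

Writing the yielded tuples as tab-separated rows would give an output that can be joined back to the input by line number, even when the conversion changes the structure string itself.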