Closed maximskorik closed 3 years ago
@maximskorik This should actually only load the Name from the compound table if it is there. Getting the name from the formula is not possible because of isomers, so I'd rely on reading it from the database.
@martenson What is your opinion on optionally reading a column if it is there and ignoring it if it isn't? I feel it adds extra programming effort to keep track of which columns are optional and which aren't, while the benefit is not immediately visible to me, apart from having a smaller, more minimal data frame inside the program.
@hechth Seems like a harmless approach to me, unless it makes the file significantly larger. I think the decision is not technical but rather based on the scientific/user benefits.
Having the name in there right from the start is beneficial for debugging, and having it in the output is crucial, since our IDs aren't canonical (they come from HMDB, KEGG, PubChem, etc.) and relying on them is dangerous. Relying on the name is likely not ideal either, but for now it is better than the ID IMHO.
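For illustration, here is a minimal sketch of the "read the column if it is there, ignore it otherwise" approach discussed above. It uses only the Python standard library; the function name `load_compound_table` and the column names `ID` and `Formula` are assumptions made for the example, not the tool's actual schema (only `Name` is mentioned in this thread).

```python
import csv
import io

# Hypothetical column names: only "Name" is mentioned in this thread.
REQUIRED_COLUMNS = ("ID", "Formula")
OPTIONAL_COLUMNS = ("Name",)


def load_compound_table(handle, required=REQUIRED_COLUMNS, optional=OPTIONAL_COLUMNS):
    """Read a CSV compound table, keeping optional columns only if present."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []

    # Required columns must exist; fail early with a clear message.
    missing = [col for col in required if col not in header]
    if missing:
        raise ValueError(f"compound table is missing required columns: {missing}")

    # Optional columns are kept when present and silently skipped otherwise.
    keep = list(required) + [col for col in optional if col in header]
    return [{col: row[col] for col in keep} for row in reader]


# Usage: the same loader handles tables with and without a Name column.
with_name = io.StringIO("ID,Formula,Name\nC1,C2H6O,ethanol\n")
without_name = io.StringIO("ID,Formula\nC1,C2H6O\n")

rows_with = load_compound_table(with_name)
rows_without = load_compound_table(without_name)
```

Keeping the optional columns in one tuple limits the extra bookkeeping to a single place, which may address part of the concern about programming effort.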
> @maximskorik This should actually only load the Name from the compound table if it is there. Getting the name from the formula is not possible because of isomers, so I'd rely on reading it from the database.
That's a good point. I forgot to consider isomers.
The October version of the advanced annotation outputs the annotation table without compound names. This should be fixed to make the tool more user-friendly. The names are not used anywhere in the annotation pipeline, so they can be added to the output at the end of the annotation.
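Since the names are not needed during annotation, attaching them at the end could look roughly like the sketch below. It assumes each annotation row carries the key of the compound table row it matched; `attach_names`, `compound_id`, `feature`, and the sample rows are hypothetical. The lookup goes through the table row rather than the formula, because isomers such as ethanol and dimethyl ether share the formula C2H6O.

```python
def attach_names(annotations, compounds, key="compound_id"):
    """Add a Name field to each annotation row via the compound table key."""
    # Build the lookup once; rows without a Name are simply left out.
    names = {c["ID"]: c["Name"] for c in compounds if c.get("Name")}
    # Rows whose compound has no name get an empty string instead of failing.
    return [{**row, "Name": names.get(row[key], "")} for row in annotations]


# Hypothetical data: C1 and C2 are isomers with the same formula.
compounds = [
    {"ID": "C1", "Formula": "C2H6O", "Name": "ethanol"},
    {"ID": "C2", "Formula": "C2H6O", "Name": "dimethyl ether"},
    {"ID": "C3", "Formula": "H2O"},  # no Name stored for this compound
]
annotations = [
    {"feature": "f1", "compound_id": "C2"},
    {"feature": "f2", "compound_id": "C3"},
]

annotated = attach_names(annotations, compounds)
```

The `ID` here is only the row key inside one compound table, not a canonical identifier, so the sketch does not rely on IDs being comparable across databases.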
- [ ] Add functionality to obtain IUPAC compound names from empirical chemical formula