Closed by hmacdope 8 months ago
This is fixed in #246 by a function that truncates over-long file names: aspadiscovery.data.utils.check_name_length_and_truncate
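For anyone reading along, the idea is roughly the following. This is only a minimal sketch: `truncate_name` and its `max_length` parameter are hypothetical names made up for illustration, and the actual function added in #246 may have a different signature and behaviour.

```python
def truncate_name(name, max_length=80):
    """Return `name` unchanged if it fits, otherwise its first `max_length` characters.

    Hypothetical helper for illustration only; not the function from #246.
    """
    if max_length < 1:
        raise ValueError("max_length must be at least 1")
    if len(name) <= max_length:
        return name
    return name[:max_length]


short_name = truncate_name("ligand_1")   # already short, returned unchanged
long_name = truncate_name("C" * 120)     # cut down to the first 80 characters
```

Plain prefix truncation like this is lossy: two different names that share their first 80 characters end up identical.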
This highlights the need for us to start using some form of UUID for compounds. At least for the FECs workflow we're still using InChI strings as compound names. These can exceed 80 characters and are then truncated, which risks producing duplicate names for different compounds.
We should probably start using the low-level compound identifier that the PostEra API uses (it's a random string, so it should be fine to use).
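To make the duplicate-name risk concrete, here is a small sketch that groups names by their truncated form and reports any clashes. `find_truncation_collisions` is a hypothetical helper, and the two long strings are synthetic placeholders that only share a long prefix; they are not valid InChI.

```python
from collections import defaultdict


def find_truncation_collisions(names, max_length=80):
    """Map each truncated name to the distinct full names that collapse onto it.

    Hypothetical helper for illustration; only truncated names shared by
    two or more different full names are returned.
    """
    groups = defaultdict(set)
    for name in names:
        groups[name[:max_length]].add(name)
    return {short: sorted(full) for short, full in groups.items() if len(full) > 1}


# Synthetic placeholder strings (not real InChI) that differ only after char 80.
shared_prefix = "InChI=1S/" + "C" * 80
names = [shared_prefix + "/a", shared_prefix + "/b", "InChI=1S/CH4/h1H4"]

collisions = find_truncation_collisions(names)
```

Here the two long names map to the same 80-character string, while the short one is unaffected. An opaque per-compound identifier avoids this, provided it is itself under the length limit.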
Reopening as a discussion point.
We do this now
Beware of very long (> 80 character) file names and titles in SDFs. These can cause problems, including segfaults, when parsed by components such as RDKit and OpenEye.
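A quick way to spot offending titles before handing a file to a toolkit is to scan the raw text. The sketch below is an assumption-laden illustration: `find_long_sdf_titles` is a hypothetical helper, it relies only on the SDF layout (the title is the first line of each record and records end with a `$$$$` line), and it uses no RDKit or OpenEye calls.

```python
def find_long_sdf_titles(sdf_text, max_length=80):
    """Return the title lines in `sdf_text` that are longer than `max_length`.

    Hypothetical helper for illustration. The title is taken to be the first
    line of the file and the first line after each "$$$$" record delimiter.
    """
    long_titles = []
    expect_title = True
    for line in sdf_text.splitlines():
        if expect_title:
            if len(line) > max_length:
                long_titles.append(line)
            expect_title = False
        elif line.strip() == "$$$$":
            expect_title = True
    return long_titles


# Two skeletal records (empty atom/bond blocks), the second with a 90-char title.
counts_line = "  0  0  0  0  0  0  0  0  0  0999 V2000"
record_tail = "\n\n\n" + counts_line + "\nM  END\n$$$$\n"
sdf_text = "short_title" + record_tail + "X" * 90 + record_tail

too_long = find_long_sdf_titles(sdf_text)
```

This only inspects titles. File names would need a separate check on the path itself.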