sphuber / aiida-shell

AiiDA plugin that makes running shell commands easy.
MIT License

`ShellParser`: Fix output files with non-alphanumeric characters #10

Closed sphuber closed 2 years ago

sphuber commented 2 years ago

Fixes #8

Output files are attached using their filename as the link label. However, the validation rules for link labels are considerably more restrictive than those for filenames: a link label can only contain alphanumeric characters and non-consecutive underscores.

The `ShellParser` already replaced `.` characters, which commonly separate a filename from its extension, with underscores. These are not the only characters that are illegal in link labels, though; dashes, for example, are also invalid. The parser is updated to replace any invalid character with an underscore and to merge consecutive underscores into one.
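The replacement rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `sanitize_link_label` is hypothetical, and the character class assumes that "alphanumeric" means ASCII letters and digits.

```python
import re


def sanitize_link_label(filename: str) -> str:
    """Hypothetical helper illustrating the rule; not the plugin's actual code."""
    # Replace every character that is not an ASCII letter, digit or underscore.
    label = re.sub(r'[^0-9a-zA-Z_]', '_', filename)
    # Merge runs of underscores, since link labels forbid consecutive ones.
    return re.sub(r'_+', '_', label)


print(sanitize_link_label('output-file.txt'))    # output_file_txt
print(sanitize_link_label('some--weird..name'))  # some_weird_name
```

Note that this sketch does nothing about filenames that start with a digit.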

bilke commented 6 months ago

The current implementation does not work for filenames that start with a numeric character, see https://www.w3schools.com/python/ref_string_isidentifier.asp:

A valid identifier cannot start with a number, ...

A leading numeric character should be given a prefix in format_link_label(). I tested with aiida_ as the prefix:

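import re

def format_link_label(filename):
    # Python's re.sub does not support `$&`; use \g<0> to refer to the whole match.
    leading_numeric = re.sub(r'^[0-9]', r'aiida_\g<0>', filename)
    # Replace each run of invalid characters with a single underscore.
    alphanumeric = re.sub(r'[^0-9a-zA-Z_]+', '_', leading_numeric)
    link_label = re.sub(r'_[_]+', '_', alphanumeric)
    return link_label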

But perhaps there is a better prefix, such as aiida_shell_?
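To compare the two prefix candidates, here is a small sketch that takes the prefix as a parameter and checks the result with `str.isidentifier()`. The `prefix` parameter is an assumption made for illustration only and is not part of the plugin. The sketch also assumes, as the comment above suggests, that a label which passes `str.isidentifier()` is acceptable.

```python
import re


def format_link_label(filename: str, prefix: str = 'aiida_') -> str:
    """Sketch only: the ``prefix`` parameter is hypothetical."""
    # Replace each run of invalid characters with a single underscore.
    label = re.sub(r'[^0-9a-zA-Z_]+', '_', filename)
    # Prepend the prefix when the label would otherwise start with a digit.
    if re.match(r'[0-9]', label):
        label = prefix + label
    # Merge any remaining consecutive underscores into one.
    return re.sub(r'_+', '_', label)


for name in ('1-output.txt', 'result.dat'):
    for prefix in ('aiida_', 'aiida_shell_'):
        label = format_link_label(name, prefix)
        print(f'{name!r} with prefix {prefix!r}: {label!r}, identifier={label.isidentifier()}')
```

Either prefix yields a valid identifier, so the choice is mainly about readability and about avoiding clashes with other output labels.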