A dedicated indexing script has a number of advantages:
Implement complicated logic for optimising indexing, without having to alter the basic API which is ill suited for this task
Could impose some restrictions on indexing to improve quality of the database meta-data. Specifically could require a metadata.yaml file to be present with some essential fields filled before indexing the data
Requiring a metadata.yaml file would simplify the process of deciding what is an experiment, as it is then simply a directory containing a metadata.yaml file
With above changes could use a yaml configuration file to define which directories to index
A config file would allow for the concept of "collections" with associated meta-data, i.e. an extra layer in the hierarchy in the DB. This would allow for experiment name degeneracy, as experiment names might only have to be unique within collections.
A dedicated indexing script has a number of advantages:
metadata.yaml
file to be present with some essential fields filled before indexing the datametadata.yaml
file would simplify the process of deciding what is an experiment, as it is then simply a directory containing ametadata.yaml
fileyaml
configuration file to define which directories to index