TUM-DAML / seml

SEML: Slurm Experiment Management Library

Simple fix for no-hash duplicate detection in Mongo database #89

Closed: LoadingByte closed 2 years ago

LoadingByte commented 2 years ago

Expected Behavior

seml XXX add YYY --no-hash should behave the same as seml XXX add YYY: configs that are already in the database should not be added again.

Actual Behavior

It doesn't once your config has nested entries: in that case, --no-hash adds duplicate configs to the database.

Fix

The offending statement is (https://github.com/TUM-DAML/seml/blob/master/seml/add.py#L43-L45):

lookup_dict = {
    f'config.{key}': value for key, value in config.items()
}

Once config contains nested dictionaries, MongoDB no longer correctly handles the query that uses the above dict as its filter. This can be fixed by replacing those three lines with, for example:

lookup_dict = flatten({'config': config})

where flatten is imported from seml.utils.
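
To illustrate the difference between the two filter dicts, here is a minimal, self-contained sketch. The helper flatten_to_dot_keys is a hypothetical stand-in written for this example; it is not the code of seml.utils.flatten and may differ from it in signature and details. The example config is made up as well.

```python
def flatten_to_dot_keys(nested, parent_key="", sep="."):
    """Hypothetical stand-in for seml.utils.flatten (not the library's code).

    Turns nested dicts into a single-level dict with dot-separated keys.
    """
    flat = {}
    for key, value in nested.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict) and value:
            # Recurse into non-empty sub-dicts so every leaf gets its own key.
            flat.update(flatten_to_dot_keys(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Made-up config with one nested entry.
config = {"seed": 1, "model": {"lr": 0.01, "layers": 2}}

# Original statement: only the top level gets a dotted key,
# so the nested dict ends up as a single filter value.
old_lookup = {f"config.{key}": value for key, value in config.items()}

# Suggested fix: every leaf value gets its own dotted key.
new_lookup = flatten_to_dot_keys({"config": config})

print(old_lookup)
print(new_lookup)
```

With the flattened dict, each leaf value is matched through its own dot-notation key, instead of the whole sub-dict being passed as one value.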

gasteigerjo commented 2 years ago

Thank you for finding this and suggesting this elegant solution! I just pushed the suggested fix. :)