Closed (jotelha closed 4 years ago)
Thanks! Can you clarify the reason for switching from dot to arrow? You wrote "allows for nested queries to be stored" but I don't think I understood why a nested query can't be stored if there is a "." in it.
Queries are expected as nested dicts rather than plain strings, so aliasing the . (dot) separator as -> makes it possible to store queries like this:
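import pymongo  # needed for pymongo.DESCENDING

ft = GetFilesByQueryTask(
    query={
        # '->' stands in for '.' so the keys can be stored in MongoDB
        'metadata->project': project_id,
        'metadata->type': 'surfactant_file',
    },
    sort_key='metadata.datetime',
    sort_direction=pymongo.DESCENDING,
    limit=1,
    new_file_names=['default.pdb'])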
If I remember correctly, the MongoDB language does not allow for dots in keys. It's the same issue as in the dict_mods.py file at https://github.com/materialsproject/fireworks/blob/07bace776fedefd09907272334a2c5925ffce51d/fireworks/utilities/dict_mods.py#L55-L58.
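For illustration, the arrow aliasing described above can be sketched as a small recursive key rewrite. This is only a minimal sketch, not the code in dict_mods.py: the helper name arrows_to_dots and the example project value are hypothetical.

```python
def arrows_to_dots(obj):
    """Return a copy of obj with '->' replaced by '.' in every string dict key."""
    if isinstance(obj, dict):
        return {
            (key.replace("->", ".") if isinstance(key, str) else key): arrows_to_dots(value)
            for key, value in obj.items()
        }
    if isinstance(obj, list):
        # Queries may contain lists of sub-documents, so recurse into them too.
        return [arrows_to_dots(item) for item in obj]
    return obj


# Stored form: no dots in the keys, so it can be written to MongoDB as-is.
stored_query = {
    "metadata->project": "my_project",  # hypothetical project id
    "metadata->type": "surfactant_file",
}

# Form actually sent to MongoDB when the query is executed.
mongo_query = arrows_to_dots(stored_query)
```

The stored document keeps the arrows; the conversion would only happen right before the query is run.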
Ah yes, I remember - MongoDB doesn't allow storing dictionaries where the keys have a dot in them. So storing the parameter:
{"query": {"key.subkey": "value"}}
can't be done - making it difficult to serialize the FireTask. The arrows should indeed make it possible to store the query and thereby serialize the Firetask. Merging this now along with the other improvements, thanks!
Exactly. On a related note, I believe it is similarly not possible to store any query involving $-prefixed operators, e.g.
{'metadata.datetime': {'$gt': '2020'} }
thus it might be a good idea to store queries as plain strings instead. Are there any mongo-language-specific serialization recommendations for query documents?
I don't know of any mongo language specific serialization recommendations; it's possible that a simple string is best.
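A rough sketch of what the plain-string idea could look like, using only the standard json module. This is an assumption about one possible approach, not something FireWorks does: the query is dumped to a string before storing and parsed again at run time, so neither dots nor $-prefixed operators ever appear as keys in the stored document.

```python
import json

# A query with both a dotted key and a $-prefixed operator.
query = {"metadata.datetime": {"$gt": "2020"}}

# Store the query as a plain string value; the only key MongoDB sees is "query".
stored = {"query": json.dumps(query)}

# At run time, parse the string back into a dict before handing it to MongoDB.
restored = json.loads(stored["query"])
```

The obvious drawback is that values which are not JSON-serializable (datetime objects, for instance) would need extra handling.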
As an aside, it looks like as of MongoDB 3.6+, dots are allowed in key names. But dollar sign prefixes are still prohibited:
https://docs.mongodb.com/manual/reference/limits/#Restrictions-on-Field-Names
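To check up front whether a parameter dict would run into these restrictions, something like the following could be used. It is a sketch under assumptions: find_unsafe_keys is a hypothetical helper, and the forbid_dots switch only mirrors the version difference mentioned above rather than any exact server rule.

```python
def find_unsafe_keys(doc, forbid_dots=True, path=()):
    """Return the key paths whose last key starts with '$' or (optionally) contains '.'."""
    unsafe = []
    if isinstance(doc, dict):
        for key, value in doc.items():
            key_path = path + (key,)
            if isinstance(key, str) and (
                key.startswith("$") or (forbid_dots and "." in key)
            ):
                unsafe.append(key_path)
            # Keep descending: nested documents are subject to the same rules.
            unsafe.extend(find_unsafe_keys(value, forbid_dots, key_path))
    elif isinstance(doc, list):
        for item in doc:
            unsafe.extend(find_unsafe_keys(item, forbid_dots, path))
    return unsafe


dotted = {"query": {"key.subkey": "value"}}
with_operator = {"query": {"metadata->datetime": {"$gt": "2020"}}}

dotted_problems = find_unsafe_keys(dotted)
operator_problems = find_unsafe_keys(with_operator)
```

With forbid_dots=False the dotted example passes, while the $gt example is flagged either way.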
A suggestion:
Turns the previously contributed "GetFilesByQueryTask" into something useful:
Other additions:
Best regards,
Johannes