neuroailab / tfutils

Utilities for working with tensorflow
MIT License

Issue with validating previously trained models #60

Closed anayebi closed 7 years ago

anayebi commented 7 years ago

When loading previously trained models to validate them, I keep running into this error: pymongo.errors.OperationFailure: database error: Invalid ns [tnn_vanilla_alexnet_nesterov_full_training_dropout_trainval1___RECENT.$cmd].

I'm using base.test_from_params() and followed the tutorial in tests.py. All the filters have been cached, yet line 321 of base.py looks for a __RECENT database and for some reason can't find it.

I thought base.train_from_params() should cache the filters locally during training and save them in a __RECENT database via the cache_filters_freq entry.

anayebi commented 7 years ago

The issue is due to the restriction that Mongo database names must be less than 64 characters long. When the original dbname, collname, and exp_id are concatenated with __RECENT, the result exceeds that limit, so pymongo silently fails to construct the database. The OperationFailure then appears later, when count() is called in line 321 of base.py and the database cannot be found.
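For anyone hitting the same thing, here is a minimal sketch of checking the name length up front. The helper names (recent_dbname, dbname_fits) are hypothetical, and joining the parts with underscores is only my assumption about how the __RECENT name is built; this is not code from base.py.

```python
# MongoDB requires database names to be shorter than 64 characters.
MAX_DBNAME_LEN = 63


def recent_dbname(dbname, collname, exp_id):
    """Hypothetical helper: build the __RECENT database name.

    Assumes the parts are joined with underscores; the real scheme
    in base.py may differ.
    """
    return "_".join([dbname, collname, exp_id, "__RECENT"])


def dbname_fits(name, max_len=MAX_DBNAME_LEN):
    """Return True if `name` is within the database name length limit."""
    return len(name) <= max_len


# The name taken from the error message above.
name = "tnn_vanilla_alexnet_nesterov_full_training_dropout_trainval1___RECENT"
print(len(name), dbname_fits(name))
```

Running a check like this before training starts would surface the problem immediately instead of at validation time.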

To remedy this, I will add a function, to be called in line 321 of base.py, that catches the OperationFailure and adds a more informative message saying that the database name may be too long.
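A rough sketch of what that function could look like. The name explain_failure and the wording of the hint are placeholders, not the final implementation, and the commented usage assumes hypothetical collection and dbname variables.

```python
# MongoDB requires database names to be shorter than 64 characters.
MAX_DBNAME_LEN = 63


def explain_failure(err, dbname, max_len=MAX_DBNAME_LEN):
    """Return the error text, plus a hint if the database name is too long.

    `err` is the caught exception; `dbname` is the database name in use.
    """
    msg = str(err)
    if len(dbname) > max_len:
        msg += (
            " (hint: database name %r is %d characters long; "
            "MongoDB requires fewer than %d)"
            % (dbname, len(dbname), max_len + 1)
        )
    return msg


# Intended use around the count() call (not run here, since it needs
# pymongo and a live server; `collection` and `dbname` are placeholders):
#
# try:
#     n = collection.count()
# except pymongo.errors.OperationFailure as err:
#     raise pymongo.errors.OperationFailure(explain_failure(err, dbname))
```

Keeping the message logic in a plain function means it can be tested without a running Mongo instance.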