Just a quick update.
I wasn't able to get the results to print automatically. However, I was able to use the following code to load and show the `_results.h5` file created during training:
```python
import flammkuchen
import logging

logging.basicConfig(level=logging.INFO)

res = flammkuchen.load('pathway_results.h5')  # load the _results.h5 file
print(res)  # print the results
```
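If you want a quick overview before digging in, flammkuchen typically loads an HDF5 file as a nested dict, so (assuming that default here) you can list the top-level entries first:

```python
# Assuming res is a nested dict (flammkuchen's default for HDF5 files),
# list the top-level entries and their types before printing everything:
for key, value in res.items():
    print(key, type(value))
```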
Still no luck with parameter optimization within Python, though...
Hi Jan and team,
Any thoughts on how I can best optimize my parameters within Python?
I am working to automate the detection of chimpanzee vocalizations (pant-hoots), and I'd like to investigate different network configurations in combination with different spectrogram denoising techniques (e.g., frequency removal, spectral subtraction). The parameters I'd like to test include:

1) TCN blocks: 2, 3, 4
2) Number of filters: 32, 64, 96
3) Learning rate: 0.0001, 0.00001
However, this ends up being quite a large search space. How can I best test out different configurations and compare their performance?
Thanks!
Hi, glad you figured out a way to access the test results.
Regarding the optimization: the search space is not that big, only 3 x 3 x 2 = 18 combinations. You could brute-force this and try all combinations using `das.train.train` and a for loop (see the sketch below).
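A minimal sketch of such a loop (assuming `das.train.train` accepts the `nb_conv`, `learning_rate`, and `save_prefix` arguments that appear elsewhere in this thread; adjust the paths and any extra arguments to your data):

```python
import itertools

import das.train

# The grid from the parameters listed above
nb_convs = [2, 3, 4]
nb_filters_values = [32, 64, 96]
learning_rates = [0.0001, 0.00001]

for nb_conv, nb_filters, learning_rate in itertools.product(
        nb_convs, nb_filters_values, learning_rates):
    # save_prefix keeps the 18 runs from overwriting each other
    prefix = f'conv{nb_conv}_filt{nb_filters}_lr{learning_rate}'
    das.train.train(data_dir='tutorial_data.npy', save_dir='res',
                    save_prefix=prefix,
                    nb_conv=nb_conv, nb_filters=nb_filters,
                    learning_rate=learning_rate)
```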
We also have experimental support for automatic parameter tuning via Keras Tuner in `das.train_tune`. The interface is similar to `das.train`. In Python:

```python
das.train_tune.train(data_dir='tutorial_data.npy', save_dir='res', kernel_size=3, tune_config='tune.yml')
```

There is also a CLI that you can access via `das tune`.
Crucially, it accepts a yaml file with the parameter names and values you want to optimize. In your case, the `tune.yml` file would look like this:
```yaml
nb_conv: [2, 3, 4]
nb_filters: [32, 64, 96]
learning_rate: [0.0001, 0.00001]
```
The tuner will then run a bunch of fits and find an optimal parameter combination in this search space; see the Keras Tuner documentation for how this works.
But again, in your case I think it would be easier to just run all 18 fits a couple of times to optimize these parameters.
Good luck, and let me know how this goes, in particular if you use the tuner and run into issues, since we have so far only used it internally. I'm also happy to have a look at your data with you to give some advice on the model parameters.
Hi Jan,
Great, I will first try a for loop, but I may also try the automatic tuner to see how it works and whether I end up with different results. I will let you know how it goes. I am occasionally running into memory issues, however, so hopefully that won't be a problem.
It would be great to get your input on model parameters, though! I will send an email with my data to the address listed on your site.
Many thanks for your suggestions.
Hi Jan,
I'm trying out the automatic tuner via Keras Tuner and am running into some issues. I started with something simple just to see if it works, but I'm not seeing any results and I'm getting an error I don't understand. Any suggestions?
My yml file:

```yaml
nb_filters: [32, 64, 96]
```
Code I am running:

```python
das.train_tune.train(data_dir='normalized_FR.npy', save_dir='normalized_FR.res', kernel_size=16, nb_epoch=4, tune_config='tune.yml')
```
Output and error message:

```
Trial 2 Complete [00h 01m 44s]
val_loss: nan

Best val_loss So Far: nan
Total elapsed time: 00h 04m 27s
INFO:tensorflow:Oracle triggered exit
INFO:tensorflow:Oracle triggered exit

Results summary
Results in normalized_FR.res\20220921_104822
Showing 10 best trials
Objective(name='val_loss', direction='min')
Trial summary
Hyperparameters:
nb_filters: 32
Score: nan
Trial summary
Hyperparameters:
nb_filters: 16
Score: nan

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In [4], line 3
      1 # experimental support for automatic parameter tuning via keras tuner in das.train_tune. The interface is similar to das.train
      2 # importantly it accepts a yaml file with the parameter names and values you want to optimize.
----> 3 das.train_tune.train(data_dir='normalized_FR.npy', save_dir='normalized_FR.res', kernel_size=16, nb_epoch=4, tune_config='tune.yml')

File ~\miniconda3\envs\das\lib\site-packages\das\train_tune.py:592, in train(data_dir, x_suffix, y_suffix, save_dir, save_prefix, save_name, model_name, nb_filters, kernel_size, nb_conv, use_separable, nb_hist, ignore_boundaries, batch_norm, nb_pre_conv, pre_nb_dft, pre_kernel_size, pre_nb_filters, pre_nb_conv, upsample, dilations, nb_lstm_units, verbose, batch_size, nb_epoch, learning_rate, reduce_lr, reduce_lr_patience, fraction_data, seed, batch_level_subsampling, augmentations, tensorboard, wandb_api_token, wandb_project, wandb_entity, log_messages, nb_stacks, with_y_hist, balance, version_data, tune_config, nb_tune_trials, _qt_progress)
    590 else:
    591     logging.info('re-loading last best model')
--> 592     model, params = utils.load_model_and_params(params['save_name'])
    594 logging.info('predicting')
    595 # TODO: Need to update params with best hyperparams (e.g. nb-hist)

File ~\miniconda3\envs\das\lib\site-packages\das\utils.py:126, in load_model_and_params(model_save_name, model_dict, custom_objects)
    115 """[summary]
    116
    117 Args:
    (...)
    123     keras.Model, Dict[str, Any]: [description]
    124 """
    125 params = load_params(model_save_name)
--> 126 model = load_model(model_save_name, model_dict=model_dict, custom_objects=custom_objects)
    127 return model, params

File ~\miniconda3\envs\das\lib\site-packages\das\utils.py:43, in load_model(file_trunk, model_dict, model_ext, params_ext, compile, custom_objects)
     41 try:
     42     model_filename = _download_if_url(file_trunk + model_ext)
---> 43     model = keras.models.load_model(model_filename,
     44                                     custom_objects=custom_objects)
     45 except (SystemError, ValueError, AttributeError):
     46     logging.debug('Failed to load model using keras, likely because it contains custom layers. Will try to init model architecture from code and load weights from _model.h5 into it.', exc_info=False)

File ~\miniconda3\envs\das\lib\site-packages\tensorflow\python\keras\saving\save.py:206, in load_model(filepath, custom_objects, compile, options)
    204 filepath = path_to_string(filepath)
    205 if isinstance(filepath, str):
--> 206     return saved_model_load.load(filepath, compile, options)
    208 raise IOError(
    209     'Unable to load model. Filepath is not an hdf5 file (or h5py is not '
    210     'available) or SavedModel.')

File ~\miniconda3\envs\das\lib\site-packages\tensorflow\python\keras\saving\saved_model\load.py:122, in load(path, compile, options)
    117 # TODO(kathywu): Add saving/loading of optimizer, compiled losses and metrics.
    118 # TODO(kathywu): Add code to load from objects that contain all endpoints
    119
    120 # Look for metadata file or parse the SavedModel
    121 metadata = saved_metadata_pb2.SavedMetadata()
--> 122 meta_graph_def = loader_impl.parse_saved_model(path).meta_graphs[0]
    123 object_graph_def = meta_graph_def.object_graph_def
    124 path_to_metadata_pb = os.path.join(path, constants.SAVED_METADATA_PATH)

File ~\miniconda3\envs\das\lib\site-packages\tensorflow\python\saved_model\loader_impl.py:118, in parse_saved_model(export_dir)
    116     raise IOError("Cannot parse file %s: %s." % (path_to_pbtxt, str(e)))
    117 else:
--> 118     raise IOError(
    119         "SavedModel file does not exist at: %s%s{%s|%s}" %
    120         (export_dir, os.path.sep, constants.SAVED_MODEL_FILENAME_PBTXT,
    121          constants.SAVED_MODEL_FILENAME_PB))

OSError: SavedModel file does not exist at: normalized_FR.res/20220921_104822_model.h5{saved_model.pbtxt|saved_model.pb}
```
Thanks for giving this a try.
Looks like it just fails to train - loss is nan. Does the same model work when using the regular training? Something like this?

```python
das.train_tune.train(data_dir='normalized_FR.npy', save_dir='normalized_FR.res', kernel_size=16, nb_epoch=4, nb_filters=32)
```
I'll test this on my side as well just to make sure.
When running the code provided, I had the same issue as before: loss is nan.

```python
das.train_tune.train(data_dir='normalized_FR.npy', save_dir='normalized_FR.res', kernel_size=16, nb_epoch=4, nb_filters=32)
```

While it didn't work, I could see it was attempting to run multiple trials. So, just out of interest to understand the code: why did it do this? I thought that because we specified an exact value for each argument, it would just run one trial. Or is it conducting a random search?
When you said 'regular training', though, did you mean `das.train.train()`?

```python
das.train.train(data_dir='normalized_FR.npy', save_dir='normalized_FR.res', kernel_size=16, nb_epoch=4, nb_filters=32)
```
Running this code worked as it should.
Oh, yes, I meant `das.train.train`, sorry about the confusion.
The tuning works by running different trials: it fits models with different parameters. Based on the results of the current trial, the most promising set of parameters for the next trial is selected. So it's doing something a bit smarter than a random search.
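For context, here is a rough sketch of the general Keras Tuner pattern. This is a generic stand-in, not DAS's internal code, and the tiny dense model and choice of oracle are purely illustrative:

```python
from tensorflow import keras
import keras_tuner as kt

def build_model(hp):
    # The tuner calls this once per trial with a fresh hyperparameter set;
    # DAS builds its TCN here instead of this toy dense model.
    nb_filters = hp.Choice('nb_filters', [32, 64, 96])
    model = keras.Sequential([
        keras.layers.Dense(nb_filters, activation='relu'),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer='adam', loss='mse')
    return model

# The oracle proposes each trial's parameters based on earlier trials' scores,
# which is what makes this smarter than a pure random search.
tuner = kt.BayesianOptimization(build_model, objective='val_loss', max_trials=18)
# tuner.search(x_train, y_train, validation_data=(x_val, y_val), epochs=4)
```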
And I also get nan loss now - probably some change in keras tuner that breaks things in DAS. I'll fix this and let you know once it works.
That's a really nice feature. So the argument inputs we specify just give it a starting point and then it optimizes from there.
Ah ok, thanks for trying it. Sounds good.
I've updated `das.train_tune.train` to work with the new Keras Tuner API.
You can give this a try by updating to the latest version, 0.28.0, via pip (it's not on conda yet):

```shell
pip install das --upgrade --no-deps
```
Fabulous, thank you so much! I'll keep you posted.
Got it updated and was able to run it with a .yml file providing the specific values I wanted to test and compare. It seems to be running well and will be helpful for optimization. I find this easier than creating a for loop, and I think I will stick with it. Many thanks!
Great - closing this.
Dear Jan and team,
I am using das through Python and would like to fit the network with different configurations to compare performance and find the best model. I see there is a shell script for the command line in the terminal (janclemenslab: TRAIN), but is there a way to do this with the `train` function in the `das.train` module in Python?
Additionally, how can I get Python to print the training outputs? If I run my training in the GUI, I get a printout like the one below, which includes the f1-scores.
```
INFO:das.train:{'noise': {'precision': 0.9175990751370318, 'recall': 0.979656218576971, 'f1-score': 0.9476127362477954, 'support': 231717}, 'pulse': {'precision': 0.05627009646302251, 'recall': 0.015928398058252427, 'f1-score': 0.024828564672499408, 'support': 6592}, 'sine': {'precision': 0.8592689767483941, 'recall': 0.5747178602765423, 'f1-score': 0.6887611726526097, 'support': 33051}, 'accuracy': 0.9069243808962264, 'macro avg': {'precision': 0.6110460494494828, 'recall': 0.5234341589705885, 'f1-score': 0.5537341578576348, 'support': 271360}, 'weighted avg': {'precision': 0.8895708148582068, 'recall': 0.9069243808962264, 'f1-score': 0.8936685429716721, 'support': 271360}}
```
But I don't get this in Python. I am using the code below with the same files (including the test set) and parameters as in the GUI. Am I missing an argument to print the results?

```python
model, params = das.train.train(model_name='tcn', data_dir='Quickstart3.npy', save_dir='Quickstart3.res', nb_hist=256, kernel_size=16, nb_filters=16, batch_size=16, ignore_boundaries=True, verbose=1, nb_epoch=4, log_messages=True)
```
Any advice is much appreciated!