tensorflow / model-optimization

A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
https://www.tensorflow.org/model_optimization

prune_low_magnitude fails when wrapping a GRU layer, but succeeds with Conv2D and Dense #944

Open didadida-r opened 2 years ago

didadida-r commented 2 years ago


Describe the bug prune_low_magnitude works as expected when it wraps Dense and Conv2D layers, but when it wraps a GRU layer the model can no longer be saved: the ModelCheckpoint callback crashes with an h5py ValueError (full traceback below).

System information

TensorFlow version (installed from source or binary): 2.7.0

TensorFlow Model Optimization version (installed from source or binary): 0.7.1

Python version: 3.8

Describe the expected behavior A GRU layer wrapped with prune_low_magnitude should train and be saved by ModelCheckpoint the same way Dense and Conv2D layers are.

Describe the current behavior Hi, I can use prune_low_magnitude with Dense and Conv2D layers without problems. However, when prune_low_magnitude wraps a GRU layer, it fails. The GRU layer is configured as follows:

keras.layers.GRU(units=numUnits,
                 unroll=False,
                 return_sequences=True,
                 recurrent_activation='sigmoid',
                 return_state=False)
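
For reference, this is roughly how the layers are wrapped (a minimal sketch; the Dense and Conv2D arguments and the unit count are placeholders, not my actual model):

import tensorflow_model_optimization as tfmot
from tensorflow import keras

prune = tfmot.sparsity.keras.prune_low_magnitude
numUnits = 16  # placeholder

# Wrapping Dense and Conv2D works as expected.
pruned_dense = prune(keras.layers.Dense(64))
pruned_conv2d = prune(keras.layers.Conv2D(16, (3, 3)))

# Wrapping the GRU is what leads to the failure described below.
pruned_gru = prune(keras.layers.GRU(units=numUnits,
                                    unroll=False,
                                    return_sequences=True,
                                    recurrent_activation='sigmoid',
                                    return_state=False))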

Also, when I remove the ModelCheckpoint callback, training with the GRU runs fine, but then I cannot save the model. The checkpoint callback is configured as follows:

checkpoint_callback = ModelCheckpoint(os.path.join(exp_dir, 'model_{epoch:02d}.h5'),
                                       monitor='val_loss',
                                       save_best_only=False,
                                       save_weights_only=False,
                                       mode='auto',
                                       save_freq='epoch')
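
One possible workaround (an untested sketch, not a confirmed fix) would be to avoid the HDF5 writer entirely, e.g. by checkpointing weights only in the TF checkpoint format:

# Untested sketch: dropping the .h5 extension and setting save_weights_only=True
# makes Keras use the TF checkpoint format instead of h5py, which is where the
# crash below happens.
checkpoint_callback = ModelCheckpoint(os.path.join(exp_dir, 'model_{epoch:02d}'),
                                      monitor='val_loss',
                                      save_best_only=False,
                                      save_weights_only=True,
                                      mode='auto',
                                      save_freq='epoch')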

Code to reproduce the issue
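
Here is a minimal sketch along the lines of my setup (the shapes, unit counts and random data are placeholders, not my actual model):

import numpy as np
from tensorflow import keras
from tensorflow.keras.callbacks import ModelCheckpoint
import tensorflow_model_optimization as tfmot

prune = tfmot.sparsity.keras.prune_low_magnitude

# Tiny model with a pruned GRU layer; all shapes are placeholders.
inputs = keras.Input(shape=(20, 8))
x = prune(keras.layers.GRU(units=16,
                           return_sequences=True,
                           recurrent_activation='sigmoid'))(inputs)
outputs = keras.layers.Dense(1)(x)
model = keras.Model(inputs, outputs)
model.compile(optimizer='adam', loss='mse')

x_train = np.random.rand(32, 20, 8).astype('float32')
y_train = np.random.rand(32, 20, 1).astype('float32')

callbacks = [
    tfmot.sparsity.keras.UpdatePruningStep(),
    # Saving the full model to .h5 each epoch appears to be what triggers the
    # ValueError in the traceback below.
    ModelCheckpoint('model_{epoch:02d}.h5', save_weights_only=False, save_freq='epoch'),
]
model.fit(x_train, y_train, epochs=1, batch_size=8, callbacks=callbacks)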

Full log output and traceback:

2022-03-15 15:41:00.249444: I tensorflow/stream_executor/cuda/cuda_dnn.cc:366] Loaded cuDNN version 8101
2022-03-15 15:41:00.626781: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
10/10 [==============================] - ETA: 0s - loss: 867.6863 - ola_layer_loss: 6.6291 - mask_layer_loss: 16.1240 - vad_output_loss: 1.0971/home/test/dev/anaconda3/envs/tf2.7_py3.8/lib/python3.8/site-packages/keras/engine/functional.py:1410: CustomMaskWarning: Custom mask layers require a config and must override get_config. When loading, the custom mask layer must be passed to the custom_objects argument.
  layer_config = serialize_layer_fn(layer)
Traceback (most recent call last):
  File "dpcrn/bin/train.py", line 97, in <module>
    main()
  File "dpcrn/bin/train.py", line 87, in main
    excuctor.train(loss_wrapper=loss_wrapper, 
  File "/home/test/code/speech_enhance/DPCRN_DNS3/dpcrn/utils/executor.py", line 136, in train
    self.model.fit_generator(data_generator.generator(batch_size=batch_size, validation=False), 
  File "/home/test/dev/anaconda3/envs/tf2.7_py3.8/lib/python3.8/site-packages/keras/engine/training.py", line 2016, in fit_generator
    return self.fit(
  File "/home/test/dev/anaconda3/envs/tf2.7_py3.8/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/test/dev/anaconda3/envs/tf2.7_py3.8/lib/python3.8/site-packages/h5py/_hl/group.py", line 149, in create_dataset
    dsid = dataset.make_new_dset(group, shape, dtype, data, name, **kwds)
  File "/home/test/dev/anaconda3/envs/tf2.7_py3.8/lib/python3.8/site-packages/h5py/_hl/dataset.py", line 142, in make_new_dset
    dset_id = h5d.create(parent.id, name, tid, sid, dcpl=dcpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5d.pyx", line 87, in h5py.h5d.create
ValueError: Unable to create dataset (name already exists)
2022-03-15 15:41:04.771988: W tensorflow/core/kernels/data/generator_dataset_op.cc:107] Error occurred when finalizing GeneratorDataset iterator: FAILED_PRECONDITION: Python interpreter state is not initialized. The process may be terminated.
         [[{{node PyFunc}}]]
