dvgodoy / deepreplay

Deep Replay - Generate visualizations as in my "Hyper-parameters in Action!" series!
https://towardsdatascience.com/hyper-parameters-in-action-a524bf5bf1c
MIT License
270 stars 48 forks

ValueError: Unable to create group (name already exists) problem #28

Open clevilll opened 2 years ago

clevilll commented 2 years ago

Hi, I recently ran into a problem when using this package. The traceback is as follows:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-43-c3b5d8180301> in <module>()
----> 1 model.fit(X, y, epochs=50, batch_size=16, callbacks=[replay])

2 frames
/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py in error_handler(*args, **kwargs)
     65     except Exception as e:  # pylint: disable=broad-except
     66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
     68     finally:
     69       del filtered_tb

/usr/local/lib/python3.7/dist-packages/deepreplay/callbacks.py in on_train_begin(self, logs)
     83         self.n_epochs = self.params['epochs']
     84 
---> 85         self.group = self.handler.create_group(self.group_name)
     86         self.group.attrs['samples'] = self.params['samples']
     87         self.group.attrs['batch_size'] = self.params['batch_size']

/usr/local/lib/python3.7/dist-packages/h5py/_hl/group.py in create_group(self, name, track_order)
     63             name, lcpl = self._e(name, lcpl=True)
     64             gcpl = Group._gcpl_crt_order if track_order else None
---> 65             gid = h5g.create(self.id, name, lcpl=lcpl, gcpl=gcpl)
     66             return Group(gid)
     67 

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/h5g.pyx in h5py.h5g.create()

ValueError: Unable to create group (name already exists)

I shared the Google Colab notebook, which I could run on Aug 2, 2019, but in 2021 it started raising KeyError: 'samples'. Please check the notebook and feel free to run it for quick debugging.

Following is the configuration and package versions in google colab:

matplotlib==3.2.2
matplotlib-inline==0.1.3
matplotlib-venn==0.11.6
numpy==1.19.5
pandas==1.1.5
pandas-datareader==0.9.0
pandas-gbq==0.13.3
pandas-profiling==1.4.1
scikit-learn==1.0.1
scipy==1.4.1
seaborn==0.11.2
sklearn-pandas==1.8.0
Python 3.7.12

This is the full code:

from keras.models import Sequential
from keras.layers import Dense
#from keras.optimizers import SGD
from tensorflow.keras.optimizers import SGD
from keras.initializers import glorot_normal, normal

model = Sequential()
model.add(Dense(input_dim=2,
                units=2,
                activation='sigmoid',
                kernel_initializer=glorot_normal(seed=42),
                name='hidden'))
model.add(Dense(units=1,
                activation='sigmoid',
                kernel_initializer=normal(seed=42),
                name='output'))

model.compile(loss='binary_crossentropy', optimizer=SGD(learning_rate=0.05), metrics=['acc'])

from deepreplay.callbacks import ReplayData
from deepreplay.datasets.parabola import load_data
from deepreplay.replay import Replay

X, y = load_data()

replay = ReplayData(X, y, filename='hyperparms_in_action.h5', group_name='part1')
model.fit(X, y, epochs=50, batch_size=16, callbacks=[replay])

I found similar open issues here and here related to saving a model to an HDF5 file. Based on some suggestions (#12195 and here), I tried saving the model in the tf format instead of h5 via replay_filename, which was unsuccessful. There are many posts about this on Stack Overflow. Any help will be highly appreciated.

dvgodoy commented 2 years ago

Hi,

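If you run the code multiple times, it raises this error because it doesn't overwrite the data inside a given group contained in the H5 file. So, you have two options:

  • delete the .h5 file and run it again
  • change the group name every time you run the same code (replay = ReplayData(X, y, filename='hyperparms_in_action.h5', group_name='part1_001'))

Hopefully this will solve your issue.

Best, Daniel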
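
The two workarounds for this error (deleting the .h5 file before a rerun, or writing to a new group name on each run) can be sketched with the standard library alone. The helper names `unique_group_name` and `remove_if_exists` are hypothetical and are not part of deepreplay; this is only an illustration under that assumption.

```python
import datetime as dt
import os


def unique_group_name(base, now=None):
    """Return `base` plus a timestamp suffix, e.g. 'part1_20211201_103005'."""
    now = now or dt.datetime.now()
    # The suffix avoids '/', which HDF5 treats as a path separator in names.
    return f"{base}_{now.strftime('%Y%m%d_%H%M%S')}"


def remove_if_exists(path):
    """Delete `path` if it is present; return True if a file was removed."""
    if os.path.exists(path):
        os.remove(path)
        return True
    return False


# Option 1: start from a clean file before building the callback.
# remove_if_exists('hyperparms_in_action.h5')

# Option 2: keep the file, but write into a new group on every run.
# replay = ReplayData(X, y, filename='hyperparms_in_action.h5',
#                     group_name=unique_group_name('part1'))
```

Note that the timestamp has one-second resolution, so two runs within the same second would still produce the same group name.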

clevilll commented 2 years ago

Hi,

If you run the code multiple times, it raises this error, because it doesn't overwrite the data inside a given group contained in the H5 file. So, you have two options:

  • delete the .h5 file and run it again
  • change the group name every time you run the same code (replay = ReplayData(X, y, filename='hyperparms_in_action.h5', group_name='part1_001'))

Hopefully this will solve your issue.

Best, Daniel

Thanks for the reply. I picked the second option and replaced the script with the following, which changes the group name dynamically each time I run the cell:

import datetime as dt
dtime = dt.time()
now = dt.datetime.now()
zeit = now.strftime("%Y-%m-%d %H:%M:%S")
print(f"Last update of notebook:{zeit}")

X, y = load_data()

#replay = ReplayData(X, y, filename='hyperparms_in_action.h5', group_name='part1')
replay = ReplayData(X, y, filename='hyperparms_in_action.h5', group_name=f'part1_{zeit}')

However, the errors mentioned in this issue remain, and I couldn't figure out what's wrong when I run the code in the Google Colab notebook.

clevilll commented 2 years ago

@dvgodoy is there any update from your side?

dvgodoy commented 2 years ago

Hi,

The error you got, KeyError: 'samples', is the same one from issue #29. Please follow the instructions I wrote there to downgrade some of the packages, then restart the kernel, and you should be able to run it.

Hope it helps. Best, Daniel

clevilll commented 2 years ago

Hi, I still can't solve it after downgrading. Please see the Google Colab notebook.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-11-13c5c2ef98eb> in <module>
     17                 name='output'))
     18 # Compile the model
---> 19 model.compile(loss='binary_crossentropy', optimizer=SGD(learning_rate=0.05), metrics=['acc'])
     20 
     21 from deepreplay.callbacks import ReplayData

1 frames
/usr/local/lib/python3.7/dist-packages/keras/engine/training.py in compile(self, optimizer, loss, metrics, loss_weights, sample_weight_mode, weighted_metrics, target_tensors, **kwargs)
     92                 `optimizer`, `loss`, `metrics` or `sample_weight_mode`.
     93         """
---> 94         self.optimizer = optimizers.get(optimizer)
     95         self.loss = loss or []
     96         self.metrics = metrics or []

/usr/local/lib/python3.7/dist-packages/keras/optimizers.py in get(identifier)
    766     else:
    767         raise ValueError('Could not interpret optimizer identifier: ' +
--> 768                          str(identifier))

ValueError: Could not interpret optimizer identifier: <tensorflow.python.keras.optimizer_v2.gradient_descent.SGD object at 0x7fb0ce9413d0>