maxhodak / keras-molecules

Autoencoder network for learning a continuous representation of molecular structures.
MIT License

Getting tables.exceptions.HDF5ExtError: HDF5 error back trace #68

Open vinayakumarr opened 7 years ago

vinayakumarr commented 7 years ago

When I run python preprocess.py data/smiles_50k.h5 data/processed.h5, it fails with an error. The full error is in the attached image. How can I correct this?

alainrichardt commented 7 years ago

Because the files are larger than 50MB, they are stored with git lfs

You need to install git lfs https://git-lfs.github.com/

then run

git lfs pull

to download the files
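A quick way to confirm this diagnosis is to check whether a data file is still a Git LFS pointer rather than real HDF5 data. Pointer files are small text files whose first line begins with "version https://git-lfs.github.com/spec/v1", so an HDF5 reader fails on them. The following is only a minimal sketch; the helper name is_lfs_pointer is hypothetical and not part of this repo.

```python
# Hypothetical helper (not part of keras-molecules): detect Git LFS pointer stubs.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` starts like a Git LFS pointer file."""
    with open(path, "rb") as handle:
        # Only the first bytes matter; real HDF5 files start with a binary signature.
        head = handle.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX
```

If is_lfs_pointer("data/smiles_50k.h5") returns True, the real file has not been downloaded yet and still needs to be fetched through Git LFS.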

vinayakumarr commented 7 years ago

Now it gives a different error when I try to run:

sudo python train.py data/processed.h5 model.h5 --epochs 20

Using Theano backend.
Traceback (most recent call last):
  File "train.py", line 65, in <module>
    main()
  File "train.py", line 43, in main
    model.create(charset, latent_rep_size = args.latent_dim)
  File "/home/sachin/vinay/chemistry/keras-molecules/molecules/model.py", line 23, in create
    _, z = self._buildEncoder(x, latent_rep_size, max_length)
  File "/home/sachin/vinay/chemistry/keras-molecules/molecules/model.py", line 81, in _buildEncoder
    return (vae_loss, Lambda(sampling, output_shape=(latent_rep_size,), name='lambda')([z_mean, z_log_var]))
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 585, in call
    output = self.call(inputs, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/keras/layers/core.py", line 659, in call
    return self.function(inputs, **arguments)
  File "/home/sachin/vinay/chemistry/keras-molecules/molecules/model.py", line 68, in sampling
    epsilon = K.random_normal(shape=(batch_size, latent_rep_size), mean=0., std = epsilon_std)
TypeError: random_normal() got an unexpected keyword argument 'std'

alainrichardt commented 7 years ago

This is due to a change in the Keras API: the parameter std has been renamed to stddev.

Change the code and submit a pull request :)

vinayakumarr commented 7 years ago

Yes, I have corrected it. I think you are using the same array as both data and label for training and for testing (train.py, line 54). Why? Also, are you passing the test data as validation data? Is there a separate program to calculate the accuracy on the test data set? I want to know whether the code does classification or prediction.

As I understand it, it is a kind of prediction. Am I right?

alainrichardt commented 7 years ago

I'm a lurker in this repo - I don't use the train/test code

pechersky commented 7 years ago

You're right, the latter should be "data_test". In general, "train_gen.py" should be used instead; it should be less demanding on your machine.

I wouldn't call an autoencoder or a VAE classification or prediction. Instead, I would call it representation learning, a la https://hips.seas.harvard.edu/blog/2013/02/04/predictive-learning-vs-representation-learning/

delton137 commented 7 years ago

For the record, if you are using the latest version of TensorFlow with Keras, the keyword has been renamed: std => stddev
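For code that has to run against both older and newer backends, one option is to pick the keyword at call time. The sketch below is an assumption-laden illustration, not the repo's code: random_normal_compat is a hypothetical helper, it relies on inspect.signature (Python 3), and it assumes the backend function exposes its keyword names in its signature.

```python
import inspect


def random_normal_compat(random_normal, shape, mean=0.0, std=1.0):
    """Call a backend `random_normal` with whichever spread keyword it accepts.

    `random_normal` is the backend function passed in by the caller;
    this helper is hypothetical and only chooses between `stddev` and `std`.
    """
    params = inspect.signature(random_normal).parameters
    if "stddev" in params:
        # Newer API spelling of the keyword.
        return random_normal(shape=shape, mean=mean, stddev=std)
    if "std" in params:
        # Older API spelling of the keyword.
        return random_normal(shape=shape, mean=mean, std=std)
    raise TypeError("random_normal accepts neither 'stddev' nor 'std'")
```

In the sampling function from the traceback above, the call would then become epsilon = random_normal_compat(K.random_normal, (batch_size, latent_rep_size), 0., epsilon_std). If only one Keras version needs to be supported, simply renaming the keyword in model.py is the simpler fix.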

dtchang commented 7 years ago

One way to resolve the exception is to checkout / download / replace the data files.

vinayakumarr commented 6 years ago

I get an error when I try to run:

python preprocess.py data/smiles_500k.h5 data/processed_500.h5

  File "preprocess.py", line 85, in <module>
    main()
  File "preprocess.py", line 72, in main
    apply_fn=lambda ch: np.array(map(one_hot_encoded_fn,
  File "preprocess.py", line 63, in create_chunk_dataset
    chunks=tuple([chunk_size]+list(dataset_shape[1:])))
  File "/home/vinay/chemistrytensor/local/lib/python2.7/site-packages/h5py/_hl/group.py", line 105, in create_dataset
    dsid = dataset.make_new_dset(self, shape, dtype, data, **kwds)
  File "/home/vinay/chemistrytensor/local/lib/python2.7/site-packages/h5py/_hl/dataset.py", line 76, in make_new_dset
    if isinstance(chunks, tuple) and (-numpy.array([ i>=j for i,j in zip(tmp_shape,chunks) if i is not None])).any():
TypeError: The numpy boolean negative, the - operator, is not supported, use the ~ operator or the logical_not function instead.