maxhodak / keras-molecules

Autoencoder network for learning a continuous representation of molecular structures.
MIT License

Problems with the model_500k.h5 #58

Open clin366 opened 7 years ago

clin366 commented 7 years ago

To test the model_500k.h5 in the data folder, I ran the following commands:

    python2.7 preprocess.py data/smiles_500k.h5 data/processed_500.h5
    python2.7 sample.py data/processed_500.h5 data/model_500k.h5 --target autoencoder

However, the second step fails with the following error:

        main()
      File "sample.py", line 90, in main
        autoencoder(args, model)
      File "sample.py", line 44, in autoencoder
        model.load(charset, args.model, latent_rep_size = latent_dim)
      File "/Users/flynn/Documents/desktop/GT_second_semester/song_lab/nanoparticle_research/molecule_BO/keras_molecule/keras-molecules/molecules/model.py", line 95, in load
        self.create(charset, weights_file = weights_file, latent_rep_size = latent_rep_size)
      File "/Users/flynn/Documents/desktop/GT_second_semester/song_lab/nanoparticle_research/molecule_BO/keras_molecule/keras-molecules/molecules/model.py", line 50, in create
        self.autoencoder.load_weights(weights_file)
      File "/usr/local/lib/python2.7/site-packages/keras/engine/topology.py", line 2500, in load_weights
        self.load_weights_from_hdf5_group(f)
      File "/usr/local/lib/python2.7/site-packages/keras/engine/topology.py", line 2585, in load_weights_from_hdf5_group
        K.batch_set_value(weight_value_tuples)
      File "/usr/local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 990, in batch_set_value
        assign_op = x.assign(assign_placeholder)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 575, in assign
        return state_ops.assign(self._variable, value, use_locking=use_locking)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 47, in assign
        use_locking=use_locking, name=name)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
        op_def=op_def)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2242, in create_op
        set_shapes_for_outputs(ret)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1617, in set_shapes_for_outputs
        shapes = shape_func(op)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1568, in call_with_requiring
        return call_cpp_shape_fn(op, require_shape_fn=True)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 610, in call_cpp_shape_fn
        debug_python_shape_fn, require_shape_fn)
      File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 675, in _call_cpp_shape_fn_impl
        raise ValueError(err.message)

ValueError: Dimension 2 in both shapes must be equal, but are 56 and 55 for 'Assign' (op: 'Assign') with input shapes: [9,1,56,9], [9,1,55,9].

Is model_500k.h5 incompatible with smiles_500k.h5? If so, how can I make use of model_500k.h5?

evanfeinberg commented 7 years ago

+1 to @clin366. I have exactly the same issue and would greatly appreciate the developer's advice on this.

pechersky commented 7 years ago

I am able to load the weights without issue, after freshly downloading both the smiles_500 and model_500 h5 files.

Can you do me a favor, and run the following:

    import pandas

    # Count the distinct characters used across all SMILES strings.
    data = pandas.read_hdf('data/smiles_500k.h5', 'table')
    setdata = data['structure'].apply(set)
    print(len(set.union(*setdata.values)))

clin366 commented 7 years ago

@pechersky Hi, thanks! I ran the code in the Python interpreter, and the result is 55:

    import pandas
    data = pandas.read_hdf('data/smiles_500k.h5', 'table')
    setdata = data['structure'].apply(set)
    print len(set.union(*setdata.values))
    55

In fact, after the preprocess step, I used H5VIEW to check the charset length, which seems all right (indices 0~55, with 0 for blank and 55 for "u"). I'm not sure where the mistake comes from. However, I'm running the project on a Mac Pro with TensorFlow (CPU version); maybe this causes the problem?
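The two numbers above (55 distinct characters in the raw SMILES, but charset indices 0~55 with a blank at index 0) would be consistent with a padding character being added to the charset during preprocessing. The sketch below only illustrates that off-by-one effect on toy data; `build_charset` and `pad_to` are hypothetical names, and it is an assumption, not something confirmed in this thread, that preprocess.py pads strings with spaces before collecting characters.

```python
def build_charset(smiles, pad_to=None):
    """Return the sorted set of characters used by a list of SMILES strings.

    If pad_to is given, each string is right-padded with spaces first,
    so the space character becomes part of the charset.
    """
    if pad_to is not None:
        smiles = [s.ljust(pad_to) for s in smiles]
    return sorted(set().union(*map(set, smiles)))


# Two toy SMILES strings (not taken from smiles_500k.h5).
toy = ["CCO", "c1ccccc1N"]

raw = build_charset(toy)                  # characters in the raw strings only
padded = build_charset(toy, pad_to=12)    # same strings, padded with spaces

# The padded charset has exactly one extra entry: the blank.
print(len(raw), len(padded))
```

If the weights were trained against a charset built one way and the processed file was built the other way, the character dimension would differ by one, which is the kind of 56 versus 55 mismatch reported in the error.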

clin366 commented 7 years ago

Hi, I have solved this issue by running the project on an Ubuntu system with tensorflow-gpu. However, the result of python2.7 sample.py data/processed_500.h5 data/model_500k.h5 --target autoencoder is as below:

    COc1cc(OC)c2C(=O)\C(=C\c3ccc(cc3)c4ccncc4)\Oc2c1
    NSA@H[(SM++[.SNg+[.SAA@HHHHHHHHHSSAAHHHH+HSNN)[NN

As I understand it, the first line is the input chemical sequence, but I'm not sure what the second line represents. It looks strange to me.

hsiaoyi0504 commented 7 years ago

I think it's the output. It looks like garbage, but it does consist of characters from the character set. See the issue I opened in #54.

zalperst commented 7 years ago

I am getting a similar error; could someone comment on this? I am using tensorflow-gpu on Linux and have experimented with different versions, but I still get the error:

    Traceback (most recent call last):
      File "sample.py", line 97, in <module>
        main()
      File "sample.py", line 90, in main
        autoencoder(args, model)
      File "sample.py", line 44, in autoencoder
        model.load(charset, args.model, latent_rep_size = latent_dim)
      File "/home/zalperstein/maxhodak_harvard/keras-molecules/molecules/model.py", line 95, in load
        self.create(charset, weights_file = weights_file, latent_rep_size = latent_rep_size)
      File "/home/zalperstein/maxhodak_harvard/keras-molecules/molecules/model.py", line 50, in create
        self.autoencoder.load_weights(weights_file)
      File "/home/zalperstein/maxhodak_harvard/maxhodack_env/local/lib/python2.7/site-packages/keras/engine/topology.py", line 2538, in load_weights
        load_weights_from_hdf5_group(f, self.layers)
      File "/home/zalperstein/maxhodak_harvard/maxhodack_env/local/lib/python2.7/site-packages/keras/engine/topology.py", line 2970, in load_weights_from_hdf5_group
        K.batch_set_value(weight_value_tuples)
      File "/home/zalperstein/maxhodak_harvard/maxhodack_env/local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2148, in batch_set_value
        assign_op = x.assign(assign_placeholder)
      File "/home/zalperstein/maxhodak_harvard/maxhodack_env/local/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 512, in assign
        return state_ops.assign(self._variable, value, use_locking=use_locking)
      File "/home/zalperstein/maxhodak_harvard/maxhodack_env/local/lib/python2.7/site-packages/tensorflow/python/ops/state_ops.py", line 270, in assign
        validate_shape=validate_shape)
      File "/home/zalperstein/maxhodak_harvard/maxhodack_env/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 47, in assign
        use_locking=use_locking, name=name)
      File "/home/zalperstein/maxhodak_harvard/maxhodack_env/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
        op_def=op_def)
      File "/home/zalperstein/maxhodak_harvard/maxhodack_env/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2338, in create_op
        set_shapes_for_outputs(ret)
      File "/home/zalperstein/maxhodak_harvard/maxhodack_env/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1719, in set_shapes_for_outputs
        shapes = shape_func(op)
      File "/home/zalperstein/maxhodak_harvard/maxhodack_env/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1669, in call_with_requiring
        return call_cpp_shape_fn(op, require_shape_fn=True)
      File "/home/zalperstein/maxhodak_harvard/maxhodack_env/local/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 610, in call_cpp_shape_fn
        debug_python_shape_fn, require_shape_fn)
      File "/home/zalperstein/maxhodak_harvard/maxhodack_env/local/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 676, in _call_cpp_shape_fn_impl
        raise ValueError(err.message)
    ValueError: Dimension 1 in both shapes must be equal, but are 51 and 55 for 'Assign' (op: 'Assign') with input shapes: [9,51,9], [9,55,9].

pechersky commented 7 years ago

As far as I can tell, the model_500k.h5 that is in the data folder is older than the current preprocess code. I'd suggest trying sample_gen.py directly from the smiles datafiles. I'd try it and let you know, but I'm having issues with Theano default value settings.
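One way to check whether a weights file and a freshly built model disagree is to compare their array shapes axis by axis. The following is a rough sketch, not part of the repository: `mismatched_axes` and `weight_shapes` are hypothetical helpers, and how the datasets are named inside model_500k.h5 is not something this thread confirms, so the second helper simply lists whatever datasets the file contains.

```python
def mismatched_axes(shape_a, shape_b):
    """Return (axis, size_a, size_b) for every axis where two shapes differ."""
    if len(shape_a) != len(shape_b):
        raise ValueError("rank differs: %d vs %d" % (len(shape_a), len(shape_b)))
    return [(i, a, b) for i, (a, b) in enumerate(zip(shape_a, shape_b)) if a != b]


def weight_shapes(path):
    """Map each dataset name in an HDF5 file to its shape (requires h5py)."""
    import h5py  # imported lazily so mismatched_axes works without h5py

    shapes = {}
    with h5py.File(path, "r") as f:
        def record(name, obj):
            # Only datasets carry array data; groups are skipped.
            if isinstance(obj, h5py.Dataset):
                shapes[name] = obj.shape
        f.visititems(record)
    return shapes


# Shapes copied from the first error message in this thread.
print(mismatched_axes((9, 1, 56, 9), (9, 1, 55, 9)))
```

Calling `weight_shapes('data/model_500k.h5')` and looking for a dimension of 55 or 56 may show which charset length the stored weights were trained with, which can then be compared against the charset in the processed data file.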

zalperst commented 7 years ago

Almost the same error:

    python sample_gen.py data/smiles_50k.h5 data/model_500k.h5 --target autoencoder
    Using TensorFlow backend.
    I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
    I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
    I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
    I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
    I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
    I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 0 with properties:
    name: Quadro K2100M
    major: 3 minor: 0 memoryClockRate (GHz) 0.6665
    pciBusID 0000:01:00.0
    Total memory: 1.95GiB
    Free memory: 1.32GiB
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:972] DMA: 0
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 0: Y
    I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Quadro K2100M, pci bus id: 0000:01:00.0)
    Traceback (most recent call last):
      File "sample_gen.py", line 127, in <module>
        main()
      File "sample_gen.py", line 120, in main
        autoencoder(args, model)
      File "sample_gen.py", line 55, in autoencoder
        model.load(datobj.chars, args.model, latent_rep_size = latent_dim)
      File "/home/zalperstein/maxhodak_harvard/keras-molecules/molecules/model.py", line 95, in load
        self.create(charset, weights_file = weights_file, latent_rep_size = latent_rep_size)
      File "/home/zalperstein/maxhodak_harvard/keras-molecules/molecules/model.py", line 50, in create
        self.autoencoder.load_weights(weights_file, by_name = True)
      File "/home/zalperstein/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 2498, in load_weights
        self.load_weights_from_hdf5_group_by_name(f)
      File "/home/zalperstein/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 2633, in load_weights_from_hdf5_group_by_name
        K.batch_set_value(weight_value_tuples)
      File "/home/zalperstein/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 990, in batch_set_value
        assign_op = x.assign(assign_placeholder)
      File "/home/zalperstein/.local/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 505, in assign
        return state_ops.assign(self._variable, value, use_locking=use_locking)
      File "/home/zalperstein/.local/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 45, in assign
        use_locking=use_locking, name=name)
      File "/home/zalperstein/.local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 749, in apply_op
        op_def=op_def)
      File "/home/zalperstein/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2382, in create_op
        set_shapes_for_outputs(ret)
      File "/home/zalperstein/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1783, in set_shapes_for_outputs
        shapes = shape_func(op)
      File "/home/zalperstein/.local/lib/python2.7/site-packages/tensorflow/python/framework/common_shapes.py", line 596, in call_cpp_shape_fn
        raise ValueError(err.message)
    ValueError: Dimension 2 in both shapes must be equal, but are 56 and 55

qqqqqqq007 commented 7 years ago

I'm having the same problem as @zalperst. Please help, thank you! I tried both sample.py and sample_gen.py, and the same errors appeared. I also tried trained models other than the sample model_500k.h5 and still got the same error:

ValueError: Dimension 1 in both shapes must be equal, but are 56 and 54 for 'Assign' (op: 'Assign') with input shapes: [9,56,9], [9,54,9].