Closed alekskorupa closed 6 years ago
Hi Alek,
I built this with a TensorFlow version somewhere between r0.10 and r0.11, because I forked the code from the first version of TensorFlow's im2txt repo. However, the TF developers have since moved that repo into a new research folder, which makes it very hard to track down the initial version. (https://github.com/tensorflow/models/commit/f87a58cd96d45de73c9a8330a06b2ab56749a7fa#comments)
According to my TF checkout, the version I used is v0.10.0-1705-g6218ac2, but I imagine it would be very hard for you to find this version and install it from source. Could you check whether TensorFlow r0.11 works? If you can run the code, r0.10 and r0.11 should give very similar performance.
Let me know if r0.11 works for you.
Thanks, Xintong
Hi Xintong,
Thank you for your quick response. Yes, I suspected that r0.11 could work, but after trying it I get a new error, again due to a missing module:
. train.sh
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] Couldn't open CUDA library libcudnn.so. LD_LIBRARY_PATH:
I tensorflow/stream_executor/cuda/cuda_dnn.cc:3448] Unable to load cuDNN DSO
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
INFO:tensorflow:Prefetching values from 128 files matching data/tf_records/train-no-dup-?????-of-00128
Traceback (most recent call last):
File "polyvore/train.py", line 111, in <module>
tf.app.run()
File "/home/oleks/anaconda/envs/biLSTM_old/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "polyvore/train.py", line 66, in main
model.build()
File "/mnt/datasets/polyvore-lstm/polyvore/polyvore_model_bi.py", line 676, in build
self.build_inputs()
File "/mnt/datasets/polyvore-lstm/polyvore/polyvore_model_bi.py", line 217, in build_inputs
images.append(self.process_image(encoded_images[i],image_idx=i))
File "/mnt/datasets/polyvore-lstm/polyvore/polyvore_model_bi.py", line 159, in process_image
image_idx=image_idx)
File "/mnt/datasets/polyvore-lstm/polyvore/ops/image_processing.py", line 82, in process_image
image_summary("original_image/" + str(image_idx), image)
File "/mnt/datasets/polyvore-lstm/polyvore/ops/image_processing.py", line 71, in image_summary
tf.summary.image(name, tf.expand_dims(image, 0))
AttributeError: 'module' object has no attribute 'image'
I have also tried r0.12.1, which fails with a similar error:
. train.sh
INFO:tensorflow:Prefetching values from 128 files matching data/tf_records/train-no-dup-?????-of-00128
Traceback (most recent call last):
File "polyvore/train.py", line 111, in <module>
tf.app.run()
File "/home/oleks/anaconda/envs/biLSTM_old/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 43, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "polyvore/train.py", line 66, in main
model.build()
File "/mnt/datasets/polyvore-lstm/polyvore/polyvore_model_bi.py", line 679, in build
self.build_model()
File "/mnt/datasets/polyvore-lstm/polyvore/polyvore_model_bi.py", line 377, in build_model
tf.losses.add_loss(emb_batch_loss * self.config.emb_loss_factor)
AttributeError: 'module' object has no attribute 'losses'
I have actually managed to train the model using a higher version (TensorFlow 1.3). Now that I have the model weights, I am missing the code for inference. In particular, I need to perform a multimodal (image + text) query for item retrieval. If you have a working version available, I would appreciate it if you could share it somewhere. Thanks again for your help.
Best regards,
Aleksander
Are you using the current version of my code? I ask because I am using tf.image_summary, not tf.summary.image, in image_processing.py.
I will check how to make it run under r0.11 by the end of this week and let you know how to do it.
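For anyone hitting the same AttributeError while switching versions, here is a minimal sketch of a version-tolerant lookup. It rests on an assumption, not something confirmed in this thread: older releases expose tf.image_summary, newer ones expose tf.summary.image, and both accept (name, tensor) as their first two positional arguments. The helper name resolve_image_summary is hypothetical, and the two stand-in modules below only imitate TensorFlow so that the sketch runs without it installed.

```python
import types


def resolve_image_summary(tf_module):
    """Return whichever image-summary function the given module exposes.

    Hypothetical helper: prefers the newer tf.summary.image spelling and
    falls back to the older tf.image_summary one.
    """
    summary = getattr(tf_module, "summary", None)
    if summary is not None and hasattr(summary, "image"):
        return summary.image
    if hasattr(tf_module, "image_summary"):
        return tf_module.image_summary
    raise AttributeError("no image summary op found on this module")


# Stand-ins that imitate the two API layouts (assumption, not real TensorFlow).
old_style_tf = types.SimpleNamespace(
    summary=types.SimpleNamespace(),  # a summary module without .image
    image_summary=lambda name, tensor: ("image_summary", name),
)
new_style_tf = types.SimpleNamespace(
    summary=types.SimpleNamespace(image=lambda name, tensor: ("summary.image", name)),
)

print(resolve_image_summary(old_style_tf)("original_image/0", None))
print(resolve_image_summary(new_style_tf)("original_image/0", None))
```

In image_processing.py the call inside image_summary could then go through the resolved function instead of hard-coding one spelling. The same pattern would not cover tf.losses, which has to be handled separately.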
Hi again,
First of all, sorry for not replying for so long; I had a busy week last week. Yes, it might actually be the case that I was running a version of your code other than the current one. Anyway, after some modifications, I have managed to run the extract_features script using TensorFlow 1.3.
Now I would like to perform a multimodal query to generate an outfit, as in your paper, but first I guess I need to extract a semantic representation of the text query based on the trained embedding. Would you be able to tell me how to do that?
Best regards,
Aleksander
model.embedding_map contains the embedding of each word in the vocabulary.
[word_emb] = sess.run([model.embedding_map])
If you want to get the representation of a text query containing words a, b, c, you just need to feed the indices of a, b, c and average their embeddings:
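    import numpy as np

    def norm_row(a):
        # L2-normalize each row of a 2-D array, or the whole vector if a is 1-D.
        a = np.asarray(a)
        if a.ndim == 1:
            return a / np.linalg.norm(a)
        return a / np.linalg.norm(a, axis=1)[:, np.newaxis]

    words = open('word_dict.txt').read().splitlines()
    query = 'a b c'
    # Row index in word_emb is the line position in word_dict.txt plus one.
    query = [i + 1 for i in range(len(words)) if words[i] in query.split()]
    # Normalizing the summed embeddings gives the same direction as their average.
    query_emb = norm_row(np.sum(word_emb[query], axis=0))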
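To try the recipe above without a trained model, here is a self-contained sketch on toy data. The vocabulary, the 4x3 embedding matrix, and the function name embed_query are made up for illustration. The index_offset=1 default mirrors the i+1 in the snippet above; that row 0 is reserved is an assumption.

```python
import numpy as np


def embed_query(query, words, word_emb, index_offset=1):
    """Return the L2-normalized sum of the embeddings of the query's known words.

    Hypothetical helper. index_offset=1 mirrors the `i + 1` in the snippet
    above (assumption: row 0 of the embedding matrix is reserved).
    """
    tokens = set(query.split())
    rows = [i + index_offset for i, w in enumerate(words) if w in tokens]
    if not rows:
        raise ValueError("no query word is in the vocabulary")
    summed = np.sum(word_emb[rows], axis=0)
    return summed / np.linalg.norm(summed)


# Toy stand-ins for word_dict.txt and model.embedding_map (made-up values).
words = ["red", "dress", "shoes"]
word_emb = np.array([
    [0.0, 0.0, 0.0],   # row 0: assumed reserved
    [1.0, 0.0, 0.0],   # red
    [0.0, 2.0, 0.0],   # dress
    [0.0, 0.0, 3.0],   # shoes
])

query_emb = embed_query("red dress", words, word_emb)
print(query_emb, np.linalg.norm(query_emb))
```

Because the result has unit length, a dot product against row-normalized image embeddings gives a cosine similarity, which is one plausible way to rank items for retrieval.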
I will try that. Thanks a lot for all your help.
Best wishes,
Aleksander
Hi @alekskorupa, I'm trying to run the extract_features script using TensorFlow 1.13. I modified the original code and trained the model, which gave me .meta, .index and .data checkpoint files. When I try to run extract_features (from the original code), I get a lot of NotFoundError messages like this:
Caused by op 'save/RestoreV2', defined at:
File "E:/liurui/polyvore-master/polyvore/run_inference.py", line 93, in <module>
tf.app.run()
File "C:\dlfiles\Anaconda36\lib\site-packages\tensorflow\python\platform\app.py", line 125, in run
_sys.exit(main(argv))
File "E:/liurui/polyvore-master/polyvore/run_inference.py", line 55, in main
saver = tf.train.Saver()
File "C:\dlfiles\Anaconda36\lib\site-packages\tensorflow\python\training\saver.py", line 832, in __init__
self.build()
File "C:\dlfiles\Anaconda36\lib\site-packages\tensorflow\python\training\saver.py", line 844, in build
self._build(self._filename, build_save=True, build_restore=True)
File "C:\dlfiles\Anaconda36\lib\site-packages\tensorflow\python\training\saver.py", line 881, in _build
build_save=build_save, build_restore=build_restore)
File "C:\dlfiles\Anaconda36\lib\site-packages\tensorflow\python\training\saver.py", line 513, in _build_internal
restore_sequentially, reshape)
File "C:\dlfiles\Anaconda36\lib\site-packages\tensorflow\python\training\saver.py", line 332, in _AddRestoreOps
restore_sequentially)
File "C:\dlfiles\Anaconda36\lib\site-packages\tensorflow\python\training\saver.py", line 580, in bulk_restore
return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
File "C:\dlfiles\Anaconda36\lib\site-packages\tensorflow\python\ops\gen_io_ops.py", line 1572, in restore_v2
name=name)
File "C:\dlfiles\Anaconda36\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "C:\dlfiles\Anaconda36\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "C:\dlfiles\Anaconda36\lib\site-packages\tensorflow\python\framework\ops.py", line 3300, in create_op
op_def=op_def)
File "C:\dlfiles\Anaconda36\lib\site-packages\tensorflow\python\framework\ops.py", line 1801, in __init__
self._traceback = tf_stack.extract_stack()
NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:
Key lstm/BW/basic_lstm_cell/bias not found in checkpoint
[[node save/RestoreV2 (defined at E:/liurui/polyvore-master/polyvore/run_inference.py:55) ]]
Could you point me to some reference material on how to modify this code to run on newer TensorFlow versions? Thank you very much, Rui Liu
Have you solved this problem? I ran into the same issue when running the code.
I'm trying to run the extract_features script using TensorFlow 0.10, but I get the following error. Can you help me solve it? Thank you very much!
Traceback (most recent call last):
File "/media/公共硬盘A/ZhangJ/polyvore-master/polyvore/run_inference.py", line 103, in
Could I have your contact information, please? I would like to discuss this problem with you. Thank you.
Hi, the names of the LSTM weights differ between the two versions. You can check the names by inspecting the graph file. I hope this helps: https://github.com/KranthiGV/Pretrained-Show-and-Tell-model/issues/7
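As a rough illustration of the name mismatch, the sketch below pairs the variable names a graph expects with the names stored in a checkpoint. The suffix pairs (kernel/weights, bias/biases) are an assumption about how the LSTM cell variables were renamed between TensorFlow releases. The function name build_restore_map and the example names are hypothetical, so check the real names in your own checkpoint first.

```python
# Suffix renames to try, as (graph-side suffix, checkpoint-side suffix).
# Assumption: these are the LSTM renames involved; verify against your checkpoint.
RENAMES = (("/kernel", "/weights"), ("/bias", "/biases"))


def build_restore_map(graph_names, checkpoint_names):
    """Map checkpoint variable names to graph variable names.

    Returns (mapping, missing): mapping is {checkpoint_name: graph_name},
    missing lists graph names with no counterpart under the renames tried.
    """
    available = set(checkpoint_names)
    mapping, missing = {}, []
    for name in graph_names:
        if name in available:
            mapping[name] = name  # same name on both sides
            continue
        for graph_suffix, ckpt_suffix in RENAMES:
            if name.endswith(graph_suffix):
                candidate = name[: -len(graph_suffix)] + ckpt_suffix
                if candidate in available:
                    mapping[candidate] = name
                    break
        else:
            missing.append(name)  # no rename produced a match
    return mapping, missing


# Made-up example names in the style of the error message above.
graph_names = ["lstm/BW/basic_lstm_cell/kernel", "lstm/BW/basic_lstm_cell/bias", "other/x"]
checkpoint_names = ["lstm/BW/basic_lstm_cell/weights", "lstm/BW/basic_lstm_cell/biases"]

mapping, missing = build_restore_map(graph_names, checkpoint_names)
print(mapping)
print(missing)
```

In TensorFlow 1.x, the stored names can be listed with tf.train.list_variables(checkpoint_path), and tf.train.Saver accepts a dict from checkpoint names to variable objects as its var_list. A mapping like this one, with the graph names swapped for the actual variables, could be passed there instead of the default tf.train.Saver().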
Thank you very much! Have a nice weekend. ❤️
Hi, I am writing because the following problem occurred when I tried to use your code.
It seems that this module is not available until TensorFlow version 0.11. Are you positive that 0.10 is the one that works? Thanks for your help.
Best, Aleksander