Open lamhoangtung opened 5 years ago
Quick update: I've just found out that you use tf.image.decode_jpeg and tf.image.resize_images instead of OpenCV. I switched to them; the output is different, but still not the same as your code's. Am I missing something like normalization? Here is what I changed:
import tensorflow as tf

path = tf.placeholder(tf.string)
image_encoded = tf.read_file(path)
image_decoded = tf.image.decode_jpeg(image_encoded, channels=3)
# net_input_size is (height, width), as in embed.py
image_resized = tf.image.resize_images(image_decoded, net_input_size)
img = tf.expand_dims(image_resized, axis=0)
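Part of a remaining gap between OpenCV and TF preprocessing is that the two libraries implement resizing differently (interpolation method, rounding), so the same image can yield slightly different pixels. A toy NumPy sketch of how two resize strategies disagree on identical input (the array and strategies are made up for illustration, not the libraries' actual code):

```python
import numpy as np

# Toy 4x4 single-channel "image" with a gradient, downsampled to 2x2
# two different ways, mimicking how two resize implementations can disagree.
img = np.arange(16, dtype=np.float32).reshape(4, 4)

# Nearest-neighbor style: pick the top-left pixel of each 2x2 block.
nearest = img[::2, ::2]

# Area/averaging style: average each 2x2 block.
average = img.reshape(2, 2, 2, 2).mean(axis=(1, 3))

print(np.array_equal(nearest, average))  # False: same image, different pixels
```

Tiny per-pixel differences like this propagate through the network, which is enough to shift the final embedding.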
Thanks ;)
The only thing that comes to mind right now is that by default we use test-time augmentation, which you don't. But that depends on how you are using our embed script to create comparable embeddings in this case.
Hi @Pandoro, Thanks for the quick response. This is what I use to compute the embedding vector:
python3 embed.py \
--experiment_root ... \
--dataset ... \
--filename ...
I extracted the vector from the .h5 file.
Anyway, how can I do TTA in my case? Is there any code in your repo I can reference?
If you use it like that, it should actually not be doing any test-time augmentation, so that shouldn't be it either. The code to do so is included in embed.py. The only thing that comes to mind is that maybe something goes wrong during extraction of the embedding? Have you tried creating a CSV file containing only the one image you want to embed?
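A quick way to build such a single-image CSV (assuming the repo's `identity,relative_path` layout; the identity and filename below are examples borrowed from later in this thread):

```python
import csv
import os
import tempfile

# One entry in the assumed "identity,relative_path" layout.
row = ("0001", "query/0001_c1s1_001051_00.jpg")

path = os.path.join(tempfile.mkdtemp(), "single_image.csv")
with open(path, "w", newline="") as f:
    csv.writer(f).writerow(row)

with open(path) as f:
    print(f.read().strip())  # 0001,query/0001_c1s1_001051_00.jpg
```

Passing a file like this to embed.py isolates whether the problem is in embedding or in how the vector is pulled out of the .h5 afterwards.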
Hi. I did an experiment with a CSV file containing only the image that I want to embed, and found something really strange. There might actually be nothing wrong with your embed code or my inference code:
- The .h5 output file that I previously used for comparison was created on a remote server with a GPU.
- My inference code was run on my local machine, which only has a CPU. After recomputing everything on the CPU alone, I found a big difference between the embedding vectors computed on GPU vs. CPU. (My code and yours produce exactly the same results.)
- Note that the difference is HUGE, like completely different. I double-checked the model, the code, and the input images for the experiment.
Have you ever seen something like this? Am I wrong at some point?
I haven't seen this before. I wouldn't be surprised if there are tiny differences, but we frequently used CPUs to embed and evaluate stuff when all GPUs were busy and that worked fine. So something seems to be wrong. Are you using the same tensorflow version for both CPU and GPU?
@Pandoro Same tensorflow 1.12.0 on both machines.
Some updates on this. I tried to redo everything, even training, and here are the results:
[ 0.1475507 0.26669884 -0.10536072 -0.7495441 -0.05301389 -0.12123938
-0.2105978 0.34713405 -0.06077751 0.38768452 0.46736327 -0.14455695
-0.13443749 0.4708902 -0.53196555 -0.4674694 0.4387072 -0.01120797
0.03252156 0.11937858 0.03637908 -0.23512752 -0.087494 0.40861905
0.39684698 -0.25528368 0.53282946 -0.7992279 -0.04100448 0.607317
0.37891495 -0.43027154 -0.09188752 -0.31797376 0.2922396 0.3039867
-0.21458632 -0.40264758 0.01471368 0.14217973 0.29642326 -0.33412308
0.61750454 0.02563823 -0.4100364 -0.4894322 -0.33408296 -0.30945992
-0.03018434 0.06986241 -0.3707401 -0.1222352 0.19458997 -0.11415277
-0.04913341 -0.0650656 -0.23189925 -0.3081076 -0.04566643 0.56977797
0.1199189 -0.25228524 -0.10953259 0.5716973 0.07392599 -0.1805463
0.03953229 0.12185388 -0.15962987 -0.21938688 -0.05884064 0.34342512
0.26555967 0.21485685 0.3734443 -0.19710182 -0.4279406 0.23197423
-0.27009133 0.30459598 -0.37105414 0.4993727 0.1789047 0.04352051
-0.16855955 -0.6482116 -0.1902902 -0.02592199 -0.00989667 0.5478813
0.3826628 -0.33704245 0.3876207 -0.39746612 -0.4097886 0.14956611
0.03482605 -0.27635813 0.05575407 -0.26498005 -0.19787493 -0.22036389
0.21582448 0.46559668 -0.41869876 0.12922227 0.0621463 0.01098646
0.06490406 0.35996896 0.21602859 -0.34911785 -0.18451497 0.05639197
0.04268607 -0.072242 -0.23873544 -0.09557254 0.03791614 -0.19931975
-0.07070286 0.09722421 0.29151836 -0.02433551 0.2241952 -0.96187866
0.13102485 0.00164846]
A .csv file containing only that one sample duplicated 100 times: the same for all 100 output vectors, the same on GPU and CPU, and the same as above.
The .csv file which has the first sample used for the above experiment: the same on GPU and CPU, but not the same as above:
[ 4.46426451e-01 2.67341495e-01 -3.03951055e-01 -1.09888956e-01
1.48094699e-01 1.09376453e-01 3.18785965e-01 -2.31513470e-01
9.18060988e-02 9.47581697e-03 -3.14935297e-01 -5.06232917e-01
2.13361338e-01 5.70732616e-02 5.59608713e-02 -2.04994321e-01
-7.14561269e-02 4.35655147e-01 4.42430824e-01 -1.19181640e-01
-9.79143828e-02 3.38607967e-01 -8.01632106e-02 8.19585398e-02
3.10744733e-01 -5.10766864e-01 3.90632376e-02 3.73192802e-02
-2.21006293e-02 1.50721356e-01 3.10757637e-01 -1.00263797e-01
-3.67254391e-02 3.62346590e-01 -2.23815039e-01 -4.09024119e-01
-7.41786659e-01 -2.77244627e-01 -6.83265150e-01 -3.71105620e-04
3.62792283e-01 -3.34418714e-01 4.02492136e-01 2.93934852e-01
5.06364256e-02 1.14161275e-01 -1.49569120e-02 2.07622617e-01
9.04084072e-02 2.35464871e-01 1.60102062e-02 -1.07340008e-01
-6.13746643e-01 -1.84301529e-02 -3.65158543e-02 -2.17433404e-02
4.48067039e-01 3.31106067e-01 2.05742702e-01 -1.24085128e-01
2.07252398e-01 -5.85925281e-01 -2.59883493e-01 2.63391703e-01
-3.12482953e-01 -1.48463324e-01 -2.19984993e-01 3.31126675e-02
1.76012367e-01 3.09261560e-01 -1.59823354e-02 1.53631851e-01
1.53570157e-02 -2.29165092e-01 3.28389913e-01 -2.26212129e-01
-3.93793285e-01 -1.54186189e-01 -4.85752940e-01 1.30166719e-02
-5.14035374e-02 -1.77116096e-01 9.73375281e-05 -2.54578739e-02
3.99445705e-02 4.45321977e-01 2.78115660e-01 -1.51245281e-01
-3.03700745e-01 -3.81025001e-02 1.43309757e-01 -6.55035377e-01
8.83019418e-02 -3.06550767e-02 -4.80769187e-01 4.71787043e-02
5.49029335e-02 -1.17088296e-01 3.43144536e-01 -7.30120242e-02
-3.58440757e-01 -1.66995618e-02 -3.06979388e-01 5.11138923e-02
1.75048336e-01 -1.83060188e-02 -3.81746352e-01 -6.02350771e-01
-3.84051464e-02 5.41097879e-01 2.33160406e-01 8.10048282e-02
-4.97415751e-01 -3.47296298e-02 -8.40142891e-02 2.04959571e-01
6.48377165e-02 -1.64840698e-01 1.98047027e-01 1.82637498e-01
-9.53407511e-02 2.63416976e-01 -1.82583451e-01 -3.99179049e-02
2.82630742e-01 -6.65262759e-01 -5.13938844e-01 -1.60764366e-01]
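To quantify how far apart two such embeddings are, comparing Euclidean distance and cosine similarity is a quick sanity check. A sketch with short made-up vectors (real embeddings here are 128-d; a cosine similarity far below 1 confirms the outputs truly disagree):

```python
import numpy as np

def compare(a, b):
    # Euclidean distance and cosine similarity between two embeddings.
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    dist = np.linalg.norm(a - b)
    cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return dist, cos

# Short illustrative vectors, not the actual 128-d outputs above.
dist, cos = compare([0.15, -0.75, 0.27], [0.45, 0.27, -0.30])
print(dist, cos)
```

Numerical noise between backends typically shows up as distances around 1e-5; distances comparable to the vector norms themselves mean something is genuinely broken.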
Where could I potentially be wrong? Here is how I extract the vector out of the .h5 file:
import h5py
import numpy as np
import pandas as pd

raw_embedding = h5py.File('....h5', 'r')
raw_label = pd.read_csv('...csv')

def load_data():
    features = raw_embedding['emb'].value
    labels = list(raw_label.iloc[:, 1])
    return (features, labels)

vecs, imgs = load_data()
print(vecs[0], imgs[0])
Thanks for your help @Pandoro
Note: I tried a bunch of different images, so the problem is not related to only the first sample of the dataset. => Question: did you do any dataset-level normalization?
I can't say that this sounds like anything I've seen before. If I understand correctly, GPU and CPU results are now the same, but the result depends on whether you have several other images in your batch or just one specific one?
It sounds like something might be going wrong with the batch normalization, but your script clearly sets is_training=False. We don't do any other normalization, so I honestly have no idea where this could be coming from.
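To see why batch normalization in training mode could cause exactly this symptom, here is a minimal NumPy sketch: with batch statistics, the same sample normalizes differently depending on what shares its batch, while fixed moving averages (inference mode, is_training=False) are batch-independent. All numbers here are synthetic:

```python
import numpy as np

def batchnorm(x, mean, var, eps=1e-5):
    # Core batch-norm transform (scale/shift omitted for brevity).
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.RandomState(0)
sample = rng.rand(4)                                  # one feature vector
batch_a = np.vstack([sample, rng.rand(3, 4) * 5.0])   # large-valued batch mates
batch_b = np.vstack([sample, rng.rand(3, 4) * 0.1])   # small-valued batch mates

# Training mode: statistics come from the current batch, so the SAME
# sample gets normalized differently in the two batches.
train_a = batchnorm(sample, batch_a.mean(0), batch_a.var(0))
train_b = batchnorm(sample, batch_b.mean(0), batch_b.var(0))

# Inference mode: fixed moving averages, independent of batch mates.
moving_mean, moving_var = np.full(4, 0.5), np.full(4, 0.25)
infer_a = batchnorm(sample, moving_mean, moving_var)
infer_b = batchnorm(sample, moving_mean, moving_var)

print(np.allclose(train_a, train_b))  # False
print(np.allclose(infer_a, infer_b))  # True
```

So if a graph were accidentally built with is_training=True, the embedding of one image would depend on the rest of the batch, which matches the behavior reported above.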
So which one should I use? Which one is more accurate? Should I create fake batches, or should I keep batch_size = 1 during inference?
There is no useful answer to that question. What you are seeing shouldn't be happening. Currently I don't have time to investigate if this is an issue with our code, but I highly doubt it since we haven't seen any such issues so far.
As it is right now, your setup seems to be somehow broken and thus there is no "more accurate".
What you could do is to try and download our pretrained model and run the evaluation on Market-1501 to see if you can recreate our original scores. If you get a different score, something else is broken.
@lamhoangtung , Were you able to figure this out? I'm trying to follow your steps to generate embeddings and compare them. But so far I'm running into some errors:
I cannot load the model this way for some reason (#85):
checkpoint = tf.train.latest_checkpoint(config['experiment_root'])
I tried loading the model this way,
saver = tf.train.import_meta_graph('experiments\my_experiment\checkpoint-25000.meta')
saver.restore(sess, 'experiments\my_experiment\checkpoint-25000')
but that still gives me an error when I try to run
emb = sess.run(endpoints['emb'], feed_dict={img: raw_img})[0]
FailedPreconditionError (see above for traceback): Attempting to use uninitialized value resnet_v1_50/block4/unit_3/bottleneck_v1/conv3/BatchNorm/gamma
[[node resnet_v1_50/block4/unit_3/bottleneck_v1/conv3/BatchNorm/gamma/read (defined at C:\Users\mazat\Documents\Python\trinet\nets\resnet_v1.py:118) ]]
[[node head/emb/BiasAdd (defined at C:\Users\mazat\Documents\Python\trinet\heads\fc1024.py:17) ]]
Thanks
@lamhoangtung I think I figured out the first problem.
So, to get the cv2-loaded embeddings close to the embed.py values, I did the following:
raw_img = cv2.imread(os.path.join(config['image_root'],'query', '0001_c1s1_001051_00.jpg'))
raw_img = cv2.cvtColor(raw_img, cv2.COLOR_BGR2RGB)
raw_img = cv2.resize(raw_img, (net_input_size[1], net_input_size[0]))
raw_img = np.expand_dims(raw_img, axis=0)
If you want to get exactly the same values, you can load the image with TF instead of cv2:
image_encoded = tf.read_file(os.path.join(config['image_root'],'query', '0001_c1s1_001051_00.jpg'))
image_decoded = tf.image.decode_jpeg(image_encoded, channels=3)
image_resized = tf.image.resize_images(image_decoded, net_input_size)
img = tf.expand_dims(image_resized, axis=0)
# Create the model and an embedding head.
model = import_module('nets.' + config['model_name'])
head = import_module('heads.' + config['head_name'])
endpoints, _ = model.endpoints(img, is_training=False)
with tf.name_scope('head'):
    endpoints = head.head(endpoints, config['embedding_dim'], is_training=False)

# `sess` and `config` are set up earlier, as in embed.py.
tf.train.Saver().restore(sess, os.path.join(config['experiment_root'], 'checkpoint-25000'))
emb = sess.run(endpoints['emb'])[0]
I got almost identical embeddings this way.
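To make "almost identical" concrete, the two embeddings can be compared with an explicit tolerance; a sketch with made-up short vectors:

```python
import numpy as np

# Illustrative values only, not the real 128-d embeddings.
emb_cv2 = np.array([0.446, 0.267, -0.304])
emb_tf = np.array([0.446, 0.268, -0.304])

print(np.max(np.abs(emb_cv2 - emb_tf)))         # worst-case elementwise gap
print(np.allclose(emb_cv2, emb_tf, atol=1e-2))  # True at a loose tolerance
```

Small residual gaps like this are expected when the resize interpolation differs; exact equality requires the exact same preprocessing graph.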
Hi, I'm trying to write a script to embed a single image based on your code; it looks something like this:
But the results for the same image from my code and your code are not the same.
Note that no augmentation is added when I compute the embedding vector.
Am I missing anything here? Thank you for the help.