QLgogo opened this issue 7 years ago

I ran python mk_dataset.py ... from ./pyscripts. The screen printed the following lines:

```
...
Loading dataset.
Loading ingr vocab.
('Image path is:', '/home/yue_zhang/Desktop/im2recipe/data/recipe1M/images')
('H5 file is', '/home/yue_zhang/Desktop/im2recipe/data/h5/data.h5')
{'test': 100808, 'train': 471475, 'val': 100297}
Assembling dataset.
Could not load image...Using black one instead.
Could not load image...Using black one instead.
[... the same message repeated many times ...]
Writing out data.
```

However, a 115 GB file was ultimately produced. What is wrong with this? Can this 115 GB file be used for the subsequent test analyses?

Besides, I used the 115 GB file for test analyses by running th main.lua -test 1 -loadsnap im2recipe_model.t7. I got many 150x1024 matrices. Could you tell me how to interpret them? That is, given an image, how does such a matrix tell us the probability of each recognized ingredient?
> What is wrong with this?
Nothing, other than that several images couldn't be loaded and were replaced with zeros instead of actual data. You should try to figure out why your images weren't loaded :)
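For illustration, one quick way to probe a suspect file from the torch REPL (this assumes the 'image' rock; mk_dataset.py itself loads images in Python, so this is just a sanity check on the files, not the actual loading code):

```lua
-- Try to load one image and report the failure reason, if any.
require 'image'

local ok, err = pcall(image.load, '/path/to/suspect.jpg')  -- hypothetical path
print(ok, err)  -- false plus an error message means this file is the problem
```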
> I got many 150x1024 matrices
Each matrix is a mini-batch of embeddings. There are two sets of embeddings: one for images and one for recipes. If you want to do embedding->ingredient prediction, you'll want the image embeddings and then train a new model to go from those to ingredients. I can tell you already that unless you add some special trick, it's not going to "just work."
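A rough sketch of what that extra model might look like (everything here is a placeholder, not part of this repo: K, the layer sizes, and the mini-batch tensors are things you'd have to choose and build yourself):

```lua
-- Multi-label classifier from 1024-d image embeddings to ingredient presence.
require 'nn'

local K = 100                      -- placeholder ingredient-vocabulary size
local net = nn.Sequential()
net:add(nn.Linear(1024, 512))
net:add(nn.ReLU())
net:add(nn.Linear(512, K))
net:add(nn.Sigmoid())              -- independent per-ingredient probabilities
local crit = nn.BCECriterion()     -- binary cross-entropy per ingredient

-- one plain-SGD step on a mini-batch of embeddings and 0/1 ingredient labels
local function trainStep(embBatch, targetBatch, lr)
  local out = net:forward(embBatch)
  local loss = crit:forward(out, targetBatch)
  net:zeroGradParameters()
  net:backward(embBatch, crit:backward(out, targetBatch))
  net:updateParameters(lr)
  return loss
end

-- demo call on random stand-in data
local emb = torch.randn(32, 1024)
local tgt = torch.rand(32, K):gt(0.9):double()  -- fake sparse labels
print(trainStep(emb, tgt, 0.01))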
Hi nhynes, thanks for your answers and patience. I ran into a different problem, so I opened this new issue.

Could you tell me how much RAM your machine has when you run th main.lua -test 1 -loadsnap im2recipe_model.t7? I have 32 GB of RAM and set up another 70 GB of swap, but I still get an "out of memory" error.
For embedding->ingredient prediction/recipe, do you mean I cannot directly get results like those on your demo website (e.g., pineapple pie 0.89)? That is, do I have to build another deep-learning model myself to do the classification, using the test results (i.e., the embeddings) as input?
Sorry, I want to add more details about the "out of memory" error. I kept checking the disk and RAM usage while the code was running, and the error occurred while both still had plenty of free space. Specifically, after the screen printed [torch.DoubleTensor of size 150x3x224x224] and [torch.DoubleTensor of size 15x150x1024], the following error info appeared:
```
nil
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-4200/cutorch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
/home/yue_zhang/torch/install/bin/luajit: ...ue_zhang/torch/install/share/lua/5.1/threads/threads.lua:179: [thread 1 endcallback] ...e/yue_zhang/torch/install/share/lua/5.1/nn/Container.lua:67:
In 1 module of nn.Sequential:
In 1 module of nn.ParallelTable:
In 5 module of nn.Sequential:
In 1 module of nn.Sequential:
In 3 module of nn.Sequential:
/home/yue_zhang/torch/install/share/lua/5.1/nn/THNN.lua:110: cuda runtime error (2) : out of memory at /tmp/luarocks_cutorch-scm-1-4200/cutorch/lib/THC/generic/THCStorage.cu:66
stack traceback:
[C]: in function 'v'
/home/yue_zhang/torch/install/share/lua/5.1/nn/THNN.lua:110: in function 'BatchNormalization_updateOutput'
...ng/torch/install/share/lua/5.1/nn/BatchNormalization.lua:124: in function <...ng/torch/install/share/lua/5.1/nn/BatchNormalization.lua:113>
[C]: in function 'xpcall'
...e/yue_zhang/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
.../yue_zhang/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function <.../yue_zhang/torch/install/share/lua/5.1/nn/Sequential.lua:41>
[C]: in function 'xpcall'
...e/yue_zhang/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
.../yue_zhang/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function <.../yue_zhang/torch/install/share/lua/5.1/nn/Sequential.lua:41>
[C]: in function 'xpcall'
...
[C]: in function 'xpcall'
...ue_zhang/torch/install/share/lua/5.1/threads/threads.lua:174: in function 'dojob'
...ue_zhang/torch/install/share/lua/5.1/threads/threads.lua:223: in function 'addjob'
/home/yue_zhang/Desktop/im2recipe/drivers/test.lua:38: in function </home/yue_zhang/Desktop/im2recipe/drivers/test.lua:24>
/home/yue_zhang/Desktop/im2recipe/drivers/init.lua:47: in function </home/yue_zhang/Desktop/im2recipe/drivers/init.lua:45>
/home/yue_zhang/Desktop/im2recipe/drivers/init.lua:43: in function 'test'
main.lua:52: in main chunk
[C]: in function 'dofile'
...hang/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
[C]: in function 'error'
...e/yue_zhang/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
.../yue_zhang/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
/home/yue_zhang/Desktop/im2recipe/drivers/test.lua:55: in function </home/yue_zhang/Desktop/im2recipe/drivers/test.lua:40>
[C]: in function 'xpcall'
...ue_zhang/torch/install/share/lua/5.1/threads/threads.lua:174: in function 'dojob'
...ue_zhang/torch/install/share/lua/5.1/threads/threads.lua:223: in function 'addjob'
/home/yue_zhang/Desktop/im2recipe/drivers/test.lua:38: in function </home/yue_zhang/Desktop/im2recipe/drivers/test.lua:24>
/home/yue_zhang/Desktop/im2recipe/drivers/init.lua:47: in function </home/yue_zhang/Desktop/im2recipe/drivers/init.lua:45>
/home/yue_zhang/Desktop/im2recipe/drivers/init.lua:43: in function 'test'
main.lua:52: in main chunk
[C]: in function 'dofile'
...hang/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
stack traceback:
[C]: in function 'error'
...ue_zhang/torch/install/share/lua/5.1/threads/threads.lua:179: in function 'dojob'
...ue_zhang/torch/install/share/lua/5.1/threads/threads.lua:223: in function 'addjob'
/home/yue_zhang/Desktop/im2recipe/drivers/test.lua:38: in function </home/yue_zhang/Desktop/im2recipe/drivers/test.lua:24>
/home/yue_zhang/Desktop/im2recipe/drivers/init.lua:47: in function </home/yue_zhang/Desktop/im2recipe/drivers/init.lua:45>
/home/yue_zhang/Desktop/im2recipe/drivers/init.lua:43: in function 'test'
main.lua:52: in main chunk
[C]: in function 'dofile'
...hang/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
```
Do you have any idea? Many thanks in advance!
You ran out of CUDA (i.e., GPU) memory. You need to either use more GPUs via the -ngpus option or reduce the batch size. Here's the error message from the trace you posted, in case you want to search the internet for more information: cuda runtime error (2) : out of memory
Yes, you are right. I set -batchSize 50 and the error disappeared. Then, how about embedding->ingredient prediction/recipe? Do you mean I cannot directly get results like those on your demo website (e.g., pineapple pie 0.89)? That is, do I have to build another deep-learning model myself to do the classification, using the test results (i.e., the embeddings) as input?
> how about embedding->ingredient prediction/recipe?
The demo only does one of those things. I suggest that you read the paper to get a better sense of how the model might be used for your specific application :)
Hi nhynes, I read your paper and the rank.py file. To get the ingredients for a given image, I think I can directly modify rank.py to print out the test_ids whose image-recipe pairs have the highest similarity, and then use those test_ids to find the corresponding recipes (ingredients and instructions). Is that right?
> use the test_ids to find back the recipe
For sure! What I was saying was that you'd have a hard time directly predicting the ingredients from the embedding. You can always go im2recipe/recipe2im.
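In Torch terms, the ranking step amounts to something like the sketch below (this is not the actual rank.py code, which is Python and may differ in details; it assumes the usual recipe: L2-normalize both embedding sets, score with a dot product, i.e., cosine similarity, and report the best-scoring recipe's id). The stand-in tensors would come from your saved test output:

```lua
require 'torch'

-- stand-ins for the real saved outputs (replace with torch.load(...) results)
local recipeEmbs = torch.randn(1000, 1024)  -- Nx1024 recipe embeddings
local imEmb = torch.randn(1, 1024)          -- one image embedding
local testIds = {}                          -- one id string per recipe
for i = 1, 1000 do testIds[i] = ('recipe_%d'):format(i) end

local function l2norm(x)
  return x:cdiv(x:norm(2, 2):expandAs(x))   -- row-wise L2 normalization
end

local scores = l2norm(recipeEmbs) * l2norm(imEmb):t()  -- Nx1 cosine scores
local _, best = scores:max(1)               -- row index of the top match
print(testIds[best[1][1]])                  -- its recipe id
```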
Sorry, I wonder: are the test_ids the same as the ids in the layer1.json file? If they are, then I should be able to trace them back?
Yes.
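For illustration, tracing an id back to its recipe could look like this (a minimal sketch, assuming layer1.json is the standard Recipe1M array of recipe objects with "id", "ingredients", and "instructions" fields; it needs the lua-cjson rock, and decoding the whole file takes several GB of RAM):

```lua
local cjson = require 'cjson'

-- read and decode the full recipe list
local f = assert(io.open('layer1.json', 'r'))
local recipes = cjson.decode(f:read('*a'))
f:close()

-- build an id -> recipe lookup table
local byId = {}
for _, r in ipairs(recipes) do
  byId[r.id] = r
end

-- look up one id from the rank.py output (hypothetical id string)
local hit = byId['000018c8a5']
if hit then
  for _, ingr in ipairs(hit.ingredients) do
    print(ingr.text)               -- each ingredient line
  end
end
```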
Hi nhynes. Sorry, it is me again. I finally figured out how to modify the rank.py file to get the ingredients and recipes. Thanks for all the help above!
However, when I replaced some of the test images (those partitioned as 'test') with my own images (12,000+ of them), updated the data.h5 file (now including 80,000+ test images), and ran th main.lua -test 1 -loadsnap im2recipe_model.t7, I found that the number of test_ids saved in the output dropped to 50,000+ and did not include any of the replaced images. Could you tell me whether any filtering or subsampling happens when running th main.lua -test 1 -loadsnap im2recipe_model.t7?
Hey @QLgogo, no worries! You're helping to expose challenges with deploying the model :)
To solve the problem, generally, you'll need to sort the images by recipe id so that they are in the order given by ids_test. You'll then need to update ilens_test to have the correct number of images for each recipe.
You only want the image embeddings though, right? In that case, it's much easier: just pick the image model out of the pretrained one and feed the images through that directly.
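Something along these lines, untested; the :get(...) indices below are guesses (the trace above only shows that the top of the model is an nn.Sequential wrapping an nn.ParallelTable), so confirm them with print(model) before trusting this:

```lua
require 'nn'
require 'cunn'   -- (require 'cudnn' too if the snapshot uses cudnn layers)

-- load the snapshot; depending on how it was saved, torch.load may return
-- the model itself or a table containing it
local model = torch.load('im2recipe_model.t7')

local visionBranch = model:get(1):get(1)  -- assumed: first branch = images
visionBranch:evaluate()                   -- inference mode (BN, dropout)

-- forward one 3x224x224 image (random stand-in for a real preprocessed one)
local img = torch.rand(1, 3, 224, 224):cuda()
local emb = visionBranch:forward(img)     -- expect a 1x1024 embedding
print(emb:size())
```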
Hi nhynes. Yes, I only want the image embeddings, which I feed into rank.py to get the test_ids. To check the structure of data.h5 (downloaded from the Recipe1M website), I ran h5ls data.h5, which shows the following:
```
classes_test Dataset {51334}
classes_train Dataset {238459}
classes_val Dataset {51129}
ids_test Dataset {51334}
ids_train Dataset {238459}
ids_val Dataset {51129}
ilens_test Dataset {51334}
ilens_train Dataset {238459}
ilens_val Dataset {51129}
imnames_test Dataset {82392}
imnames_train Dataset {383779}
imnames_val Dataset {82093}
impos_test Dataset {51334, 5}
impos_train Dataset {238459, 5}
impos_val Dataset {51129, 5}
ims_test Dataset {100848, 3, 256, 256}
ims_train Dataset {471557, 3, 256, 256}
ims_val Dataset {100311, 3, 256, 256}
ingrs_test Dataset {51334, 20}
ingrs_train Dataset {238459, 20}
ingrs_val Dataset {51129, 20}
numims_test Dataset {51334}
numims_train Dataset {238459}
numims_val Dataset {51129}
rbps_test Dataset {51334}
rbps_train Dataset {238459}
rbps_val Dataset {51129}
rlens_test Dataset {51334}
rlens_train Dataset {238459}
rlens_val Dataset {51129}
stvecs_test Dataset {464115, 1024}
stvecs_train Dataset {2163659, 1024}
stvecs_val Dataset {464059, 1024}
```
So each test_id (ids_test has 51334 entries) is associated with more than one image (imnames_test has 82392 entries). However, in rank.py, test_ids.t7 is read in and a single image name is returned per test_id; that is, each test_id is paired with only one image (names in rank.py has length 51334). As a result, I can only collect the corresponding recipes for 51334 images.

To find the corresponding recipes for my own images, I plan to directly replace the contents of your .jpg files with the contents of mine, so I need to make sure that all the images I replace are among those 51334. Could you tell me how you encode the multiple images within a test_id? That is, how do you narrow the 82392 images down to 51334? E.g., do you randomly select one of the multiple images for each test_id, or concatenate them?
Since you only want the image embeddings, I highly recommend that you do not mess with the hdf5 file. Instead, try the approach of using the fine-tuned CNN directly to get the embeddings for your images.
Something like this.
Hi QLgogo, could you share the h5 data file (data.h5) with me? Here is my email: wqingdaniel@gmail.com. I would greatly appreciate your assistance in sharing this data. Please let me know if you can provide it or if there are any specific procedures I should follow to access it.