Open heiwushi opened 7 years ago
Dear @heiwushi, Yes! Unfortunately, Google didn't make the PCA matrix publicly available. So I downloaded only a part of the original video files from YouTube and calculated the PCA matrix myself.
I would be glad if this helps you!
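For readers following along, here is a minimal sketch of how a PCA (plus whitening) matrix could be estimated from frame-level features with NumPy. It is not the code used in this repository; the function names, the `eps` term, and the random stand-in data are assumptions made for illustration only.

```python
import numpy as np

def fit_pca_whitening(features, n_components, eps=1e-4):
    """Fit PCA with whitening on an (n_samples, n_dims) feature array."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Sample covariance matrix of the centered features.
    cov = centered.T @ centered / (features.shape[0] - 1)
    # eigh returns eigenvalues in ascending order, so sort descending
    # and keep the n_components directions with the largest variance.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Whitening: scale each kept direction to (roughly) unit variance.
    projection = eigvecs / np.sqrt(eigvals + eps)
    return mean, projection

def apply_pca(features, mean, projection):
    """Project features with a previously fitted mean and projection matrix."""
    return (features - mean) @ projection

# Random stand-in data (hypothetical); real input would be CNN frame features.
rng = np.random.default_rng(0)
frames = rng.normal(size=(500, 32)) @ rng.normal(size=(32, 32))
mean, projection = fit_pca_whitening(frames, n_components=8)
reduced = apply_pca(frames, mean, projection)
print(reduced.shape)
```

The same `mean` and `projection` have to be reused for every new video. A matrix fitted on a different sample of videos will generally give different projected features.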
Dear @LittleWat,
Good job!
But you say you downloaded only a part of the original video files of youtube8m. If so, the matrix may be different from Google's version. Does that mean the features extracted using this PCA matrix are also different from those in the youtube8m dataset? I am not sure whether a model trained on youtube8m can be applied to new videos when the PCA process is different, unless you train the model on your downloaded videos without youtube8m.
Thank you very much!
Dear @heiwushi, As you say, my PCA matrix is a little different from Google's version.
But the video recognition results seemed to be good enough. If you have time, please try it! If you run into any problems, please let me know!
@LittleWat So nice! Your work will be a big help to me.
I still have two small questions. Is your inception-v3 model the version released by Google, pre-trained on ImageNet, without any fine-tuning? And is the quantization mentioned in Google's paper needed?
Thanks again!
Dear @heiwushi,
I used this inception-v3 model ( https://github.com/fchollet/deep-learning-models ).
I did not take the quantization mentioned in Google's paper into account.
Thanks!
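As a rough illustration of what such a quantization step could look like, here is a sketch of uniform 8-bit quantization and its inverse. The clipping range of [-2, 2], the rounding scheme, and the function names are assumptions for this example and are not taken from this repository.

```python
import numpy as np

def quantize(features, min_value=-2.0, max_value=2.0):
    """Clip floats to [min_value, max_value] and map them to uint8 codes."""
    clipped = np.clip(features, min_value, max_value)
    scaled = (clipped - min_value) / (max_value - min_value) * 255.0
    return np.round(scaled).astype(np.uint8)

def dequantize(codes, min_value=-2.0, max_value=2.0):
    """Map uint8 codes back to approximate float values."""
    return codes.astype(np.float32) / 255.0 * (max_value - min_value) + min_value

values = np.array([-3.0, -2.0, 0.0, 0.5, 2.0, 5.0])
codes = quantize(values)
restored = dequantize(codes)
print(codes)
print(restored)
```

Values outside the assumed range are clipped, and values inside it come back with an error of at most half a quantization step. A model trained on quantized features may see slightly different inputs if this step is skipped at prediction time.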
Dear @LittleWat Hello, I'm back ^_^. I have tried using your trained model to predict video labels, but sometimes the results are not good. For example, this (https://www.youtube.com/watch?v=MxSeVP9ec64) is a video from the youtube8m dataset. The labels should be about news, but your model gives the following:
[('Football', 0.62753093), ('Vehicle', 0.46109813), ('Food', 0.16521908), ('Animal', 0.13338402), ('Car', 0.11300386), ('Trailer', 0.10679296), ('Medicine', 0.085868239), ('Fashion', 0.048945624), ('Comedy', 0.045810226), ('Newscaster', 0.043604851)].
Another example is https://www.youtube.com/watch?v=GmaPT3BACW0, also from the youtube8m dataset, whose labels should be about tennis or a news program. But your model gives the following:
[('Food', 0.76592553), ('Football', 0.53160334), ('Games', 0.2116282), ('Medicine', 0.10419334), ('Fashion', 0.087283127), ('Cooking', 0.077669598), ('Mobile phone', 0.063449942), ('Recipe', 0.058652695), ('Smartphone', 0.049429629), ('Winter sport', 0.042630274)]
Did you train your model on the whole youtube8m dataset or only on a part of it?
Thank you for your help and your patience!
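For reference, lists in the format above can be produced from a per-label score vector roughly as follows. This is only a sketch: the helper name `top_k_labels` is hypothetical, and the short label list and scores are illustrative values rounded from the first output above, not a full model output.

```python
import numpy as np

def top_k_labels(scores, label_names, k=10):
    """Return the k highest-scoring (label, score) pairs, best first."""
    order = np.argsort(scores)[::-1][:k]
    return [(label_names[i], float(scores[i])) for i in order]

# Illustrative subset only; a real score vector has one entry per label.
label_names = ["Newscaster", "Food", "Football", "Vehicle"]
scores = np.array([0.04, 0.17, 0.63, 0.46])
top3 = top_k_labels(scores, label_names, k=3)
print(top3)
```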
Dear @heiwushi,
Thank you for reporting the problem. Yeah, this is strange....
My PCA matrix or my video recognition model seems to have some problems (maybe it needs more training?). So I have restarted training my video recognition model.
If the result changes, I will let you know. Sorry for the inconvenience.
Did you use the youtube8m dataset provided by Google to train your model? The PCA matrix was not released with that dataset, so we can't predict on new videos outside youtube8m. So I am not sure whether you downloaded the original video files from YouTube to calculate the PCA matrix and train your own model rather than using the youtube8m dataset.