Closed LekshmyHari closed 6 years ago
@LekshmyHari I suspect you did not use a high-accuracy model. Please read my home page at
https://github.com/chen0040/keras-video-classifier
You can take a look there for the LSTM with VGG16 encoder model.
In fact, this is what I got: "The LSTM with VGG16 (top not included) feature extractor: (accuracy around 100% for training and 98.83% for validation)"
I have used the already trained models that were available in the demo/models/UCF-101 folder. Please let me know if I am missing anything here.
These were the steps I followed.
Please help me identify what could have gone wrong.
Much thanks.
Hi @LekshmyHari, your steps seem to be correct, as the project only needs Keras (I used TensorFlow as the backend and Python 3.6 in my virtualenv) and the other dependencies defined in requirements.txt. I have just added the calculation and printout of accuracy to the vgg16_bidirectional_lstm_predict.py script as well as the other predict scripts, and am running the script on my computer. The execution of vgg16_bidirectional_lstm_predict.py has not finished yet, but below is the current printout:
C:\Users\chen0\git\keras-video-classifier\venv\Scripts\python.exe C:/Users/chen0/git/keras-video-classifier/demo/vgg16_bidirectional_lstm_predict.py Using TensorFlow backend. 2018-03-21 20:23:20.179366: I C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2 Extracting frames from video: ./very_large_data/UCF-101\ApplyEyeMakeup\v_ApplyEyeMakeup_g16_c04.avi predicted: BlowingCandles actual: ApplyEyeMakeup accuracy: 0.0 Extracting frames from video: ./very_large_data/UCF-101\BoxingPunchingBag\v_BoxingPunchingBag_g24_c07.avi predicted: BoxingPunchingBag actual: BoxingPunchingBag accuracy: 0.5 Extracting frames from video: ./very_large_data/UCF-101\ApplyEyeMakeup\v_ApplyEyeMakeup_g09_c06.avi predicted: ApplyEyeMakeup actual: ApplyEyeMakeup accuracy: 0.6666666666666666 Extracting frames from video: ./very_large_data/UCF-101\Billiards\v_Billiards_g09_c07.avi predicted: Billiards actual: Billiards accuracy: 0.75 Extracting frames from video: ./very_large_data/UCF-101\BodyWeightSquats\v_BodyWeightSquats_g08_c02.avi predicted: BlowingCandles actual: BodyWeightSquats accuracy: 0.6 Extracting frames from video: ./very_large_data/UCF-101\ApplyLipstick\v_ApplyLipstick_g12_c01.avi predicted: ApplyLipstick actual: ApplyLipstick accuracy: 0.6666666666666666 Extracting frames from video: ./very_large_data/UCF-101\Bowling\v_Bowling_g02_c04.avi predicted: Bowling actual: Bowling accuracy: 0.7142857142857143 Extracting frames from video: ./very_large_data/UCF-101\BasketballDunk\v_BasketballDunk_g24_c01.avi predicted: BasketballDunk actual: BasketballDunk accuracy: 0.75 Extracting frames from video: ./very_large_data/UCF-101\BaseballPitch\v_BaseballPitch_g08_c04.avi predicted: BaseballPitch actual: BaseballPitch accuracy: 0.7777777777777778 Extracting frames from video: ./very_large_data/UCF-101\BaseballPitch\v_BaseballPitch_g07_c07.avi 
predicted: BaseballPitch actual: BaseballPitch accuracy: 0.8 Extracting frames from video: ./very_large_data/UCF-101\ApplyEyeMakeup\v_ApplyEyeMakeup_g12_c05.avi predicted: BlowDryHair actual: ApplyEyeMakeup accuracy: 0.7272727272727273 Extracting frames from video: ./very_large_data/UCF-101\ApplyEyeMakeup\v_ApplyEyeMakeup_g05_c01.avi predicted: ApplyEyeMakeup actual: ApplyEyeMakeup accuracy: 0.75 Extracting frames from video: ./very_large_data/UCF-101\ApplyEyeMakeup\v_ApplyEyeMakeup_g23_c06.avi predicted: ApplyEyeMakeup actual: ApplyEyeMakeup accuracy: 0.7692307692307693 Extracting frames from video: ./very_large_data/UCF-101\BabyCrawling\v_BabyCrawling_g16_c04.avi predicted: BabyCrawling actual: BabyCrawling accuracy: 0.7857142857142857 Extracting frames from video: ./very_large_data/UCF-101\Biking\v_Biking_g18_c04.avi predicted: Biking actual: Biking accuracy: 0.8 Extracting frames from video: ./very_large_data/UCF-101\Bowling\v_Bowling_g13_c06.avi predicted: Bowling actual: Bowling accuracy: 0.8125 Extracting frames from video: ./very_large_data/UCF-101\Biking\v_Biking_g16_c01.avi predicted: Archery actual: Biking accuracy: 0.7647058823529411 Extracting frames from video: ./very_large_data/UCF-101\Billiards\v_Billiards_g08_c02.avi predicted: Billiards actual: Billiards accuracy: 0.7777777777777778 Extracting frames from video: ./very_large_data/UCF-101\BalanceBeam\v_BalanceBeam_g12_c01.avi predicted: BalanceBeam actual: BalanceBeam accuracy: 0.7894736842105263 Extracting frames from video: ./very_large_data/UCF-101\BreastStroke\v_BreastStroke_g08_c04.avi predicted: BreastStroke actual: BreastStroke accuracy: 0.8 Extracting frames from video: ./very_large_data/UCF-101\Biking\v_Biking_g18_c01.avi predicted: Biking actual: Biking accuracy: 0.8095238095238095 Extracting frames from video: ./very_large_data/UCF-101\BreastStroke\v_BreastStroke_g22_c01.avi predicted: BreastStroke actual: BreastStroke accuracy: 0.8181818181818182 Extracting frames from video: 
./very_large_data/UCF-101\Basketball\v_Basketball_g05_c02.avi predicted: Archery actual: Basketball accuracy: 0.782608695652174 Extracting frames from video: ./very_large_data/UCF-101\BenchPress\v_BenchPress_g24_c07.avi predicted: BenchPress actual: BenchPress accuracy: 0.7916666666666666 Extracting frames from video: ./very_large_data/UCF-101\BlowingCandles\v_BlowingCandles_g01_c02.avi predicted: BlowingCandles actual: BlowingCandles accuracy: 0.8 Extracting frames from video: ./very_large_data/UCF-101\BalanceBeam\v_BalanceBeam_g19_c03.avi predicted: BalanceBeam actual: BalanceBeam accuracy: 0.8076923076923077 Extracting frames from video: ./very_large_data/UCF-101\Basketball\v_Basketball_g11_c03.avi predicted: Basketball actual: Basketball accuracy: 0.8148148148148148 Extracting frames from video: ./very_large_data/UCF-101\BoxingPunchingBag\v_BoxingPunchingBag_g22_c01.avi predicted: BoxingPunchingBag actual: BoxingPunchingBag accuracy: 0.8214285714285714 Extracting frames from video: ./very_large_data/UCF-101\BlowDryHair\v_BlowDryHair_g08_c03.avi predicted: BrushingTeeth actual: BlowDryHair accuracy: 0.7931034482758621 Extracting frames from video: ./very_large_data/UCF-101\BodyWeightSquats\v_BodyWeightSquats_g25_c07.avi predicted: BodyWeightSquats actual: BodyWeightSquats accuracy: 0.8 Extracting frames from video: ./very_large_data/UCF-101\Bowling\v_Bowling_g01_c04.avi predicted: Bowling actual: Bowling accuracy: 0.8064516129032258 Extracting frames from video: ./very_large_data/UCF-101\BasketballDunk\v_BasketballDunk_g05_c03.avi predicted: BasketballDunk actual: BasketballDunk accuracy: 0.8125 Extracting frames from video: ./very_large_data/UCF-101\Billiards\v_Billiards_g05_c04.avi predicted: Billiards actual: Billiards accuracy: 0.8181818181818182 Extracting frames from video: ./very_large_data/UCF-101\BodyWeightSquats\v_BodyWeightSquats_g12_c01.avi predicted: BodyWeightSquats actual: BodyWeightSquats accuracy: 0.8235294117647058 Extracting frames from video: 
./very_large_data/UCF-101\ApplyLipstick\v_ApplyLipstick_g01_c04.avi predicted: ApplyLipstick actual: ApplyLipstick accuracy: 0.8285714285714286 Extracting frames from video: ./very_large_data/UCF-101\BrushingTeeth\v_BrushingTeeth_g02_c03.avi predicted: BrushingTeeth actual: BrushingTeeth accuracy: 0.8333333333333334 Extracting frames from video: ./very_large_data/UCF-101\BlowingCandles\v_BlowingCandles_g04_c01.avi predicted: BlowingCandles actual: BlowingCandles accuracy: 0.8378378378378378 Extracting frames from video: ./very_large_data/UCF-101\Billiards\v_Billiards_g25_c05.avi predicted: Billiards actual: Billiards accuracy: 0.8421052631578947 Extracting frames from video: ./very_large_data/UCF-101\ApplyEyeMakeup\v_ApplyEyeMakeup_g10_c01.avi predicted: BrushingTeeth actual: ApplyEyeMakeup accuracy: 0.8205128205128205 Extracting frames from video: ./very_large_data/UCF-101\BreastStroke\v_BreastStroke_g16_c01.avi predicted: BreastStroke actual: BreastStroke accuracy: 0.825 Extracting frames from video: ./very_large_data/UCF-101\BreastStroke\v_BreastStroke_g12_c02.avi predicted: Archery actual: BreastStroke accuracy: 0.8048780487804879 Extracting frames from video: ./very_large_data/UCF-101\ApplyLipstick\v_ApplyLipstick_g19_c03.avi predicted: ApplyLipstick actual: ApplyLipstick accuracy: 0.8095238095238095 Extracting frames from video: ./very_large_data/UCF-101\Biking\v_Biking_g12_c03.avi predicted: Biking actual: Biking accuracy: 0.813953488372093 Extracting frames from video: ./very_large_data/UCF-101\BlowDryHair\v_BlowDryHair_g06_c06.avi predicted: BlowDryHair actual: BlowDryHair accuracy: 0.8181818181818182 Extracting frames from video: ./very_large_data/UCF-101\Archery\v_Archery_g06_c02.avi predicted: Bowling actual: Archery accuracy: 0.8 Extracting frames from video: ./very_large_data/UCF-101\Biking\v_Biking_g15_c01.avi predicted: Biking actual: Biking accuracy: 0.8043478260869565 Extracting frames from video: 
./very_large_data/UCF-101\Bowling\v_Bowling_g05_c05.avi predicted: Bowling actual: Bowling accuracy: 0.8085106382978723 Extracting frames from video: ./very_large_data/UCF-101\Bowling\v_Bowling_g12_c01.avi predicted: BlowingCandles actual: Bowling accuracy: 0.7916666666666666 Extracting frames from video: ./very_large_data/UCF-101\BreastStroke\v_BreastStroke_g01_c02.avi predicted: BreastStroke actual: BreastStroke accuracy: 0.7959183673469388 Extracting frames from video: ./very_large_data/UCF-101\Bowling\v_Bowling_g10_c04.avi predicted: Bowling actual: Bowling accuracy: 0.8 Extracting frames from video: ./very_large_data/UCF-101\Billiards\v_Billiards_g25_c01.avi predicted: Billiards actual: Billiards accuracy: 0.803921568627451 Extracting frames from video: ./very_large_data/UCF-101\BaseballPitch\v_BaseballPitch_g05_c06.avi predicted: BaseballPitch actual: BaseballPitch accuracy: 0.8076923076923077 Extracting frames from video: ./very_large_data/UCF-101\BandMarching\v_BandMarching_g05_c04.avi predicted: BandMarching actual: BandMarching accuracy: 0.8113207547169812 Extracting frames from video: ./very_large_data/UCF-101\ApplyLipstick\v_ApplyLipstick_g22_c01.avi predicted: ApplyLipstick actual: ApplyLipstick accuracy: 0.8148148148148148 Extracting frames from video: ./very_large_data/UCF-101\BenchPress\v_BenchPress_g04_c06.avi predicted: BenchPress actual: BenchPress accuracy: 0.8181818181818182 Extracting frames from video: ./very_large_data/UCF-101\BoxingPunchingBag\v_BoxingPunchingBag_g05_c02.avi predicted: BoxingPunchingBag actual: BoxingPunchingBag accuracy: 0.8214285714285714 Extracting frames from video: ./very_large_data/UCF-101\BasketballDunk\v_BasketballDunk_g09_c01.avi predicted: BalanceBeam actual: BasketballDunk accuracy: 0.8070175438596491 Extracting frames from video: ./very_large_data/UCF-101\BenchPress\v_BenchPress_g04_c05.avi predicted: BenchPress actual: BenchPress accuracy: 0.8103448275862069 Extracting frames from video: 
./very_large_data/UCF-101\BasketballDunk\v_BasketballDunk_g25_c03.avi predicted: BasketballDunk actual: BasketballDunk accuracy: 0.8135593220338984 Extracting frames from video: ./very_large_data/UCF-101\Basketball\v_Basketball_g02_c06.avi predicted: Basketball actual: Basketball accuracy: 0.8166666666666667 Extracting frames from video: ./very_large_data/UCF-101\BandMarching\v_BandMarching_g05_c06.avi predicted: BandMarching actual: BandMarching accuracy: 0.819672131147541 Extracting frames from video: ./very_large_data/UCF-101\BandMarching\v_BandMarching_g20_c03.avi predicted: BandMarching actual: BandMarching accuracy: 0.8225806451612904 Extracting frames from video: ./very_large_data/UCF-101\BaseballPitch\v_BaseballPitch_g12_c02.avi predicted: BaseballPitch actual: BaseballPitch accuracy: 0.8253968253968254 Extracting frames from video: ./very_large_data/UCF-101\BlowingCandles\v_BlowingCandles_g05_c02.avi predicted: BlowingCandles actual: BlowingCandles accuracy: 0.828125 Extracting frames from video: ./very_large_data/UCF-101\Archery\v_Archery_g09_c05.avi predicted: Archery actual: Archery accuracy: 0.8307692307692308 Extracting frames from video: ./very_large_data/UCF-101\BrushingTeeth\v_BrushingTeeth_g13_c03.avi predicted: BrushingTeeth actual: BrushingTeeth accuracy: 0.8333333333333334 Extracting frames from video: ./very_large_data/UCF-101\Archery\v_Archery_g06_c06.avi predicted: Archery actual: Archery accuracy: 0.835820895522388 Extracting frames from video: ./very_large_data/UCF-101\BasketballDunk\v_BasketballDunk_g05_c02.avi predicted: BasketballDunk actual: BasketballDunk accuracy: 0.8382352941176471 Extracting frames from video: ./very_large_data/UCF-101\Billiards\v_Billiards_g18_c03.avi predicted: Billiards actual: Billiards accuracy: 0.8405797101449275 Extracting frames from video: ./very_large_data/UCF-101\BoxingSpeedBag\v_BoxingSpeedBag_g17_c07.avi predicted: BoxingSpeedBag actual: BoxingSpeedBag accuracy: 0.8428571428571429 Extracting frames 
from video: ./very_large_data/UCF-101\BandMarching\v_BandMarching_g05_c07.avi predicted: BandMarching actual: BandMarching accuracy: 0.8450704225352113 Extracting frames from video: ./very_large_data/UCF-101\Billiards\v_Billiards_g16_c04.avi predicted: Billiards actual: Billiards accuracy: 0.8472222222222222 Extracting frames from video: ./very_large_data/UCF-101\BabyCrawling\v_BabyCrawling_g09_c03.avi predicted: Archery actual: BabyCrawling accuracy: 0.8356164383561644 Extracting frames from video: ./very_large_data/UCF-101\BrushingTeeth\v_BrushingTeeth_g07_c06.avi predicted: BrushingTeeth actual: BrushingTeeth accuracy: 0.8378378378378378 Extracting frames from video: ./very_large_data/UCF-101\BenchPress\v_BenchPress_g21_c02.avi predicted: BenchPress actual: BenchPress accuracy: 0.84 Extracting frames from video: ./very_large_data/UCF-101\Billiards\v_Billiards_g19_c02.avi predicted: Billiards actual: Billiards accuracy: 0.8421052631578947 Extracting frames from video: ./very_large_data/UCF-101\BoxingPunchingBag\v_BoxingPunchingBag_g25_c02.avi predicted: BoxingPunchingBag actual: BoxingPunchingBag accuracy: 0.8441558441558441 Extracting frames from video: ./very_large_data/UCF-101\BaseballPitch\v_BaseballPitch_g09_c04.avi predicted: BaseballPitch actual: BaseballPitch accuracy: 0.8461538461538461 Extracting frames from video: ./very_large_data/UCF-101\Basketball\v_Basketball_g05_c04.avi predicted: Basketball actual: Basketball accuracy: 0.8481012658227848 Extracting frames from video: ./very
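The accuracy value in the printout above is a running figure: after each video it equals the number of correct predictions so far divided by the number of videos seen so far (0.0 after one miss, 0.5 after one hit in two, and so on). A minimal sketch of that bookkeeping follows; the function name is hypothetical and this is not the code from the predict scripts.

```python
def running_accuracy(pairs):
    """Yield the cumulative accuracy after each (predicted, actual) pair."""
    correct = 0
    for count, (predicted, actual) in enumerate(pairs, start=1):
        if predicted == actual:
            correct += 1
        yield correct / count


# First five records from the printout above.
records = [
    ("BlowingCandles", "ApplyEyeMakeup"),
    ("BoxingPunchingBag", "BoxingPunchingBag"),
    ("ApplyEyeMakeup", "ApplyEyeMakeup"),
    ("Billiards", "Billiards"),
    ("BlowingCandles", "BodyWeightSquats"),
]
for accuracy in running_accuracy(records):
    print("accuracy:", accuracy)
```

Because the value is cumulative, early misses weigh heavily and the figure only settles after many videos have been processed.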
Thanks for the reply @chen0040
I downloaded the uploaded files and here is what my output looks like at the moment. It is still running.
python3 vgg16_bidirectional_lstm_predict.py
/usr/local/lib/python3.5/dist-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
2018-03-22 09:48:41.448405: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
Extracting frames from video: ./very_large_data/UCF-101/WalkingWithDog/v_WalkingWithDog_g09_c04.avi
predicted: Archery actual: WalkingWithDog
accuracy: 0.0
Extracting frames from video: ./very_large_data/UCF-101/CleanAndJerk/v_CleanAndJerk_g14_c05.avi
predicted: BalanceBeam actual: CleanAndJerk
accuracy: 0.0
Extracting frames from video: ./very_large_data/UCF-101/ParallelBars/v_ParallelBars_g24_c04.avi
predicted: BandMarching actual: ParallelBars
accuracy: 0.0
Extracting frames from video: ./very_large_data/UCF-101/TrampolineJumping/v_TrampolineJumping_g22_c06.avi
predicted: Basketball actual: TrampolineJumping
accuracy: 0.0
Extracting frames from video: ./very_large_data/UCF-101/WalkingWithDog/v_WalkingWithDog_g02_c01.avi
predicted: BandMarching actual: WalkingWithDog
accuracy: 0.0
Extracting frames from video: ./very_large_data/UCF-101/Archery/v_Archery_g16_c04.avi
predicted: Archery actual: Archery
accuracy: 0.16666666666666666
Extracting frames from video: ./very_large_data/UCF-101/Rowing/v_Rowing_g09_c02.avi
predicted: BlowingCandles actual: Rowing
accuracy: 0.14285714285714285
Extracting frames from video: ./very_large_data/UCF-101/LongJump/v_LongJump_g09_c06.avi
predicted: BasketballDunk actual: LongJump
accuracy: 0.125
Extracting frames from video: ./very_large_data/UCF-101/BaseballPitch/v_BaseballPitch_g03_c07.avi
predicted: BaseballPitch actual: BaseballPitch
accuracy: 0.2222222222222222
Extracting frames from video: ./very_large_data/UCF-101/Archery/v_Archery_g07_c05.avi
predicted: BodyWeightSquats actual: Archery
accuracy: 0.2
Extracting frames from video: ./very_large_data/UCF-101/ParallelBars/v_ParallelBars_g06_c01.avi
predicted: Basketball actual: ParallelBars
accuracy: 0.18181818181818182
Extracting frames from video: ./very_large_data/UCF-101/TrampolineJumping/v_TrampolineJumping_g02_c05.avi
predicted: BaseballPitch actual: TrampolineJumping
accuracy: 0.16666666666666666
Extracting frames from video: ./very_large_data/UCF-101/JumpingJack/v_JumpingJack_g11_c03.avi
predicted: BenchPress actual: JumpingJack
accuracy: 0.15384615384615385
Extracting frames from video: ./very_large_data/UCF-101/WalkingWithDog/v_WalkingWithDog_g23_c01.avi
predicted: Basketball actual: WalkingWithDog
accuracy: 0.14285714285714285
Extracting frames from video: ./very_large_data/UCF-101/BodyWeightSquats/v_BodyWeightSquats_g15_c02.avi
predicted: BodyWeightSquats actual: BodyWeightSquats
accuracy: 0.2
Extracting frames from video: ./very_large_data/UCF-101/SumoWrestling/v_SumoWrestling_g12_c04.avi
predicted: BasketballDunk actual: SumoWrestling
accuracy: 0.1875
Extracting frames from video: ./very_large_data/UCF-101/BodyWeightSquats/v_BodyWeightSquats_g20_c05.avi
predicted: BodyWeightSquats actual: BodyWeightSquats
accuracy: 0.23529411764705882
Extracting frames from video: ./very_large_data/UCF-101/LongJump/v_LongJump_g23_c01.avi
predicted: BenchPress actual: LongJump
accuracy: 0.2222222222222222
Extracting frames from video: ./very_large_data/UCF-101/TrampolineJumping/v_TrampolineJumping_g17_c04.avi
predicted: BandMarching actual: TrampolineJumping
accuracy: 0.21052631578947367
Extracting frames from video: ./very_large_data/UCF-101/JumpingJack/v_JumpingJack_g10_c03.avi
predicted: BasketballDunk actual: JumpingJack
accuracy: 0.2
Extracting frames from video: ./very_large_data/UCF-101/Rowing/v_Rowing_g06_c04.avi
predicted: Basketball actual: Rowing
accuracy: 0.19047619047619047
Extracting frames from video: ./very_large_data/UCF-101/Rowing/v_Rowing_g13_c01.avi
predicted: Archery actual: Rowing
accuracy: 0.18181818181818182
Extracting frames from video: ./very_large_data/UCF-101/PlayingTabla/v_PlayingTabla_g04_c06.avi
predicted: BlowingCandles actual: PlayingTabla
accuracy: 0.17391304347826086
Extracting frames from video: ./very_large_data/UCF-101/PushUps/v_PushUps_g20_c01.avi
predicted: BreastStroke actual: PushUps
accuracy: 0.16666666666666666
Extracting frames from video: ./very_large_data/UCF-101/SumoWrestling/v_SumoWrestling_g06_c04.avi
predicted: BandMarching actual: SumoWrestling
accuracy: 0.16
Extracting frames from video: ./very_large_data/UCF-101/Knitting/v_Knitting_g11_c01.avi
predicted: BrushingTeeth actual: Knitting
accuracy: 0.15384615384615385
Extracting frames from video: ./very_large_data/UCF-101/Archery/v_Archery_g23_c02.avi
predicted: Archery actual: Archery
accuracy: 0.18518518518518517
Extracting frames from video: ./very_large_data/UCF-101/Archery/v_Archery_g10_c06.avi
predicted: Archery actual: Archery
accuracy: 0.21428571428571427
Extracting frames from video: ./very_large_data/UCF-101/CricketShot/v_CricketShot_g25_c04.avi
predicted: Basketball actual: CricketShot
accuracy: 0.20689655172413793
Extracting frames from video: ./very_large_data/UCF-101/ParallelBars/v_ParallelBars_g07_c06.avi
predicted: BalanceBeam actual: ParallelBars
accuracy: 0.2
Extracting frames from video: ./very_large_data/UCF-101/PushUps/v_PushUps_g24_c02.avi
predicted: BlowingCandles actual: PushUps
accuracy: 0.1935483870967742
I hope there is no dependency on the order in which directories are considered for testing.
If you have any suggestions on how to increase the accuracy, kindly let me know.
There is no dependency on the order in which directories are considered. Your results do indicate that the accuracy is very low compared to mine. It might be that on your computer there is some issue with the class_id ==> class_label mapping. I will try a clean install to test this and see whether I can reproduce the accuracy issue you encountered.
@LekshmyHari after examining the code for scan_ucf, I suspect what you said is right: in scan_ucf() I loaded the first 20 video sub-folders (20 classes) in the UCF-101 directory for training. But if you are running on a computer that has a different folder order than mine, it is likely that the video sub-folders processed on your computer are different from mine. I am currently working on a fix to see if I can load the correct folders for prediction, as on my machine.
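To illustrate the folder-order problem described above: Python's os.listdir returns entries in arbitrary order, so "the first 20 sub-folders" can differ from one machine to another. The sketch below sorts the names before slicing so that every machine picks the same classes. The helper name is hypothetical and this is not the project's scan_ucf.

```python
import os


def first_n_class_dirs(data_dir, n):
    """Return the first n class sub-folder names in sorted order."""
    names = [
        name for name in os.listdir(data_dir)
        if os.path.isdir(os.path.join(data_dir, name))
    ]
    # os.listdir gives no ordering guarantee, so sort before slicing.
    return sorted(names)[:n]
```

Sorting only helps if the model was also trained on the alphabetically first classes, which is an assumption here.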
Many thanks for the efforts.
Please also let me know if you have any related papers on this topic.
@LekshmyHari I have replaced scan_ucf in the predict scripts with scan_ucf_with_labels, which should fetch the right video sub-folders to test the prediction. Can you git pull and re-run? Thanks.
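One way to make prediction independent of folder order is to look up sub-folders by the class labels the model was trained on, rather than by position in a directory listing. The sketch below shows only that general idea. Its function name, signature, and return value are hypothetical and are not taken from scan_ucf_with_labels.

```python
import os


def scan_videos_for_labels(data_dir, labels, extension=".avi"):
    """Map each video path to its label, visiting only the given labels."""
    videos = {}
    for label in labels:
        class_dir = os.path.join(data_dir, label)
        if not os.path.isdir(class_dir):
            continue  # this label has no folder on this machine; skip it
        for file_name in sorted(os.listdir(class_dir)):
            if file_name.endswith(extension):
                videos[os.path.join(class_dir, file_name)] = label
    return videos
```

With this approach, videos from classes the model never saw (for example WalkingWithDog in the earlier output) are not used for evaluation.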
Seems fine now. Great !
Please let me know of any related paper for this work.
Thanks for reporting the issue. I developed the current deep learning models for fun, as a hobby project. They were not published as a paper :)
That's great :). Thanks for helping.
I have one doubt. What exactly is the difference between 'top included' and 'top not included'?
Hi, 'top included' refers to using the top layer of the VGG16 network as the encoded output for the video frame; 'top not included' refers to using the next-to-top layer (flattened) of the VGG16 network as the encoded output for the video frame.
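To make the two options concrete, here is the shape arithmetic for the standard VGG16 architecture with a 224x224 input. The numbers come from the published architecture; how the project wires them in is an assumption. The top layer is a 1000-way ImageNet classifier, while the last pooling layer without the top is a 7x7x512 volume that flattens to 25088 values per frame.

```python
def vgg16_feature_dim(include_top, input_size=224):
    """Per-frame feature length for the standard VGG16 architecture."""
    if include_top:
        return 1000  # ImageNet class scores from the final dense layer
    # Five 2x2 max-pooling stages each halve the spatial size.
    side = input_size // (2 ** 5)
    return side * side * 512  # 512 channels in the last convolutional block


print(vgg16_feature_dim(include_top=True))   # 1000
print(vgg16_feature_dim(include_top=False))  # 25088
```

In Keras this choice corresponds to the include_top argument of the VGG16 application model.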
I was trying to understand a bit more about the architecture, as I am aiming to train the model on a new dataset. Please help me clarify the points below.
MAX_ALLOWED_FRAMES controls shape[1] of the input data to the bidirectional LSTM layer (remember that the LSTM layer takes 3-dimensional input). After VGG16 feature extraction, each video is converted to a 2-dimensional array of shape (MAX_ALLOWED_FRAMES, frame_feature_dimension) and fed into the recurrent network as a 3-dimensional batch of shape (batch_size, MAX_ALLOWED_FRAMES, frame_feature_dimension).
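As a sketch of the shapes involved, the snippet below forces every video to the same number of frames by truncating long videos and zero-padding short ones, then stacks them into a batch. Whether the project pads, truncates, or samples frames is an assumption here, and plain lists stand in for the NumPy arrays a real pipeline would use.

```python
def fit_to_frame_count(frame_features, max_frames):
    """Truncate or zero-pad a list of per-frame feature vectors."""
    feature_dim = len(frame_features[0])
    fitted = [list(vec) for vec in frame_features[:max_frames]]
    while len(fitted) < max_frames:
        fitted.append([0.0] * feature_dim)
    return fitted


MAX_ALLOWED_FRAMES = 4  # small value for illustration only
short_video = [[1.0, 2.0, 3.0]] * 2  # 2 frames, 3 features each
long_video = [[4.0, 5.0, 6.0]] * 6   # 6 frames, 3 features each

batch = [fit_to_frame_count(v, MAX_ALLOWED_FRAMES) for v in (short_video, long_video)]
shape = (len(batch), len(batch[0]), len(batch[0][0]))
print(shape)  # (2, 4, 3): (batch_size, MAX_ALLOWED_FRAMES, frame_feature_dimension)
```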
train_test_split from scikit-learn randomly samples the dataset (of course you can define a random state, which allows you to reproduce results if needed). I am going to change the API so that you can pass test_size and random_state as optional arguments to the fit() method.
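The effect of a fixed random state can be shown with a standard-library stand-in. This is not scikit-learn's implementation, and the function name is hypothetical; it only mirrors the test_size and random_state parameters that scikit-learn's train_test_split accepts.

```python
import random


def split_dataset(samples, test_size=0.3, random_state=None):
    """Shuffle samples and split them into (train, test) lists."""
    rng = random.Random(random_state)  # a fixed seed gives a repeatable shuffle
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_size))
    return shuffled[n_test:], shuffled[:n_test]


videos = ["video_%02d.avi" % i for i in range(10)]
train_a, test_a = split_dataset(videos, test_size=0.3, random_state=42)
train_b, test_b = split_dataset(videos, test_size=0.3, random_state=42)
print(test_a == test_b)  # True: the same seed reproduces the same split
```

Leaving random_state as None draws a fresh shuffle on each call, so reported accuracies can vary between runs.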
UCF-101 contains labeled data, with the categories listed as sub-folders of the UCF-101 download; therefore the labels are there, but limited to that list. You can find out more about UCF-101 at http://crcv.ucf.edu/data/UCF101.php
Much thanks for your help.
:) I was just trying to get some clarity on the training process, in fact, such as on what basis the learning happens. We have a dataset with certain sub-folders, whose names indicate the activity in the videos within them. So during training the model learns the temporal and spatial features of each video, and the weights are adjusted until the predicted activity is the same as the activity given by the name of the directory. Am I right about this?
Hello, yes, you are right.
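As a small illustration of how folder names can become training targets, the sketch below assigns each class name an integer id in sorted order, similar in form to the name-to-index dictionary that the training script prints, and turns a label into a one-hot vector. The helper names are hypothetical and the project's own encoding may differ.

```python
def build_label_index(class_names):
    """Assign each class name an integer id in sorted order."""
    return {name: idx for idx, name in enumerate(sorted(class_names))}


def one_hot(label, label_index):
    """Return a one-hot target vector for the given label."""
    vector = [0] * len(label_index)
    vector[label_index[label]] = 1
    return vector


label_index = build_label_index(["Archery", "ApplyLipstick", "ApplyEyeMakeup"])
print(label_index)                      # {'ApplyEyeMakeup': 0, 'ApplyLipstick': 1, 'Archery': 2}
print(one_hot("Archery", label_index))  # [0, 0, 1]
```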
Thanks. As I mentioned earlier, I am trying to test it on a new dataset. The dataset is untrimmed (each video may be about 1 minute long), and the activity we are interested in may occur towards the middle of the video. Would the model work in such a scenario too? Please share your thoughts. In any case, I will let you know the accuracy once I finish testing.
Sorry, I cannot help you with things outside my open source project.
Hi @chen0040, have you encountered any issues while training? I tried to train by executing vgg16_bidirectional_lstm_train.py. There were no error messages, but it seems to be stuck at Epoch 1/20. I don't see any messages other than "Epoch 1/20". I tried the same on a GPU machine as well, still no luck.
Hi @LekshmyHari, I did not encounter the issue you mention while training with the latest version of the code. Below is the output from my current run in PyCharm. I have also tried running the program from the command line, and you can see the progress of the epochs in the screenshot below. Maybe you can git pull the source code and try again; there might be some updates that fix the issue you mentioned.
=========================================================================================================================================================
C:\Users\chen0\git\keras-video-classifier\venv\Scripts\python.exe C:/Users/chen0/git/keras-video-classifier/demo/vgg16_bidirectional_lstm_train.py
Using TensorFlow backend.
2018-04-13 14:45:16.798605: I C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
max frames: 32
expected frames: 7
{'ApplyEyeMakeup': 0, 'ApplyLipstick': 1, 'Archery': 2, 'BabyCrawling': 3, 'BalanceBeam': 4, 'BandMarching': 5, 'BaseballPitch': 6, 'Basketball': 7, 'BasketballDunk': 8, 'BenchPress': 9, 'Biking': 10, 'Billiards': 11, 'BlowDryHair': 12, 'BlowingCandles': 13, 'BodyWeightSquats': 14, 'Bowling': 15, 'BoxingPunchingBag': 16, 'BoxingSpeedBag': 17, 'BreastStroke': 18, 'BrushingTeeth': 19}
Epoch 1/20
1/29 [>.............................] - ETA: 8:05 - loss: 2.9956 - acc: 0.0469 2/29 [=>............................] - ETA: 4:11 - loss: 2.9923 - acc: 0.0938 3/29 [==>...........................] - ETA: 2:49 - loss: 2.9937 - acc: 0.0938 4/29 [===>..........................] - ETA: 2:07 - loss: 2.9902 - acc: 0.0977 5/29 [====>.........................] - ETA: 1:40 - loss: 2.9785 - acc: 0.1187 6/29 [=====>........................] - ETA: 1:22 - loss: 2.9766 - acc: 0.1120 7/29 [======>.......................] - ETA: 1:09 - loss: 2.9617 - acc: 0.1362 8/29 [=======>......................] - ETA: 59s - loss: 2.9447 - acc: 0.1367 9/29 [========>.....................] - ETA: 51s - loss: 2.9150 - acc: 0.1476 10/29 [=========>....................] - ETA: 45s - loss: 2.9069 - acc: 0.1484 11/29 [==========>...................] - ETA: 39s - loss: 2.9023 - acc: 0.1463 12/29 [===========>..................] - ETA: 35s - loss: 2.8873 - acc: 0.1510 13/29 [============>.................] - ETA: 31s - loss: 2.8704 - acc: 0.1526 14/29 [=============>................] - ETA: 27s - loss: 2.8577 - acc: 0.1518 15/29 [==============>...............] - ETA: 24s - loss: 2.8429 - acc: 0.1573 16/29 [===============>..............] - ETA: 21s - loss: 2.8251 - acc: 0.1631 17/29 [================>.............] - ETA: 19s - loss: 2.8095 - acc: 0.1645 18/29 [=================>............] - ETA: 17s - loss: 2.7811 - acc: 0.1684 19/29 [==================>...........] - ETA: 15s - loss: 2.7818 - acc: 0.1661 20/29 [===================>..........] - ETA: 13s - loss: 2.7753 - acc: 0.1695 21/29 [====================>.........] - ETA: 11s - loss: 2.7501 - acc: 0.1786 22/29 [=====================>........] - ETA: 9s - loss: 2.7285 - acc: 0.1839 23/29 [======================>.......] - ETA: 8s - loss: 2.7037 - acc: 0.1923 24/29 [=======================>......] - ETA: 6s - loss: 2.6925 - acc: 0.1921 25/29 [========================>.....] 
- ETA: 5s - loss: 2.6841 - acc: 0.1900 26/29 [=========================>....] - ETA: 3s - loss: 2.6648 - acc: 0.1923 27/29 [==========================>...] - ETA: 2s - loss: 2.6532 - acc: 0.1944 28/29 [===========================>..] - ETA: 1s - loss: 2.6438 - acc: 0.1936 29/29 [==============================] - 38s 1s/step - loss: 2.6290 - acc: 0.1967 - val_loss: 2.1612 - val_acc: 0.2552 Epoch 2/20
1/29 [>.............................] - ETA: 45s - loss: 2.2218 - acc: 0.2500 2/29 [=>............................] - ETA: 34s - loss: 2.1187 - acc: 0.2969 3/29 [==>...........................] - ETA: 29s - loss: 2.1836 - acc: 0.2448 4/29 [===>..........................] - ETA: 25s - loss: 2.2066 - acc: 0.2500 5/29 [====>.........................] - ETA: 22s - loss: 2.1621 - acc: 0.2750 6/29 [=====>........................] - ETA: 19s - loss: 2.1646 - acc: 0.2708 7/29 [======>.......................] - ETA: 17s - loss: 2.1305 - acc: 0.2879 8/29 [=======>......................] - ETA: 16s - loss: 2.1242 - acc: 0.2930 9/29 [========>.....................] - ETA: 14s - loss: 2.1103 - acc: 0.3073 10/29 [=========>....................] - ETA: 14s - loss: 2.1007 - acc: 0.3047 11/29 [==========>...................] - ETA: 13s - loss: 2.1113 - acc: 0.3026 12/29 [===========>..................] - ETA: 12s - loss: 2.0917 - acc: 0.3073 13/29 [============>.................] - ETA: 11s - loss: 2.0805 - acc: 0.3101 14/29 [=============>................] - ETA: 10s - loss: 2.0676 - acc: 0.3136 15/29 [==============>...............] - ETA: 9s - loss: 2.0622 - acc: 0.3146 16/29 [===============>..............] - ETA: 8s - loss: 2.0566 - acc: 0.3223 17/29 [================>.............] - ETA: 8s - loss: 2.0457 - acc: 0.3263 18/29 [=================>............] - ETA: 7s - loss: 2.0292 - acc: 0.3325 19/29 [==================>...........] - ETA: 6s - loss: 2.0320 - acc: 0.3281 20/29 [===================>..........] - ETA: 5s - loss: 2.0353 - acc: 0.3273 21/29 [====================>.........] - ETA: 5s - loss: 2.0170 - acc: 0.3341 22/29 [=====================>........] - ETA: 4s - loss: 2.0040 - acc: 0.3409 23/29 [======================>.......] - ETA: 3s - loss: 1.9922 - acc: 0.3471 24/29 [=======================>......] - ETA: 3s - loss: 1.9869 - acc: 0.3483 25/29 [========================>.....] - ETA: 2s - loss: 1.9854 - acc: 0.3463 26/29 [=========================>....] 
- ETA: 1s - loss: 1.9771 - acc: 0.3474 27/29 [==========================>...] - ETA: 1s - loss: 1.9722 - acc: 0.3461 28/29 [===========================>..] - ETA: 0s - loss: 1.9702 - acc: 0.3460 29/29 [==============================] - 20s 706ms/step - loss: 1.9611 - acc: 0.3481 - val_loss: 1.6664 - val_acc: 0.3971 Epoch 3/20
1/29 [>.............................] - ETA: 37s - loss: 1.7314 - acc: 0.3594 2/29 [=>............................] - ETA: 32s - loss: 1.6399 - acc: 0.4141 3/29 [==>...........................] - ETA: 28s - loss: 1.7221 - acc: 0.3750 4/29 [===>..........................] - ETA: 24s - loss: 1.7808 - acc: 0.3516 5/29 [====>.........................] - ETA: 21s - loss: 1.7418 - acc: 0.3719 6/29 [=====>........................] - ETA: 19s - loss: 1.7320 - acc: 0.3828 7/29 [======>.......................] - ETA: 17s - loss: 1.6980 - acc: 0.3996 8/29 [=======>......................] - ETA: 16s - loss: 1.6935 - acc: 0.4004 9/29 [========>.....................] - ETA: 14s - loss: 1.6862 - acc: 0.4062 10/29 [=========>....................] - ETA: 13s - loss: 1.6900 - acc: 0.4047 11/29 [==========>...................] - ETA: 12s - loss: 1.7079 - acc: 0.3949 12/29 [===========>..................] - ETA: 11s - loss: 1.6947 - acc: 0.4010 13/29 [============>.................] - ETA: 10s - loss: 1.6932 - acc: 0.4062 14/29 [=============>................] - ETA: 9s - loss: 1.6863 - acc: 0.4096 15/29 [==============>...............] - ETA: 9s - loss: 1.6833 - acc: 0.4094 16/29 [===============>..............] - ETA: 8s - loss: 1.6808 - acc: 0.4131 17/29 [================>.............] - ETA: 7s - loss: 1.6674 - acc: 0.4154 18/29 [=================>............] - ETA: 6s - loss: 1.6488 - acc: 0.4236 19/29 [==================>...........] - ETA: 6s - loss: 1.6457 - acc: 0.4227 20/29 [===================>..........] - ETA: 5s - loss: 1.6468 - acc: 0.4234
On Fri, Apr 13, 2018 at 2:21 PM, LekshmyHari wrote:
Hi @chen0040, have you encountered any issues while training? I tried to run the training by executing vgg16_bidirectional_lstm_train.py. There were no error messages, but it seems to be stuck at Epoch 1/20: I don't see any messages other than "Epoch 1/20". I tried the same on a GPU machine as well, still no luck.
@LekshmyHari Another possible reason the epoch got stuck (though it sounds unlikely) is that your dependency versions are outdated. If you suspect this might be the case, take a look at the latest requirements.txt, in which I specify the versions of some of the dependencies I use in my environment (e.g. keras, tensorflow, numpy), and see whether updating these dependencies fixes the issue you encountered.
I have started the training after updating the environment. I am getting the below error.
C:\Users\lekshmy\Desktop\PSAV\keras-video-classifier-master\demo>python vgg16_bidirectional_lstm_train.py
C:\Users\lekshmy\AppData\Local\Programs\Python\Python36\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
Using TensorFlow backend.
2018-04-29 09:57:48.266561: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
max frames: 4730
expected frames: 125
{'Abuse': 0, 'Explosion': 1, 'Fighting': 2, 'RoadAccidents': 3, 'Shooting': 4, 'Shoplifting': 5}
Epoch 1/20
Traceback (most recent call last):
File "vgg16_bidirectional_lstm_train.py", line 38, in
When I searched for this error, I found a suggestion that we have to call model.flatten() before adding the dense layer, but that ended in the error below.
C:\Users\lekshmy\Desktop\PSAV\keras-video-classifier-master\demo>python vgg16_bidirectional_lstm_train.py
C:\Users\lekshmy\AppData\Local\Programs\Python\Python36\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
Using TensorFlow backend.
2018-04-29 10:27:50.095026: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
max frames: 4730
expected frames: 125
{'Abuse': 0, 'Explosion': 1, 'Fighting': 2, 'RoadAccidents': 3, 'Shooting': 4, 'Shoplifting': 5}
Traceback (most recent call last):
File "vgg16_bidirectional_lstm_train.py", line 38, in
It does sound like you have some issues with dependency versions. Could you share the pip freeze output of your environment? I will try to reproduce the error you had. I have also uploaded the frozen requirements of my venv (https://github.com/chen0040/keras-video-classifier/blob/master/requirements-on-my-python-env.txt), so you can compare your keras, python and tensorflow versions with mine.
Here is the pip freeze output.
absl-py==0.1.13
astor==0.6.2
bleach==1.5.0
cycler==0.10.0
gast==0.2.0
gevent==1.2.2
greenlet==0.4.13
grpcio==1.10.0
h5py==2.7.1
html5lib==0.9999999
Keras==2.1.5
kiwisolver==1.0.1
Markdown==2.6.11
matplotlib==2.2.2
numpy==1.14.2
opencv-python==3.4.0.12
patool==1.12
Pillow==5.1.0
protobuf==3.5.2.post1
pyparsing==2.2.0
python-dateutil==2.7.2
pytz==2018.3
PyYAML==3.12
scikit-learn==0.19.1
scipy==1.0.1
six==1.11.0
tensorboard==1.7.0
tensorflow==1.7.0
termcolor==1.1.0
Werkzeug==0.14.1
And python version is 3.6.2.
Hi @chen0040, do you think the versions are the problem? Should I change them and check? Kindly let me know your views.
Yes, please change the versions. If you have the same versions as on my computer (basically keras, tensorflow and numpy), it should work fine. I have been quite busy with work recently, but I will try to find some time to test the latest versions of keras and tensorflow, like those in your pip freeze output.
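Comparing two pip freeze outputs by eye is error-prone, so here is a minimal sketch of doing it programmatically. `parse_freeze` and `diff_versions` are hypothetical helper names, not part of this project, and the reference pins below are placeholders for illustration; substitute the contents of whichever requirements file you are comparing against.

```python
def parse_freeze(text):
    """Parse `pip freeze` output into {lowercased package name: version}."""
    versions = {}
    # split() handles both one-pin-per-line and whitespace-flattened output.
    for token in text.split():
        if "==" not in token:
            continue  # skip anything that is not a name==version pin
        name, version = token.split("==", 1)
        versions[name.lower()] = version
    return versions


def diff_versions(installed, reference, packages):
    """Return {package: (installed, reference)} for packages whose versions differ."""
    differing = {}
    for pkg in packages:
        have = installed.get(pkg)   # None if the package is not installed
        want = reference.get(pkg)   # None if the package is not pinned
        if have != want:
            differing[pkg] = (have, want)
    return differing


# Excerpt of a pip freeze output like the one posted in this thread.
installed = parse_freeze("h5py==2.7.1 Keras==2.1.5 numpy==1.14.2 tensorflow==1.7.0")

# Placeholder pins (assumption): replace with the reference environment's pins.
reference = parse_freeze("h5py==2.7.1 Keras==2.1.2 numpy==1.13.3 tensorflow==1.4.0")

mismatches = diff_versions(installed, reference, ["keras", "tensorflow", "numpy", "h5py"])
for pkg, (have, want) in sorted(mismatches.items()):
    print(f"{pkg}: installed {have}, reference {want}")
```

Package names are lowercased before comparison because pip freeze reports `Keras` with a capital letter, while requirements files often spell it `keras`.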
Hi, I changed the versions. Please find below the pip freeze output and the output of the training.
C:\Users\lekshmy\Desktop\PSAV\keras-video-classifier-master\demo>pip3 freeze
absl-py==0.1.13
astor==0.6.2
bleach==1.5.0
cycler==0.10.0
enum34==1.1.6
gast==0.2.0
gevent==1.2.2
greenlet==0.4.13
grpcio==1.10.0
h5py==2.7.1
html5lib==0.9999999
Keras==2.1.2
kiwisolver==1.0.1
Markdown==2.6.11
matplotlib==2.2.2
numpy==1.13.3
opencv-python==3.4.0.12
patool==1.12
Pillow==5.1.0
protobuf==3.5.2.post1
pyparsing==2.2.0
python-dateutil==2.7.2
pytz==2018.3
PyYAML==3.12
scikit-learn==0.19.1
scipy==1.0.1
six==1.11.0
tensorflow==1.4.0
tensorflow-tensorboard==0.4.0
termcolor==1.1.0
Werkzeug==0.14.1
C:\Users\lekshmy\Desktop\PSAV\keras-video-classifier-master\demo>python vgg16_bidirectional_lstm_train.py
Using TensorFlow backend.
2018-05-03 09:16:31.439479: I C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
max frames: 2224
expected frames: 112
{'Abuse': 0, 'Explosion': 1, 'Fighting': 2, 'RoadAccidents': 3, 'Shooting': 4, 'Shoplifting': 5}
Epoch 1/20
Traceback (most recent call last):
File "vgg16_bidirectional_lstm_train.py", line 38, in
The error looks like the input to the LSTM is wrong. In Keras (as in most recurrent network implementations), the LSTM input should have shape (batch_size, sequence_length, input_size), where input_size is the dimension of the image feature extracted by VGG16 and sequence_length is the expected number of frames. However, your code seems to indicate that the input to the LSTM is 2-D rather than 3-D. Without actual samples of your training input files, I cannot analyze this further. Please see whether this information helps, or provide a way for me to test your training input.
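To make the shape requirement concrete, here is a minimal sketch of a check you could run on a batch before passing it to the model. `check_lstm_input_shape` is a hypothetical helper, not part of this project. The value 125 is the "expected frames" figure from the log above; the feature dimension 25088 is an assumption (the flattened 7x7x512 output of VGG16 without the top on 224x224 input) and may differ from what the feature extractor here actually produces.

```python
def check_lstm_input_shape(shape, expected_frames, feature_dim):
    """Return a list of problems with a batch shape intended for an LSTM layer.

    `shape` is a tuple such as the `.shape` of a NumPy array. An empty list
    means it matches (batch_size, expected_frames, feature_dim).
    """
    shape = tuple(shape)
    if len(shape) != 3:
        # A 2-D batch is the situation described above.
        return [
            "expected 3 dimensions (batch_size, sequence_length, input_size), "
            "got %d: %r" % (len(shape), shape)
        ]
    problems = []
    _, sequence_length, input_size = shape
    if sequence_length != expected_frames:
        problems.append(
            "sequence_length is %d, expected %d frames"
            % (sequence_length, expected_frames)
        )
    if input_size != feature_dim:
        problems.append(
            "input_size is %d, expected feature dimension %d"
            % (input_size, feature_dim)
        )
    return problems


EXPECTED_FRAMES = 125  # "expected frames" value printed in the log above
FEATURE_DIM = 25088    # assumption: flattened VGG16 (no top) output, 7 * 7 * 512

# A well-formed batch of 64 videos produces no complaints.
print(check_lstm_input_shape((64, 125, 25088), EXPECTED_FRAMES, FEATURE_DIM))

# A 2-D batch is reported as having the wrong number of dimensions.
for problem in check_lstm_input_shape((64, 25088), EXPECTED_FRAMES, FEATURE_DIM):
    print(problem)
```

With NumPy arrays you would call it as `check_lstm_input_shape(x_batch.shape, ...)`, where `x_batch` stands for whatever array your generator yields.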
Would sharing some of the numpy files obtained after feature extraction help?
Hi @chen0040, what test accuracy did you obtain? I tried to test the models. This is the output I am getting while testing the LSTM and bidirectional LSTM models.
('Extracting frames from video: ', './very_large_data/UCF-101/CricketShot/v_CricketShot_g10_c06.avi') predicted: Archery actual: CricketShot ('Extracting frames from video: ', './very_large_data/UCF-101/PushUps/v_PushUps_g02_c01.avi') predicted: BabyCrawling actual: PushUps ('Extracting frames from video: ', './very_large_data/UCF-101/BaseballPitch/v_BaseballPitch_g03_c03.avi') predicted: BaseballPitch actual: BaseballPitch ('Extracting frames from video: ', './very_large_data/UCF-101/SumoWrestling/v_SumoWrestling_g23_c03.avi') predicted: Billiards actual: SumoWrestling ('Extracting frames from video: ', './very_large_data/UCF-101/BaseballPitch/v_BaseballPitch_g22_c04.avi') predicted: BaseballPitch actual: BaseballPitch ('Extracting frames from video: ', './very_large_data/UCF-101/LongJump/v_LongJump_g04_c01.avi') predicted: BaseballPitch actual: LongJump ('Extracting frames from video: ', './very_large_data/UCF-101/SumoWrestling/v_SumoWrestling_g05_c04.avi') predicted: Basketball actual: SumoWrestling ('Extracting frames from video: ', './very_large_data/UCF-101/JumpingJack/v_JumpingJack_g05_c06.avi') predicted: BoxingPunchingBag actual: JumpingJack ('Extracting frames from video: ', './very_large_data/UCF-101/ParallelBars/v_ParallelBars_g15_c04.avi') predicted: BalanceBeam actual: ParallelBars ('Extracting frames from video: ', './very_large_data/UCF-101/TrampolineJumping/v_TrampolineJumping_g01_c03.avi') predicted: Basketball actual: TrampolineJumping ('Extracting frames from video: ', './very_large_data/UCF-101/BodyWeightSquats/v_BodyWeightSquats_g23_c02.avi') predicted: BodyWeightSquats actual: BodyWeightSquats ('Extracting frames from video: ', './very_large_data/UCF-101/Diving/v_Diving_g09_c07.avi') predicted: BaseballPitch actual: Diving ('Extracting frames from video: ', './very_large_data/UCF-101/Diving/v_Diving_g20_c03.avi') predicted: BalanceBeam actual: Diving ('Extracting frames from video: ', 
'./very_large_data/UCF-101/JumpingJack/v_JumpingJack_g21_c01.avi') predicted: BaseballPitch actual: JumpingJack ('Extracting frames from video: ', './very_large_data/UCF-101/LongJump/v_LongJump_g03_c02.avi')