vijayvee / video-captioning

This repository contains the code for a video captioning system inspired by Sequence to Sequence -- Video to Text. This system takes as input a video and generates a caption in English describing the video.
MIT License

Aborted Error #9

Closed sureshkumar96 closed 6 years ago

sureshkumar96 commented 6 years ago

When I run the extract_feats.py script, I get the following error. Please help me resolve it:

I0906 07:10:07.085180 26398 layer_factory.hpp:77] Creating layer input
I0906 07:10:07.085196 26398 net.cpp:84] Creating Layer input
I0906 07:10:07.085204 26398 net.cpp:380] input -> data
I0906 07:10:07.085224 26398 net.cpp:122] Setting up input
I0906 07:10:07.085232 26398 net.cpp:129] Top shape: 10 3 224 224 (1505280)
I0906 07:10:07.085237 26398 net.cpp:137] Memory required for data: 6021120
I0906 07:10:07.085242 26398 layer_factory.hpp:77] Creating layer conv1_1
I0906 07:10:07.085253 26398 net.cpp:84] Creating Layer conv1_1
I0906 07:10:07.085258 26398 net.cpp:406] conv1_1 <- data
I0906 07:10:07.085263 26398 net.cpp:380] conv1_1 -> conv1_1
F0906 07:10:07.096971 26398 cudnn_conv_layer.cpp:52] Check failed: error == cudaSuccess (30 vs. 0) unknown error
Check failure stack trace:
Aborted (core dumped)

vijayvee commented 6 years ago

This is a CUDA error, not a problem with the video-captioning code. Googling the error led me here: https://github.com/NVIDIA/DIGITS/issues/1663

sureshkumar96 commented 6 years ago

Sir, how much time does it take to extract features for all the videos? And are the features computed for only 80 frames (sampled every 10th frame of all the frames)?




vijayvee commented 6 years ago

(a) https://github.com/jcjohnson/cnn-benchmarks has benchmarks for various state-of-the-art CNNs, including VGG-16, which I used for extracting my video features.

The benchmarks say VGG-16 processes a minibatch of 16 frames (of size 224x224) in 128 ms. We have 5 minibatches per video (80 = 16 * 5), so each video should take about 650 ms on a GTX 1080. There were 1970 videos in the dataset, which comes to roughly 20 minutes for the whole MSVD dataset I used.
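The estimate above is simple arithmetic, sketched below with the numbers from this thread (the 128 ms figure comes from the benchmark page):

```python
# Back-of-the-envelope runtime estimate for VGG-16 feature extraction on MSVD.
ms_per_minibatch = 128                  # VGG-16 forward pass, 16 frames @ 224x224, GTX 1080
frames_per_video = 80
minibatches_per_video = frames_per_video // 16           # 5 minibatches per video
ms_per_video = minibatches_per_video * ms_per_minibatch  # 640 ms, i.e. ~650 ms
num_videos = 1970
total_minutes = num_videos * ms_per_video / 1000 / 60
print(f"{total_minutes:.1f} minutes")                    # ~21 minutes for the full dataset
```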

The running time won't be exact, but it will be in this ballpark: the preprocessing could differ, and the benchmarks were implemented in Torch while I use Caffe for feature extraction.

(b) I computed features for 80 evenly spaced frames across each video (see np.linspace() for how the spacing works).
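A minimal sketch of the evenly spaced sampling; the helper name `sample_frame_indices` is illustrative, not the repository's actual function:

```python
import numpy as np

def sample_frame_indices(n_total_frames, n_samples=80):
    """Return n_samples evenly spaced frame indices covering the whole video."""
    return np.linspace(0, n_total_frames - 1, n_samples).astype(int)

# e.g. for a 400-frame video, pick 8 evenly spaced frames:
print(sample_frame_indices(400, n_samples=8))  # indices 0, 57, 114, ..., 399
```

Because np.linspace includes both endpoints, the first and last frames of the video are always among the sampled frames, regardless of the video's length.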

sureshkumar96 commented 6 years ago

Sir, then what is the sampling of 1 frame per every 10 frames?


-- Regards, H.Suresh Kumar, Biological Sciences, IIT Madras, Chennai-600036, 9962427931.

sureshkumar96 commented 6 years ago


Sir, then what is the sampling of 1 frame per every 10 frames? Please find attached a screenshot from the reference paper that uses it.

