Closed. naykun closed this issue 5 years ago.
Hello, thank you for your words of appreciation. Yes, I can share code for those feature extraction steps, but I'll need a few days as I'm traveling now.
For the moment:
- Concerning TSN features: we use the 1024-dimensional features right after global average pooling in BNInception. That is, we get rid of the last FC layer which computes class scores (see the first sketch below).
- Concerning Faster R-CNN: we use it to detect bounding boxes; then, for each frame, we discard the bounding box coordinates and accumulate the detection scores of each object class, thus obtaining a 352-dimensional representation. For instance, if the detector has detected 3 objects of class 8 with scores 0.1, 0.6 and 0.2, the 8th unit of the representation will contain the number 0.9 (see the second sketch below).
I'll share code snippets to obtain such results soon.
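As a purely illustrative sketch (not the repository's code; the FEATEXT example scripts linked below are the reference), extracting the 1024-dimensional globally averaged features from a BNInception backbone, assuming the definition from the pretrainedmodels package, could look like this. Loading the trained TSN checkpoint and the exact preprocessing are omitted here.

```python
import torch
import torch.nn.functional as F
import pretrainedmodels  # https://github.com/Cadene/pretrained-models.pytorch

# Build the BNInception backbone (the trained TSN weights would be loaded on top of this).
model = pretrainedmodels.bninception(num_classes=1000, pretrained=None)
model.eval()

# `x` stands in for a batch of preprocessed frames, shape (B, 3, 224, 224).
x = torch.randn(1, 3, 224, 224)

with torch.no_grad():
    fmap = model.features(x)               # conv feature map, (B, 1024, H, W)
    feat = F.adaptive_avg_pool2d(fmap, 1)  # global average pooling
    feat = feat.flatten(1)                 # (B, 1024) frame descriptor, FC layer discarded
```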
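And here is a toy sketch of the object-score accumulation described in the second bullet. The input format (a list of (class_id, score) pairs per frame) is an assumption made purely for illustration.

```python
import numpy as np

NUM_CLASSES = 352  # size of the object vocabulary, as described above

def frame_object_vector(detections):
    """Accumulate detection scores per class into a 352-d frame descriptor.

    `detections` is assumed to be a list of (class_id, score) pairs, with
    class ids in [0, NUM_CLASSES); bounding-box coordinates are discarded.
    """
    vec = np.zeros(NUM_CLASSES, dtype=np.float32)
    for class_id, score in detections:
        vec[class_id] += score
    return vec

# Example from the comment above: three detections of class 8 -> unit 8 holds 0.9.
print(frame_object_vector([(8, 0.1), (8, 0.6), (8, 0.2)])[8])  # ~0.9 (up to float rounding)
```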
Much appreciated!
I finally added a few example scripts on feature extraction in https://github.com/fpv-iplab/rulstm/commit/cbdb3dd47ed0ef04186d7911b97f2adbd9542063. Hope this helps :)
Thanks a lot!
Hello, I would like to know how to train TSN on this problem to obtain "TSN-flow.pth.tar" and "TSN-rgb.pth.tar". In your extraction code I can only find the forward pass used to compute the features. Could you describe the training process, such as the inputs and outputs? Thanks a lot!
Hello, we trained TSN using the PyTorch implementation provided by the authors, which can be found here: https://github.com/yjxiong/tsn-pytorch.
I see the authors released a new toolbox here https://github.com/open-mmlab/mmaction and suggest switching to it. However, I'm not confident with the latter and I don't know whether its output is compatible with the rest of the code here.
Using the code in https://github.com/yjxiong/tsn-pytorch to train on your own dataset should require formatting the data as suggested by the authors (maybe have a look at the original Caffe implementation here: https://github.com/yjxiong/temporal-segment-networks).
After training, you should get the checkpoints for the RGB and Flow branches, which you should be able to use to extract features.
Best, Antonino
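In case it helps, here is a rough sketch of generating the kind of split file the tsn-pytorch loader expects. To the best of my understanding each line is "<frame_folder> <num_frames> <label>", but please double-check against the tsn-pytorch / temporal-segment-networks documentation; the folder names and labels below are purely hypothetical.

```python
# Write a tsn-pytorch style split file: one line per clip.
samples = [
    ('frames/P01_01_0001', 45, 7),    # (frame folder, number of frames, action label)
    ('frames/P01_01_0002', 60, 132),
]

with open('train_split.txt', 'w') as f:
    for folder, num_frames, label in samples:
        f.write('{} {} {}\n'.format(folder, num_frames, label))
```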
OK! I'll give it a try. I really appreciate it.
Hello, I have a question about running the code in "Faster Rcnn -> detect_video.py". I found that cv2.VideoCapture doesn't work very well: the number of frames it reads is usually less than the total number of frames. Did you run into this problem in your experiments?
Hello, I suppose this depends on the format you are using. In my case, I don't recall this happening. However, I had previously re-encoded all videos at a fixed framerate of 30fps using ffmpeg. I used this command:
ffmpeg -i input.mp4 -c:v libx264 -crf 22 -r 30 -vsync cfr -an output.mp4
Could you try that and see if this solves the issue?
Otherwise, you could extract all frames to jpgs and hack the detect_video.py file to read frames from jpeg files.
Antonino
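If re-encoding is not an option, the jpeg route could look roughly like the sketch below; the frame naming pattern is an assumption, and detect_video.py would need to consume these frames in place of cv2.VideoCapture.

```python
import cv2
import glob

# Assumed naming pattern: frames dumped as frame_0000001.jpg, frame_0000002.jpg, ...
frame_paths = sorted(glob.glob('frames/frame_*.jpg'))

for path in frame_paths:
    frame = cv2.imread(path)  # BGR image, same layout as VideoCapture.read() would return
    if frame is None:
        continue
    # ... run the detector on `frame` here, as detect_video.py does on decoded frames ...
```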
Cool! The ffmpeg command was the perfect solution. Thanks a lot!
Hello, I am confused about how to get a subset of frames at about 4fps, as mentioned in the README. Since you converted all videos to a fixed framerate of 30fps, I don't know how to resample the frames to get such a subset. Could you share some details? Thank you!
Hello, depending on the parameters used for training/testing, the model will only look at a subset of the images. To avoid forcing people to download the full set of features, we provided only a subset of them. This was done by copying only the needed features and skipping the others, but it is not a required step to make the system work.
In general, it is sufficient to extract all features from the dataset and skip the subset part. This will also provide more flexibility to tune the parameters (encoding/anticipation steps).
Antonino
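Just to make the "~4 fps subset" concrete, here is an illustrative way to pick such a subset of frame indices from a 30 fps video. This is not code from the repository, and in practice extracting features for all frames, as suggested above, is simpler.

```python
import numpy as np

fps_in, fps_out = 30.0, 4.0
num_frames = 900  # e.g. a 30-second clip at 30 fps

# Nearest 30 fps frame for each 4 fps timestamp (1-based frame numbering assumed).
timestamps = np.arange(0, num_frames / fps_in, 1.0 / fps_out)
subset = np.unique(np.round(timestamps * fps_in).astype(int) + 1)
```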
OK! So, if I understand correctly, you extract features for every frame offline, but you only store the needed features in the LMDB file to match the dataset building in dataset.py?
That's correct: I just stored the needed features and I did not modify dataset.py.
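For completeness, a bare-bones sketch of dumping per-frame features into an LMDB file; the key format used here is hypothetical, so it should be adapted to whatever dataset.py actually expects.

```python
import lmdb
import numpy as np

env = lmdb.open('rgb_features_lmdb', map_size=1 << 40)  # generous map size

with env.begin(write=True) as txn:
    # `features` maps a hypothetical "<video_id>_frame_<n>" key to a 1024-d descriptor.
    features = {'P01_01_frame_0000000031': np.random.rand(1024).astype(np.float32)}
    for key, feat in features.items():
        txn.put(key.encode('utf-8'), feat.tobytes())

env.close()
```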
OK! Thanks a lot!
Hi, can you please tell me which configuration you are using for TSN? I'm using the model from https://github.com/yjxiong/tsn-pytorch with num_class = 2513, mode='RGB', num_segment = 1 and base_model='BNInception'. I'm able to load the pretrained weights you provided with this config; however, I get the following error.
RuntimeError: size mismatch, m1: [1 x 16384], m2: [1024 x 2513] at /opt/conda/conda-bld/pytorch_1573049301898/work/aten/src/TH/generic/THTensorMath.cpp:197
Thanks!
Note: I had to change the input size to 224x224 and it resolved the issue. Is this correct?
Hi,
While we started from the PyTorch TSN implementation you mentioned, we ended up modifying it heavily, hence the checkpoints might not be 100% compatible. One of the changes was the adoption of an updated model definition for the BNInception backbone from the pretrainedmodels Python package (https://github.com/Cadene/pretrained-models.pytorch).
You can find an example of how the model can be used for feature extraction here: https://github.com/fpv-iplab/rulstm/blob/master/FEATEXT/extract_example_rgb.py
You should be able to load the provided checkpoint if you define the backbone as shown in the example above.
Best, Antonino
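As a rough illustration of that last point (the linked extract_example_rgb.py remains the authoritative reference), defining the backbone via pretrainedmodels and loading the released weights could look like the sketch below; the handling of the checkpoint layout is an assumption.

```python
import torch
import torch.nn as nn
import pretrainedmodels

# BNInception backbone as defined in the pretrainedmodels package.
model = pretrainedmodels.bninception(num_classes=1000, pretrained=None)
model.last_linear = nn.Linear(1024, 2513)  # 2513 action classes, as in the question above

checkpoint = torch.load('TSN-rgb.pth.tar', map_location='cpu')
state_dict = checkpoint.get('state_dict', checkpoint)  # checkpoint layout is an assumption
model.load_state_dict(state_dict, strict=False)
model.eval()
```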
Thank you! It works.
Thanks for your great work! I'm trying to apply your method to another dataset, so I need to generate those three types of features on my own. Here are my questions: