Open NaeemKhan333 opened 4 years ago
Hi all, I also want to train SlowFast on my own dataset. The structure of my dataset is the same as @NaeemKhan333's, but the frames are already extracted. Hope to hear from you soon.
Hey, I am also looking for instructions to train on a custom dataset. My custom dataset contains only videos in a directory. Kindly provide documentation for training on a custom dataset.
Thanks!
I would also appreciate further documentation; I can't find a good guide yet on how to train and test with my own videos using a pretrained model.
Hi @NaeemKhan333, @takatosp1 Any updates on this issue? I am also stuck at dataset preparation. Any progress will help me a lot! Thanks
Has anyone had any progress with this?
Yes! Let's say your dataset is named MyData. You should structure your files and folders like this:
SlowFast/
├── configs/
│   └── MyData/
│       └── I3D_8x8_R50.yaml
├── data/
│   └── MyData/
│       ├── ClassA/
│       │   └── ins.mp4
│       ├── ClassB/
│       │   └── kep.mp4
│       ├── ClassC/
│       │   └── tak.mp4
│       ├── train.csv
│       ├── test.csv
│       ├── val.csv
│       └── classids.json
├── slowfast/
│   └── datasets/
│       ├── __init__.py
│       ├── mydata.py
│       └── ...
└── ...
As you can see, you will need to create a config file (.yaml), three files for the dataset split (.csv), a file for referencing classes (.json) and a file for dataset parsing (mydata.py).
To create the Python file mydata.py, duplicate kinetics.py, which is located in the same folder, rename the copy to mydata.py and replace all occurrences of Kinetics with Mydata (search and replace, case-sensitive).
Once created, you will also need to import the new Python file into the project by adding the line from .mydata import Mydata to the __init__.py file in the same folder.
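The copy-and-rename step above can be scripted. The following is only a sketch: `make_dataset_module` is a hypothetical helper name (not part of SlowFast), and it assumes the only change needed is the case-sensitive replacement of `Kinetics` with `Mydata` described in the guide.

```python
from pathlib import Path


def make_dataset_module(datasets_dir, src="kinetics.py", dst="mydata.py",
                        old="Kinetics", new="Mydata"):
    """Copy src to dst inside datasets_dir, replacing `old` with `new` (case-sensitive)."""
    datasets_dir = Path(datasets_dir)
    text = (datasets_dir / src).read_text()
    (datasets_dir / dst).write_text(text.replace(old, new))

    # Register the new module in __init__.py unless the import is already there.
    init_file = datasets_dir / "__init__.py"
    import_line = f"from .{Path(dst).stem} import {new}"
    init_text = init_file.read_text() if init_file.exists() else ""
    if import_line not in init_text.splitlines():
        if init_text and not init_text.endswith("\n"):
            init_text += "\n"
        init_file.write_text(init_text + import_line + "\n")
    return import_line
```

Calling `make_dataset_module("SlowFast/slowfast/datasets")` would create mydata.py next to kinetics.py and append the import line once; running it a second time leaves __init__.py unchanged.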
The classids.json file contains a mapping of class names to ids. It will look like the following:
{"ClassA": 0, "ClassB": 1, "ClassC": 2}
The .csv files define which videos are used for training, validation and inference testing, and which class each video belongs to.
They should look like the following:
/SlowFast/data/MyData/ClassA/ins.mp4 0
/SlowFast/data/MyData/ClassC/tak.mp4 2
For larger datasets, it will probably be easiest to create these files with an automated script that parses classids.json and the folder structure.
Note that the three files should not share any identical lines (that is, no video should appear in more than one split), and pay attention to your actual paths (absolute or relative paths can be used).
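The automated script mentioned above could look roughly like this. It is a sketch under assumptions: `build_splits` is a hypothetical name, the 80/10/10 ratio and the `*.mp4` glob are arbitrary choices, and the separator should match whatever `DATA.PATH_LABEL_SEPARATOR` is set to in your config. Slicing one shuffled list guarantees that no video ends up in two splits.

```python
import json
import random
from pathlib import Path


def build_splits(data_dir, ratios=(0.8, 0.1, 0.1), seed=0, sep=" "):
    """Write train/val/test CSVs with one "<video path><sep><class id>" line per video."""
    data_dir = Path(data_dir)
    class_ids = json.loads((data_dir / "classids.json").read_text())

    # Collect every video found in the per-class folders.
    lines = []
    for class_name, class_id in class_ids.items():
        for video in sorted((data_dir / class_name).glob("*.mp4")):
            lines.append(f"{video}{sep}{class_id}")

    # Shuffle once, then cut the list into three non-overlapping slices.
    random.Random(seed).shuffle(lines)
    n_train = int(len(lines) * ratios[0])
    n_val = int(len(lines) * ratios[1])
    splits = {
        "train": lines[:n_train],
        "val": lines[n_train:n_train + n_val],
        "test": lines[n_train + n_val:],
    }
    for name, rows in splits.items():
        (data_dir / f"{name}.csv").write_text("".join(row + "\n" for row in rows))
    return splits
```

The shuffle is not stratified, so very small classes may be missing from a split. Paths that contain spaces would also clash with a space separator; in that case a different separator is the safer choice.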
Last but not least, you will need to create a configuration file to train or test the network.
You can copy an existing one from the SlowFast/configs/Kinetics folder, e.g. I3D_8x8_R50.yaml.
In the copied file, replace all occurrences of kinetics with mydata (case-sensitive).
You can run SlowFast with the new config that references your own dataset by running
python /SlowFast/tools/run_net.py --cfg /SlowFast/configs/MyData/I3D_8x8_R50.yaml
Note that you might need to adjust the paths to your actual working directory.
I hope this guide will help you guys using SlowFast with your own datasets :)
@AlexanderMelde thank you so much! This is beyond helpful
@AlexanderMelde Hi, I basically followed your instructions but encountered abnormal logging; you can look at this issue. Could you please give me some tips?
Then I want to run the demo with my trained model. Do I just need to set CHECKPOINT_FILE_PATH to the path of the trained model in SLOWFAST_8x8_R50.yaml? I see its original format is .pkl, but the format of my trained model is .pyth, so can I use a .pyth file for CHECKPOINT_FILE_PATH?
"do i just need to configure CHECKPOINT_FILE_PATH with the path of trained model in the SLOWFAST_8x8_R50.yaml?"
Yes, but instead you could also set TRAIN.AUTO_RESUME = True.
"but i find its original format is .pkl, but the format of trained model is .pyth, so can i use .pyth for the CHECKPOINT_FILE_PATH?"
Yes, you can use both formats, just set TRAIN.CHECKPOINT_TYPE = "pytorch".
@AlexanderMelde My case:
I added four print statements
print("==============!!!!!!!!!!!!!!!!!!!!!!~~~~~~~~~~~~~~~~~")
print("preds is {}".format(preds.tolist()))
print("labels is {}".format(labels.tolist()))
num_topks_correct = metrics.topks_correct(preds, labels, (1))
print("preds.size(0) is {}".format(preds.size(0)))
print("num_topks_correct is {}".format(num_topks_correct))
top1_err= [(1.0 - x / preds.size(0)) * 100.0 for x in num_topks_correct][0]
inside train_epoch() to check preds and labels. Part of the output is:
[11/17 15:35:50][INFO] train_net.py: 419: Start epoch: 2
==============!!!!!!!!!!!!!!!!!!!!!!~~~~~~~~~~~~~~~~~
preds is [[-0.03573627024888992], [-0.32597339153289795]]
labels is [0, 0]
preds.size(0) is 2
num_topks_correct is [tensor(2., device='cuda:0')]
==============!!!!!!!!!!!!!!!!!!!!!!~~~~~~~~~~~~~~~~~
preds is [[-0.30879950523376465], [-0.02247714065015316]]
labels is [0, 0]
preds.size(0) is 2
num_topks_correct is [tensor(2., device='cuda:0')]
==============!!!!!!!!!!!!!!!!!!!!!!~~~~~~~~~~~~~~~~~
preds is [[0.05403393507003784], [-0.18906450271606445]]
labels is [0, 0]
preds.size(0) is 2
num_topks_correct is [tensor(2., device='cuda:0')]
==============!!!!!!!!!!!!!!!!!!!!!!~~~~~~~~~~~~~~~~~
preds is [[-0.18617770075798035], [-0.16703137755393982]]
labels is [0, 0]
preds.size(0) is 2
num_topks_correct is [tensor(2., device='cuda:0')]
==============!!!!!!!!!!!!!!!!!!!!!!~~~~~~~~~~~~~~~~~
preds is [[0.10825307667255402], [-0.292312890291214]]
labels is [0, 0]
preds.size(0) is 2
num_topks_correct is [tensor(2., device='cuda:0')]
==============!!!!!!!!!!!!!!!!!!!!!!~~~~~~~~~~~~~~~~~
preds is [[-0.0778438001871109], [-0.07582096755504608]]
labels is [0, 0]
preds.size(0) is 2
num_topks_correct is [tensor(2., device='cuda:0')]
==============!!!!!!!!!!!!!!!!!!!!!!~~~~~~~~~~~~~~~~~
preds is [[-0.30321329832077026], [-0.11427342891693115]]
labels is [0, 0]
preds.size(0) is 2
num_topks_correct is [tensor(2., device='cuda:0')]
==============!!!!!!!!!!!!!!!!!!!!!!~~~~~~~~~~~~~~~~~
preds is [[-0.09245844930410385], [-0.25378167629241943]]
labels is [0, 0]
preds.size(0) is 2
num_topks_correct is [tensor(2., device='cuda:0')]
==============!!!!!!!!!!!!!!!!!!!!!!~~~~~~~~~~~~~~~~~
preds is [[-0.2726392149925232], [0.0589011125266552]]
labels is [0, 0]
preds.size(0) is 2
num_topks_correct is [tensor(2., device='cuda:0')]
==============!!!!!!!!!!!!!!!!!!!!!!~~~~~~~~~~~~~~~~~
preds is [[-0.07824113965034485], [-0.22474029660224915]]
labels is [0, 0]
preds.size(0) is 2
num_topks_correct is [tensor(2., device='cuda:0')]
[11/17 15:35:59][INFO] logging.py: 96: json_stats: {"_type": "train_iter", "dt": 0.75530, "dt_data": 0.00388, "dt_net": 0.75142, "epoch": "2/10", "eta": "0:01:20", "gpu_mem": "2.78G", "iter": "10/13", "loss": 0.00000, "lr": 0.00972, "top1_err": 0.00000}
The predictions are negative, is that normal? I didn't enable DETECTION in the yaml.
My yaml is:
TRAIN:
  ENABLE: True
  DATASET: mydata
  BATCH_SIZE: 2
  EVAL_PERIOD: 10
  CHECKPOINT_FILE_PATH: "./demo/Kinetics/SLOWFAST_8x8_R50.pkl"
  CHECKPOINT_TYPE: caffe2
  CHECKPOINT_PERIOD: 1
  AUTO_RESUME: True
DATA:
  NUM_FRAMES: 32
  SAMPLING_RATE: 2
  TRAIN_JITTER_SCALES: [256, 320]
  TRAIN_CROP_SIZE: 224
  TEST_CROP_SIZE: 256
  INPUT_CHANNEL_NUM: [3, 3]
  PATH_TO_DATA_DIR: "/media/weidawang/DATA/dataset/HMDB51/hmdb51_org/fall_floor"
  PATH_LABEL_SEPARATOR: ","
SLOWFAST:
  ALPHA: 4
  BETA_INV: 8
  FUSION_CONV_CHANNEL_RATIO: 2
  FUSION_KERNEL_SZ: 7
RESNET:
  ZERO_INIT_FINAL_BN: True
  WIDTH_PER_GROUP: 64
  NUM_GROUPS: 1
  DEPTH: 50
  TRANS_FUNC: bottleneck_transform
  STRIDE_1X1: False
  NUM_BLOCK_TEMP_KERNEL: [[3, 3], [4, 4], [6, 6], [3, 3]]
  SPATIAL_STRIDES: [[1, 1], [2, 2], [2, 2], [2, 2]]
  SPATIAL_DILATIONS: [[1, 1], [1, 1], [1, 1], [1, 1]]
NONLOCAL:
  LOCATION: [[[], []], [[], []], [[], []], [[], []]]
  GROUP: [[1, 1], [1, 1], [1, 1], [1, 1]]
  INSTANTIATION: dot_product
BN:
  USE_PRECISE_STATS: True
  NUM_BATCHES_PRECISE: 200
SOLVER:
  BASE_LR: 0.0125
  LR_POLICY: cosine
  MAX_EPOCH: 10
  MOMENTUM: 0.9
  WEIGHT_DECAY: 1e-4
  WARMUP_EPOCHS: 34.0
  WARMUP_START_LR: 0.01
  OPTIMIZING_METHOD: sgd
MODEL:
  NUM_CLASSES: 1
  ARCH: slowfast
  MODEL_NAME: SlowFast
  LOSS_FUNC: cross_entropy
  DROPOUT_RATE: 0.5
TEST:
  ENABLE: True
  DATASET: mydata
  BATCH_SIZE: 2
DATA_LOADER:
  NUM_WORKERS: 8
  PIN_MEMORY: True
NUM_GPUS: 1
NUM_SHARDS: 1
RNG_SEED: 0
OUTPUT_DIR: .
- only one action class : "fall_down"
MODEL: NUM_CLASSES: 1
I think you need to use at least two classes. Try creating a second one called "neutral", "other" or something similar, with videos of actions other than falling down. Otherwise the classifier always has to choose "fall_down" as its prediction, and all evaluation / accuracy metrics etc. (needed during training) will fail or not be representative.
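A small worked example may make this concrete. Assuming the loss is a softmax cross-entropy (the config above sets LOSS_FUNC: cross_entropy) and that the printed preds are raw scores, a model with a single output gives a loss of exactly 0 whatever the score is, negative or not. The sketch below uses plain Python instead of PyTorch, and the score values are taken loosely from the log output above.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[label]


# One class: softmax of a single score is always 1, so the loss is always 0.
single = [softmax_cross_entropy([z], 0) for z in (-0.3259, -0.0357, 0.0540)]

# Two classes: the loss now depends on the scores and can drive training.
double = softmax_cross_entropy([-0.3259, 0.0540], 0)

print(single, double)
```

With one class there is nothing to learn from, which would be consistent with the constant "loss": 0.00000 and "top1_err": 0.00000 in the log.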
@AlexanderMelde Thank you! You are right. I have now increased the number of action classes to 6 and am retraining; "top1_err" and "top5_err" are non-zero now. I am also curious about "dt": 0.74047, "dt_data": 0.01799, "dt_net": 0.72248. Could you please tell me what they are?
@wwdok As a general suggestion: you can find out meanings like this one by searching inside the repository. Searching for dt_data leads to the code line "dt_data": self.data_timer.seconds(), so I assume this is a timer, probably describing the time needed for data parsing etc. You could look further into the source code and the usage of data_timer if you are interested.
@AlexanderMelde Looking back at that question, it really feels stupid :). I usually search for keywords in this repository, but yesterday I didn't... After a rough look, I guess dt_data means the number of seconds it takes to process the data, dt_net means the number of seconds it takes to process the network, and dt means the number of seconds a whole iteration takes; it equals dt_data + dt_net.
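The guess that dt equals dt_data + dt_net can be checked against the two log lines quoted in this thread. This is just arithmetic on the logged numbers, not a statement about how SlowFast computes them internally.

```python
import math

# (dt, dt_data, dt_net) copied from the two log lines quoted in this thread.
samples = [
    (0.75530, 0.00388, 0.75142),
    (0.74047, 0.01799, 0.72248),
]

# Allow for the rounding to five decimals in the log output.
checks = [
    math.isclose(dt, dt_data + dt_net, abs_tol=1e-5)
    for dt, dt_data, dt_net in samples
]
print(checks)  # [True, True]
```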
Final question, could you please help to look at this issue ?
'nvidia-smi' is not recognized as an internal or external command,
operable program or batch file.
[12/18 22:30:01][INFO] mydata.py: 73: Constructing mydata train...
Traceback (most recent call last):
  File "./tools/run_net.py", line 42, in <module>
    main()
  File "./tools/run_net.py", line 23, in main
    launch_job(cfg=cfg, init_method=args.init_method, func=train)
  File "c:\users\user\desktop\video search\pyslowfast\slowfast\slowfast\utils\misc.py", line 297, in launch_job
    func(cfg=cfg)
  File "C:\Users\User\Desktop\video search\pyslowfast\slowfast\tools\train_net.py", line 392, in train
    train_loader = loader.construct_loader(cfg, "train")
  File "c:\users\user\desktop\video search\pyslowfast\slowfast\slowfast\datasets\loader.py", line 83, in construct_loader
    dataset = build_dataset(dataset_name, cfg, split)
  File "c:\users\user\desktop\video search\pyslowfast\slowfast\slowfast\datasets\build.py", line 31, in build_dataset
    return DATASET_REGISTRY.get(name)(cfg, split)
  File "c:\users\user\desktop\video search\pyslowfast\slowfast\slowfast\datasets\mydata.py", line 74, in __init__
    self._construct_loader()
  File "c:\users\user\desktop\video search\pyslowfast\slowfast\slowfast\datasets\mydata.py", line 93, in _construct_loader
    len(path_label.split(self.cfg.DATA.PATH_LABEL_SEPAR-ATOR))
  File "C:\Users\User\AppData\Local\Programs\Python\Python37\lib\site-packages\yacs\config.py", line 141, in __getattr__
    raise AttributeError(name)
AttributeError: PATH_LABEL_SEPAR
Can anyone help me? I also followed the method above for mydata.py.
This means you have an issue with your CUDA driver installation. This is unrelated to SlowFast.
Does your computer have an Nvidia GPU?
No. Do I need a GPU to run SlowFast?
I haven't tried to, but it seems like you can run a pre-trained model on a CPU. See #103
Okay, Thank you
[12/19 12:28:53][INFO] misc.py: 170: Params: 33,649,098
[12/19 12:28:53][INFO] misc.py: 171: Mem: 0.1263117790222168 MB
[12/19 12:28:53][WARNING] flop_count.py: 65: Skipped operation aten::batch_norm 110 time(s)
[12/19 12:28:53][WARNING] flop_count.py: 65: Skipped operation aten::max_pool3d 4 time(s)
[12/19 12:28:53][WARNING] flop_count.py: 65: Skipped operation aten::add 32 time(s)
[12/19 12:28:53][WARNING] flop_count.py: 65: Skipped operation aten::avg_pool3d 2 time(s)
[12/19 12:28:53][WARNING] flop_count.py: 65: Skipped operation aten::add_ 1 time(s)
[12/19 12:28:53][WARNING] flop_count.py: 65: Skipped operation aten::softmax 1 time(s)
[12/19 12:28:53][WARNING] flop_count.py: 65: Skipped operation aten::mean 1 time(s)
[12/19 12:28:53][INFO] misc.py: 174: Flops: 50.307666432 G
[12/19 12:28:53][WARNING] activation_count.py: 56: Skipped operation aten::batch_norm 110 time(s)
[12/19 12:28:53][WARNING] activation_count.py: 56: Skipped operation aten::max_pool3d 4 time(s)
[12/19 12:28:53][WARNING] activation_count.py: 56: Skipped operation aten::add 32 time(s)
[12/19 12:28:53][WARNING] activation_count.py: 56: Skipped operation aten::avg_pool3d 2 time(s)
[12/19 12:28:53][WARNING] activation_count.py: 56: Skipped operation aten::matmul 1 time(s)
[12/19 12:28:53][WARNING] activation_count.py: 56: Skipped operation aten::add_ 1 time(s)
[12/19 12:28:53][WARNING] activation_count.py: 56: Skipped operation aten::softmax 1 time(s)
[12/19 12:28:53][WARNING] activation_count.py: 56: Skipped operation aten::mean 1 time(s)
[12/19 12:28:53][INFO] misc.py: 179: Activations: 136.579072 M
[12/19 12:28:53][INFO] misc.py: 182: nvidia-smi
Sat Dec 19 12:28:53 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.45.01 Driver Version: 418.67 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla T4 Off | 00000000:00:04.0 Off | 0 |
| N/A 35C P0 26W / 70W | 2495MiB / 15079MiB | 39% Default |
| | | ERR! |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
[12/19 12:28:54][INFO] mydata.py: 73: Constructing mydata train...
Traceback (most recent call last):
File "tools/run_net.py", line 42, in
I ran my code in Colab with a GPU and it gives the same error. Can anyone help me?
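One possible reading of the earlier traceback, offered as a guess rather than a confirmed diagnosis: the failing line in mydata.py reads `self.cfg.DATA.PATH_LABEL_SEPAR-ATOR`. Python parses the hyphen as a subtraction, so it first looks up an attribute called `PATH_LABEL_SEPAR`, which does not exist. That would match the message `AttributeError: PATH_LABEL_SEPAR` and would explain why the error is the same with and without a GPU. In the sketch below, a `SimpleNamespace` stands in for the real config object and the separator value is assumed.

```python
import ast
from types import SimpleNamespace

# Stand-in for cfg.DATA; the real object is a yacs config node.
data_cfg = SimpleNamespace(PATH_LABEL_SEPARATOR=" ")

expression = "data_cfg.PATH_LABEL_SEPAR-ATOR"

# The hyphen turns the expression into "attribute minus name".
tree = ast.parse(expression, mode="eval")
is_subtraction = isinstance(tree.body, ast.BinOp) and isinstance(tree.body.op, ast.Sub)

# The left operand is evaluated first, so the attribute lookup fails
# before the undefined name ATOR is ever reached.
try:
    eval(expression, {"data_cfg": data_cfg})
except AttributeError as exc:
    error = str(exc)

print(is_subtraction, error)
```

If this is indeed the cause, writing the attribute as one word, `PATH_LABEL_SEPARATOR`, in mydata.py should make this particular error go away.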
Hey guys, can I ask a question here? What is the purpose of the "ava_train_predicted_boxes.csv" file?
Hello guys, I am trying to train the SlowFast model on my custom dataset. I converted my whole dataset to the AVA format. I have a question regarding the detection architecture: I don't have predicted box scores. Do I need to train the person detector separately and then train the SlowFast model? If anyone knows, please help me with this.
Question: If instead of .mp4 files I have folders of .jpg files (the extracted frames of the .mp4), what needs to be adjusted?
Simplest solution, if suitable for your amount of data: a short OpenCV script or ffmpeg command that reads each folder's .jpg files and converts them into a video.
That's a fair suggestion. I might try it, but I'm not sure it'll work, as I'm dealing with 10k+ clips.
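For many clips, the ffmpeg route suggested above can be driven from a short script. This is a sketch under assumptions: frames are named with zero-padded numbers such as 00001.jpg (adjust `pattern` otherwise), each clip has its own sub-folder, and ffmpeg with libx264 is installed. The helper names are hypothetical.

```python
import subprocess
from pathlib import Path


def frames_to_video_cmd(frames_dir, output, fps=30, pattern="%05d.jpg"):
    """Build an ffmpeg command that encodes numbered JPEG frames into an H.264 MP4."""
    return [
        "ffmpeg", "-y",                      # overwrite existing output files
        "-framerate", str(fps),              # frame rate of the image sequence
        "-i", str(Path(frames_dir) / pattern),
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",               # widely supported pixel format
        str(output),
    ]


def convert_all(root, out_dir, **kwargs):
    """Run one ffmpeg call per sub-folder of frames found under root."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for clip_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        cmd = frames_to_video_cmd(clip_dir, out_dir / f"{clip_dir.name}.mp4", **kwargs)
        subprocess.run(cmd, check=True)
```

`frames_to_video_cmd` only builds the argument list, so the command can be inspected or printed before `convert_all` actually runs it on all clips.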
Hey, did you succeed in training with bounding boxes? If yes, what is the required bounding box format?
edit of my previous comment
Hello, I followed @AlexanderMelde 's instructions for using my own dataset. Thank you very much for the explanation!
I'm dealing with a bug when trying to run: I get num_samples = 0, so I get this error:
I guess my input videos don't open correctly, so I have neither a batch size nor num_samples. It seems that the parsing part works properly but the data loading doesn't, and as a result the program crashes when trying to train the model with the data.
My JSON file contains two classes, 0 and 1. The videos are in mp4 format, but I haven't done any preprocessing yet. I currently have just two videos in every csv I created, and I will add more in the future. I currently have no GPU (I set NUM_GPUS = 0).
Can you please help me? Thanks in advance
I think it might be best to create a separate issue and edit your post here to keep it readable.
I edited it just now. Thank you! @AlexanderMelde
@AlexanderMelde I wonder what exact preprocessing I should use on my own dataset. Did you do any preprocessing on your dataset when you trained successfully? It would be very helpful. Thanks again!
I have done no particular preprocessing, but I put my videos in a certain folder structure (all videos for one class in one folder each, see post above) and converted some of the older video file formats to modern H.264 mp4 files to ensure faster processing times and lower file storage requirements.
@AlexanderMelde Thank you so much for answering.
I have a project in which I want to use a neural network for video classification (with two classes only). My dataset consists of 300 WMV videos of approximately 30 seconds each, with a resolution of 640x480 (width x height).
Right now, I don't know how to split my videos into frames, or whether I should use the preprocessing guidance given in the repository's instructions for Kinetics, for AVA, or something else. Thanks again.
If you use the AVA format discussed in this thread, there is no need to manually split the WMV files into single frames. I would convert them to mp4 though for convenience; you could use ffmpeg for that.
@AlexanderMelde It was very helpful, Thank you!!
@AlexanderMelde Do you think this AVA format is the best for classification (not detection) with my own data?
I can't say that without a closer look and further inspection, but I think you are good to go with the AVA format. I also used it for action classification.
Thank you very much!
@AlexanderMelde First off, thanks for all your support in this repository. I managed to train SlowFast on my own dataset with the help of what you provided above. However, when I created my config according to the Kinetics dataset (since I want to do action recognition, and don't care about the AVA format) in the SlowFast output log I still get:
[03/01 17:04:58][INFO] train_net.py: 377: Train with config: [03/01 17:04:58][INFO] train_net.py: 378: {'AVA': {'ANNOTATION_DIR': '/mnt/vol/gfsai-flash3-east/ai-group/users/haoqifan/ava/frame_list/', 'BGR': False, 'DETECTION_SCORE_THRESH': 0.9, 'EXCLUSION_FILE': 'ava_val_excluded_timestamps_v2.2.csv', 'FRAME_DIR': '/mnt/fair-flash3-east/ava_trainval_frames.img/', 'FRAME_LIST_DIR': '/mnt/vol/gfsai-flash3-east/ai-group/users/haoqifan/ava/frame_list/', 'FULL_TEST_ON_VAL': False, 'GROUNDTRUTH_FILE': 'ava_val_v2.2.csv', 'IMG_PROC_BACKEND': 'cv2', 'LABEL_MAP_FILE': 'ava_action_list_v2.2_for_activitynet_2019.pbtxt', 'TEST_FORCE_FLIP': False, 'TEST_LISTS': ['val.csv'], 'TEST_PREDICT_BOX_LISTS': ['ava_val_predicted_boxes.csv'], 'TRAIN_GT_BOX_LISTS': ['ava_train_v2.2.csv'], 'TRAIN_LISTS': ['train.csv'], 'TRAIN_PCA_EIGVAL': [0.225, 0.224, 0.229], 'TRAIN_PCA_EIGVEC': [[-0.5675, 0.7192, 0.4009], [-0.5808, -0.0045, -0.814], [-0.5836, -0.6948, 0.4203]], 'TRAIN_PCA_JITTER_ONLY': True, 'TRAIN_PREDICT_BOX_LISTS': [], 'TRAIN_USE_COLOR_AUGMENTATION': False},
and Similarly I get
'DEMO': {'BUFFER_SIZE': 0, 'CLIP_VIS_SIZE': 10, 'COMMON_CLASS_NAMES': ['watch (a person)', 'talk to (e.g., self, a person, a group)', 'listen to (a person)', 'touch (an object)', 'carry/hold (an object)', 'walk', 'sit', 'lie/sleep', 'bend/bow (at the waist)'], 'COMMON_CLASS_THRES': 0.7, 'DETECTRON2_CFG': 'COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml', 'DETECTRON2_THRESH': 0.9, 'DETECTRON2_WEIGHTS': 'detectron2://COCO-Detection/faster_rcnn_R_50_FPN_3x/137849458/model_final_280758.pkl',
In the first case, I understand that the "AVA" paths come from what is present in slowfast/config/defaults.py. Since I do not use anything related to "AVA", I'm confused as to why this appears in the log.
For the second case, I see that Detectron2 weights are downloaded, but I don't need any form of segmentation. Additionally, the COMMON_CLASS_NAMES are not the ones that I have in data/MyData/classes.txt.
Since the model gets very bad results on my own data (15% top-1, whereas TSM gets 65% top-1 with the same dataset), I wonder whether I am missing some essential details.
If the config that is printed to the console does not represent your own config, something went wrong with defining which config to use. Maybe it is using a default config and not yours. What code do you use to run the scripts, and which command line parameters do you use when calling the scripts?
@AlexanderMelde I took the exact same config as "Kinetics Slowfast 4x16" and changed the kinetics dataset to the name of MyData. In the "DATA" section it refers correctly to the path of my dataset, and all the other settings in my config file are correct. What I am confused about is that, according to stdout.log, it loads the "ANNOTATION_DIR" and "DEMO" values from slowfast/config/defaults.py, because those are never needed anyway. Also, the paths given in "ANNOTATION_DIR" do not exist on my machine and it doesn't throw an error, which makes me think those are not used.
You are right, the values you mentioned are defined in the defaults file. You can try explicitly setting these values to prevent the defaults from being used. Unfortunately I am no longer working with the framework at the moment, so I am not sure if these values are needed or not. It is weird though that it is looking for AVA values when you are trying to do Kinetics. Maybe at some point in the config you have enabled features that only work with AVA. For example, in the current version of train_net.py I see code like this:
if cfg.DETECTION.ENABLE:
    train_meter = AVAMeter(len(train_loader), cfg, mode="train")
    val_meter = AVAMeter(len(val_loader), cfg, mode="val")
else:
    train_meter = TrainMeter(len(train_loader), cfg)
    val_meter = ValMeter(len(val_loader), cfg)
This means if DETECTION is enabled in the config, an AVA component is used. Without looking further into it right now, it might be something like this that makes the framework look for AVA specific config values.
You can try removing additional config changes step by step to see if a minimal example will work.
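A likely explanation for the AVA and DEMO entries in the log is that the printed config is the full set of defaults with the custom YAML merged on top, so sections the YAML never mentions are still printed with their default values, whether or not they are used. The sketch below imitates that behaviour with plain dictionaries. It is not the yacs implementation, and the keys and values are made up for illustration.

```python
import copy

# Hypothetical, heavily trimmed stand-in for the defaults file.
DEFAULTS = {
    "TRAIN": {"DATASET": "kinetics", "BATCH_SIZE": 64},
    "AVA": {"ANNOTATION_DIR": "/path/from/defaults/"},
    "DETECTION": {"ENABLE": False},
}


def merge(defaults, overrides):
    """Return defaults with overrides applied recursively; untouched keys keep their defaults."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


user_yaml = {"TRAIN": {"DATASET": "mydata"}}  # what a custom config might set
cfg = merge(DEFAULTS, user_yaml)
print(cfg)
```

In this toy example the AVA section shows up in the merged result even though DETECTION.ENABLE is False and nothing reads it, which is consistent with non-existent default paths not raising an error.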
@AlexanderMelde Thanks a lot for the assistance! Will look into my config again
Hi @AlexanderMelde, thank you for your contribution. I am following your instructions, but I am running into a file path error that I can't get past.
[03/08 13:48:48][INFO] mydata.py: 73: Constructing Mydata train...
Traceback (most recent call last):
  File "tools/run_net.py", line 44, in <module>
    main()
  File "tools/run_net.py", line 25, in main
    launch_job(cfg=cfg, init_method=args.init_method, func=train)
  File "/content/SlowFast/slowfast/utils/misc.py", line 297, in launch_job
    func(cfg=cfg)
  File "/content/SlowFast/tools/train_net.py", line 392, in train
    train_loader = loader.construct_loader(cfg, "train")
  File "/content/SlowFast/slowfast/datasets/loader.py", line 83, in construct_loader
    dataset = build_dataset(dataset_name, cfg, split)
  File "/content/SlowFast/slowfast/datasets/build.py", line 31, in build_dataset
    return DATASET_REGISTRY.get(name)(cfg, split)
  File "/content/SlowFast/slowfast/datasets/mydata.py", line 74, in __init__
    self._construct_loader()
  File "/content/SlowFast/slowfast/datasets/mydata.py", line 84, in _construct_loader
    path_to_file
AssertionError: train.csv dir not found
Not sure why. My files are in the exact same structure as yours, yet my csv files are not being read. Some help would be appreciated.
Can you print the file path it tries to open via a print statement or the debugger please?
@AlexanderMelde I don't quite get how to get the debugger report. Sorry I am new to all this.
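No debugger is needed for this check; a plain print is enough. The sketch below assumes that mydata.py, being a copy of kinetics.py, builds the csv path by joining `DATA.PATH_TO_DATA_DIR` with the split name. The helper name and the second example directory are assumptions based on the traceback and the folder layout from the guide.

```python
import os


def csv_path(path_to_data_dir, mode):
    """Join the data dir and split name the way the error message suggests the loader does."""
    return os.path.join(path_to_data_dir, "{}.csv".format(mode))


# Compare an empty data dir with an explicitly set one.
for data_dir in ("", "/content/SlowFast/data/MyData"):
    path = csv_path(data_dir, "train")
    print(repr(data_dir), "->", path, "| exists:", os.path.exists(path))
```

The error message names a bare "train.csv", which is what the join produces when the data dir is an empty string. That would suggest checking whether DATA.PATH_TO_DATA_DIR is actually set in the config that gets loaded.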
@AlexanderMelde Hi, thanks to your help, my code now runs completely with my own dataset. I have encountered another problem: as a first step to see if the program works, I ran it with the same videos in the train, val and test sets (in order to overfit). Each set contains 4 videos (2 of each class, I have 2 classes in total). I used this configuration:
--cfg /cs/ep/116/final_project_SlowFast/SlowFast/configs/MyData/I3D_8x8_R50.yaml NUM_GPUS 1 TRAIN.BATCH_SIZE 2 TEST.BATCH_SIZE 2 MODEL.NUM_CLASSES 2 SOLVER.MAX_EPOCH 3 DATA.PATH_TO_DATA_DIR /cs/ep/116/final_project_SlowFast/SlowFast/data/MyData
and I get as a result (after training and testing)
json_stats: {"split": "test_final", "top1_acc": "50.00", "top2_acc": "100.00"}
Obviously this is not correct, because if the model overfits I should get a "top1_acc" of 100.00.
I didn't change anything in the code itself except the value of k from 5 to 2 here:
I will appreciate your help, Thank you very much.
@AlexanderMelde
i have done no particular preprocessing, but i put my videos in a certain folder structure (all videos for one class in one folder each, see post above) and converted some of the older video file formats to modern h264 mp4 files to ensure faster processing times and less file storage requirements.
Regarding this, I have a new set of videos that I converted from frames to mp4 with the H.264 codec. When I try to use these with I3D, I get an AssertionError, whereas with videos that use the MPEG-4 codec there is no issue. Did you have to change anything in the code for PyAV to accept H.264?
"I encountered another problem now - after running the program with the same videos in the train, val and test sets (in order to get overfitting) to see if the program works well for a first step."
Assuming that "Each set contains 4 videos (2 of each class, I have 2 classes in total)." means you have 4*3 = 12 different videos in total, there should be no overfitting.
top_k means the correct result is among the k most probable predictions. If you test 4 videos and top_1 = 0.5, it means that 2 videos are correctly classified, and top_2 = 1 means that for all 4 videos the actual class is in the top-2 predictions. If you only have two classes and 4 videos, this means 2 videos are correctly predicted and 2 are falsely predicted / only in second place. I don't see an error there. Keep in mind the algorithm is designed for a much greater number of videos or classes.
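The top-k reasoning above can be reproduced with a few lines. The scores below are made up to mirror the reported result (4 test videos, 2 classes), and `topk_accuracy` is an illustrative helper, not the function SlowFast uses.

```python
def topk_accuracy(scores, labels, k):
    """Percentage of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += label in ranked[:k]
    return 100.0 * hits / len(labels)


# Made-up scores: the model prefers class 0 for every video.
scores = [[0.9, 0.1], [0.6, 0.4], [0.7, 0.3], [0.8, 0.2]]
labels = [0, 0, 1, 1]

top1 = topk_accuracy(scores, labels, 1)
top2 = topk_accuracy(scores, labels, 2)
print(top1, top2)  # 50.0 100.0
```

With only two classes the top-2 value is always 100, since the true class is necessarily one of the two candidates, so only the top-1 value carries information here.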
Regarding this, I have a new set of videos that I converted from frames to mp4 with H.264 codec. When I try to use these with I3D, I get an Assertion error, however when I use videos with mpeg 4 codec, then there is no issue. Did you have to change anything in the code for PyAV to accept H.264?
I don't remember if I changed something there, and unfortunately I can't see the code any more. Maybe you need to update your ffmpeg version. Can your regular ffmpeg handle H.264 files?
@AlexanderMelde
"means you have 4*3 = 12 different videos in total, there should be no overfitting."
Even if I have the exact same videos in my test set and in my train set?
Thank you.
Thanks for the nice work. I have seen "demo.gif", which is the output of the model trained on the AVA dataset. Now I want to convert my custom dataset into the AVA dataset format and train a model using your code. Can you guide me on the preprocessing steps I need to convert my own custom dataset into the AVA dataset format? Can you give me a brief idea, or point me to any tools that would help me achieve the AVA dataset format? Thanks
I have data set now as follow