AlchemyEmperor / SIAVC

Semi-Supervised Framework for Industrial Accident Video Classification

How to use your own dataset #1

Open srssrs0618 opened 1 month ago

srssrs0618 commented 1 month ago

I would like to use my own dataset from a different domain. What adjustments do I need to make?

AlchemyEmperor commented 1 month ago

To use your own dataset, organize your data according to the structure shown in the GitHub repository, splitting the videos into training and testing sets in .mp4 or .pkl format. Then set the classification categories on line 67 of SIAVC.py and the model's input resolution on line 140. To start training, replace the dataset name on lines 233 to 236 with the name of your own dataset.

srssrs0618 commented 1 month ago

What graphics card did you use for training, and how long did the training take?

AlchemyEmperor commented 1 month ago

I used 7 NVIDIA 3090 GPUs for distributed training, and the current code uses the ViT-B/16 backbone. Training typically takes a few days. If GPU memory is constrained, ResNet3D can be used as the backbone for faster training, although performance may be lower.

srssrs0618 commented 1 month ago

  1. How do I change the backbone?

  2. Do I need to modify the name on line 233 every time I train on a new category?

  3. I don't understand what setting the input resolution on line 140 means. Could you explain?

  4. Apart from Python 3.9, are there other environment dependencies I need to install when using PyTorch 2.0.1?

Thank you for your patient answers.

AlchemyEmperor commented 1 month ago

Answer 1: Some commonly used 3D models are in ./all_model/models, and you can build them with the generate_model function in each model's .py file. Follow the pattern of the model definition at line 167 of SIAVC.py.

Answer 2: When changing the dataset, you only need to modify the dataset names on lines 233 to 236; the code reads all categories from that path (the samples of each category are stored in their own folder, so five categories require five folders).
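The one-folder-per-category layout can be checked before training with a short script. This is only a sketch, not code from the repository: the helper name summarize_split is made up for illustration, and it assumes a split folder whose direct subfolders are the categories, each holding .mp4 or .pkl files.

```python
from pathlib import Path

# File formats mentioned in this thread.
VIDEO_SUFFIXES = {".mp4", ".pkl"}


def summarize_split(split_dir):
    """Return {category_name: number_of_video_files} for one split folder.

    Hypothetical helper: assumes each direct subfolder of split_dir is one category.
    """
    summary = {}
    for class_dir in sorted(p for p in Path(split_dir).iterdir() if p.is_dir()):
        # Count only files with a recognized video suffix (case-insensitive).
        summary[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in VIDEO_SUFFIXES
        )
    return summary
```

Calling it on the training folder and on the testing folder should give the same category names in both; a category with a count of 0 usually means the files ended up in the wrong place.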

Answer 3: Line 140 of SIAVC.py sets the resolution at which the dataloader loads the videos. Choose a value that fits in your GPU memory: reduce it if memory is insufficient, or increase it if you have memory to spare. Note that lowering the video resolution may have some impact on performance.
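To get a feel for how resolution affects memory, the size of one input batch can be estimated with simple arithmetic. This is a rough sketch with made-up example values (batch size, frame count, and resolutions are assumptions, not the repository's settings), and it counts only the input tensor; model activations and gradients usually take far more memory.

```python
def clip_batch_bytes(batch_size, frames, height, width, channels=3, bytes_per_value=4):
    """Bytes needed for one batch of video clips stored as float32 values."""
    return batch_size * channels * frames * height * width * bytes_per_value


# Hypothetical settings: 4 clips of 16 RGB frames each.
full = clip_batch_bytes(4, 16, 224, 224)
half = clip_batch_bytes(4, 16, 112, 112)
print(f"224x224: {full / 2**20:.2f} MiB")  # 36.75 MiB
print(f"112x112: {half / 2**20:.2f} MiB")  # 9.19 MiB
```

Halving the height and width cuts the input size to a quarter, which is why reducing the resolution is an effective way to fit training into limited GPU memory.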

Answer 4: I don't remember the exact environment, but you can work out the required dependencies by running SIAVC.py and installing whichever package it reports as missing; repeat this a few times until it runs.
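The trial-and-error loop can be shortened by listing the imports up front. The following is a sketch, not part of the repository: the function name missing_top_level_imports is made up, and it only inspects import statements in the source text it is given.

```python
import ast
import importlib.util


def missing_top_level_imports(source):
    """Return top-level module names imported in `source` that cannot be found."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports such as "from . import x".
            names.add(node.module.split(".")[0])
    # find_spec returns None when a top-level module is not installed.
    return sorted(n for n in names if importlib.util.find_spec(n) is None)


# Possible usage, run from the repository folder:
# with open("SIAVC.py") as fh:
#     print(missing_top_level_imports(fh.read()))
```

Two caveats: the import name is not always the name of the package to install (for example, cv2 is provided by opencv-python), and the project's own modules show up as missing if the script is run from outside the repository folder.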

srssrs0618 commented 1 week ago

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/wenzcao/anaconda3/envs/siavc/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/wenzcao/anaconda3/envs/siavc/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/wenzcao/anaconda3/envs/siavc/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/sdb1/srs/SIAVC-main/Load_Videos.py", line 270, in __getitem__
    buffer = self.loadvideo(self.fnames[index])
  File "/sdb1/srs/SIAVC-main/Load_Videos.py", line 284, in loadvideo
    video = pk.load(Video_reader)
_pickle.UnpicklingError: invalid load key, '\x00'.

0%| | 0/40 [00:00<?, ?it/s]

AlchemyEmperor commented 1 week ago

There could be several reasons for this error:

  1. The file specified by self.fnames[index] may not be a valid pickle file. Try manually loading it to confirm its contents.

  2. Ensure that self.fnames[index] points to the correct and existing file path. If the file doesn’t exist or the path is incorrect, it can lead to this error.

  3. If the file wasn't saved with pickle (for example, it's in a different format or is an empty file), pickle.load will fail. Confirm that the file was actually saved with pickle, or use an appropriate method to load it if it’s in another format.
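The manual check suggested above can be run over the whole dataset at once. This is a minimal sketch, assuming the data files use the .pkl extension and come from a trusted source (unpickling can execute arbitrary code); find_unreadable_pickles is a made-up helper name, not part of the repository.

```python
import pickle
from pathlib import Path


def find_unreadable_pickles(root):
    """Try to unpickle every .pkl file under root; return (path, error) pairs."""
    bad = []
    for path in sorted(Path(root).rglob("*.pkl")):
        try:
            with open(path, "rb") as fh:
                pickle.load(fh)
        except Exception as exc:
            # Empty files raise EOFError; non-pickle data raises UnpicklingError.
            bad.append((path, f"{type(exc).__name__}: {exc}"))
    return bad
```

Any file it reports needs to be regenerated with pickle.dump or loaded with the reader for its real format. One possibility worth checking is an .mp4 file that was renamed to .pkl rather than converted: MP4 files typically begin with zero bytes, which would produce exactly the "invalid load key, '\x00'" message.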

srssrs0618 commented 2 days ago

Traceback (most recent call last):
  File "/sdb1/srs/SIAVC-main/SRS.py", line 819, in <module>
    main()
  File "/sdb1/srs/SIAVC-main/SRS.py", line 291, in main
    train(args, labeled_train_dataloader, unlabeled_train_dataloader, strong_dataloader, weak_dataloader, test_dataloader,
  File "/sdb1/srs/SIAVC-main/SRS.py", line 571, in train
    history_dict[index].append(update_dict[flag])
IndexError: list index out of range