WENGSYX / VPTSL

Apache License 2.0

How can I get feature files (.npy) for my own dataset? #2

Open nkldy22 opened 1 month ago

nkldy22 commented 1 month ago

Hi! I want to train this model on my own dataset, which has the same format as MedVidQA; the only difference between them is the video IDs. How can I get the feature files for my dataset? (The text files have already been prepared.)

WENGSYX commented 1 month ago

Hi!

You can find the preparation code in: https://github.com/WENGSYX/VPTSL/tree/main/video_prepare

If you only have the video ID (without the mp4 file), you need to download it with: https://github.com/WENGSYX/VPTSL/blob/main/video_prepare/extract_medvidqa.py#L157
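Before downloading, it can help to see which video IDs still lack a local mp4. The following is only a minimal sketch, not code from the repository: the helper name `missing_videos` is hypothetical, and it assumes each video is stored as `<video_id>.mp4` in a single directory, which may not match your layout.

```python
from pathlib import Path
from typing import Iterable, List


def missing_videos(video_ids: Iterable[str], video_dir: str) -> List[str]:
    """Return the IDs that have no '<video_id>.mp4' file in video_dir.

    Hypothetical helper: the '<video_id>.mp4' naming is an assumption,
    not something taken from the VPTSL repository.
    """
    root = Path(video_dir)
    return [vid for vid in video_ids if not (root / f"{vid}.mp4").is_file()]


if __name__ == "__main__":
    # Placeholder IDs and directory; replace with your own dataset's values.
    ids = ["video_id_1", "video_id_2"]
    print(missing_videos(ids, "videos"))
```

Only the IDs returned by this check would then need to go through the download step linked above.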

nkldy22 commented 1 month ago

Why did I run out of GPU memory when training on my own dataset? GPU memory consumption suddenly increased when I switched to a different dataset. I only changed the features and text to my own data without altering anything else.

WENGSYX commented 1 month ago

Check your subtitle file; it might be too long. There are two options. You can lower the maxlen argument (set via argparse), which automatically discards data samples that exceed the limit and thereby reduces GPU memory usage. Alternatively, you can use the microsoft/deberta-v3-base model instead of microsoft/deberta-v3-large.
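To see which samples are likely responsible, you can measure subtitle lengths before training. This is a rough sketch under stated assumptions, not the repository's own filtering code: the function name `find_long_samples`, the sample IDs, and the `maxlen=1800` value are made up for illustration, and the default whitespace split is only a stand-in for the real tokenizer, which usually produces more tokens.

```python
from typing import Callable, Dict, List


def find_long_samples(
    subtitles: Dict[str, str],
    maxlen: int,
    tokenize: Callable[[str], List[str]] = str.split,
) -> Dict[str, int]:
    """Return {sample_id: token_count} for subtitles longer than maxlen.

    The default tokenizer is a plain whitespace split (an approximation).
    For realistic counts, one option is to pass the model tokenizer, e.g.
    transformers.AutoTokenizer.from_pretrained(
        "microsoft/deberta-v3-base").tokenize
    """
    lengths = {sid: len(tokenize(text)) for sid, text in subtitles.items()}
    return {sid: n for sid, n in lengths.items() if n > maxlen}


# Hypothetical data: one short subtitle and one very long one.
subtitles = {
    "vid_a": "wash your hands",
    "vid_b": "word " * 2000,
}
too_long = find_long_samples(subtitles, maxlen=1800)
print(too_long)
```

If many samples are reported, lowering maxlen will drop a large share of your data, so switching to the smaller model may be the better trade-off.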