ubicomplab / rPPG-Toolbox

rPPG-Toolbox: Deep Remote PPG Toolbox (NeurIPS 2023)
https://arxiv.org/abs/2210.00716

Issue when trying to get baseline results with pre-trained models tested on the UBFC-rPPG dataset #219

Closed: umutbrkee closed this issue 10 months ago

umutbrkee commented 10 months ago

As part of a research project, I need to get baseline results as a reference point close to the values in the Benchmarks. When I preprocess my dataset, I get this output: [screenshot]. To work around this, I usually clean the CACHED_PATH and preprocess the dataset again. Most of the time the result is the same, but sometimes it turns out like this: [screenshot]. Every time I get this last result after preprocessing, the test Preprocessed Dataset Length is 18.

I don't know if I'm using the toolbox incorrectly or if there are errors in some places that I can't solve.

To be more clear, my dataset is in the correct format shown in the README. My PURE_UBFC-rPPG_TSCAN_BASIC.yaml file is as follows:

```yaml
BASE: ['']
TOOLBOX_MODE: "only_test"  # "train_and_test" or "only_test"
TEST:
  METRICS: ['MAE', 'RMSE', 'MAPE', 'Pearson', 'SNR', 'BA']
  USE_LAST_EPOCH: True
  DATA:
    FS: 30
    DATASET: UBFC-rPPG
    DO_PREPROCESS: True                  # if first time, should be true
    DATA_FORMAT: NDCHW
    DATA_PATH: "/home/umut/datasets/DATASET_2"    # Raw dataset path, need to be updated
    CACHED_PATH: "/home/umut/datasets/uprocessed" # Processed dataset save path, need to be updated
    EXP_DATA_NAME: ""
    BEGIN: 0.0
    END: 1.0
    PREPROCESS:
      DATA_TYPE: ['DiffNormalized', 'Standardized']
      LABEL_TYPE: DiffNormalized
      DO_CHUNK: True
      CHUNK_LENGTH: 180
      CROP_FACE:
        DO_CROP_FACE: True
        USE_LARGE_FACE_BOX: True
        LARGE_BOX_COEF: 1.5
        DETECTION:
          DO_DYNAMIC_DETECTION: False
          DYNAMIC_DETECTION_FREQUENCY: 30
          USE_MEDIAN_FACE_BOX: False     # This should be used ONLY if dynamic detection is used
      RESIZE:
        H: 72
        W: 72
DEVICE: cuda:0
NUM_OF_GPU_TRAIN: 1
LOG:
  PATH: runs/exp
MODEL:
  DROP_RATE: 0.2
  NAME: Tscan
  TSCAN:
    FRAME_DEPTH: 10
INFERENCE:
  BATCH_SIZE: 4
  EVALUATION_METHOD: FFT                 # "FFT" or "peak detection"
  EVALUATION_WINDOW:
    USE_SMALLER_WINDOW: False            # Change this if you'd like an evaluation window smaller than the test video length
    WINDOW_SIZE: 10                      # In seconds
  MODEL_PATH: "./final_model_release/PURE_TSCAN.pth"
```

yahskapar commented 10 months ago

Hi @umutbrkee,

At first glance, my feeling is that this has to do with some kind of instability when using multiple processes in the pre-processing step of the toolbox. What kind of computing environment are you working in, in terms of CPU resources? How many usable CPU cores do you have? Is your machine shared with others who are actively running other workloads?

Here's a simple thing that you can try right off the bat to rule out multiprocessing as the issue: change multi_process_quota in the below code in BaseLoader.py from 8 to 1.

https://github.com/ubicomplab/rPPG-Toolbox/blob/d992de556fbbd6c865e38d475e25ec4eccebb55e/dataset/data_loader/BaseLoader.py#L419

Even though this will still perform pre-processing through the multi-process manager, only a single process will be used. Let me know how this goes, and if it doesn't resolve anything I'll try to reply back and help you resolve this.
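For reference, here is a simplified, self-contained sketch of this kind of quota-limited process management (the names preprocess_one and run_with_quota are illustrative, not the toolbox's actual code). With multi_process_quota set to 1, only one worker process is alive at any moment, so the videos are effectively handled one at a time:

```python
import multiprocessing as mp
import time

def preprocess_one(video_id, results):
    # Stand-in for the per-video I/O and preprocessing work.
    results[video_id] = f"processed {video_id}"

def run_with_quota(video_ids, multi_process_quota=1):
    """Start one worker process per video, but keep at most
    `multi_process_quota` of them alive at any moment."""
    manager = mp.Manager()
    results = manager.dict()
    running = []
    for vid in video_ids:
        # Wait for a free slot before starting another worker.
        while len([p for p in running if p.is_alive()]) >= multi_process_quota:
            time.sleep(0.1)
        p = mp.Process(target=preprocess_one, args=(vid, results))
        p.start()
        running.append(p)
    for p in running:
        p.join()
    return dict(results)

if __name__ == "__main__":
    # With a quota of 1, processing is effectively serialized.
    print(run_with_quota([f"subject{i}" for i in range(4)], multi_process_quota=1))
```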

umutbrkee commented 10 months ago

First of all, thank you very much for your incredibly fast response and for your hard work on this project.

The change you suggested seems to work. Pre-processing is now much slower, but it gives correct results. Now that I can get the correct data, I can begin my research: we will examine how rPPG methods perform in different situations (with various noise levels).

yahskapar commented 10 months ago

Happy to hear that worked. If later on you'd like to dig into why multiprocessing is failing on your machine, feel free to reply back and we can walk through some general troubleshooting steps.

All the best with your research!

EDIT: Also, I should note that you can gradually increase your multi_process_quota count to maybe 2 or 4 and see if pre-processing is still stable. That may speed things up a bit while avoiding the weird behavior you saw before with a setting of 8.

umutbrkee commented 10 months ago

I don't want to waste your time, but I would like to know the reasons for this problem; if you're willing to help, I'd be grateful. Gradually increasing multi_process_quota was also on my mind, thanks for the great tip!

girishvn commented 10 months ago

Hi @umutbrkee,

Thanks for using our toolkit!

Some helpful context:

Python, as a language, does not support true thread-based parallelism (even on a multi-core system) due to the Global Interpreter Lock (GIL), which limits the number of threads executing Python bytecode to one at a time.

As an alternative, we use the multiprocessing package, which spawns multiple processes that each run I/O and data preprocessing operations. The maximum number of processes that can be spawned stably likely depends on your compute device, and it is possible that your device cannot support 8 processes for this task in a stable manner.

Because the stability issue likely depends on the underlying compute device, we suggest experimenting with multi_process_quota, setting it as high as possible while still producing stable behavior.
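To make the GIL point concrete, here is a small self-contained comparison (illustrative only, not toolbox code): the same CPU-bound function run across threads finishes in roughly serial time, while separate processes can genuinely run in parallel, at the cost of extra CPU and memory per process.

```python
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def cpu_bound(n=5_000_000):
    # Pure-Python arithmetic, so the GIL serializes it across threads.
    total = 0
    for i in range(n):
        total += i * i
    return total

def timed(executor_cls, workers=4):
    start = time.time()
    with executor_cls(max_workers=workers) as ex:
        list(ex.map(cpu_bound, [5_000_000] * workers))
    return time.time() - start

if __name__ == "__main__":
    print(f"threads:   {timed(ThreadPoolExecutor):.2f}s")   # roughly serial due to the GIL
    print(f"processes: {timed(ProcessPoolExecutor):.2f}s")  # parallel, but each process costs CPU/RAM
```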

Thanks, Girish

yahskapar commented 10 months ago

In addition to what @girishvn said, what kind of computer do you have? Windows? MacOS? A Linux distribution? If you're on Ubuntu or some other Linux distribution, you can use nproc to check the number of logical CPU cores that your computer has. On MacOS, a similar command is sysctl -n hw.logicalcpu, while with Windows you can discover this information by following the directions here.
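If it's more convenient, the same count can also be checked from Python directly (a quick standalone check, not part of the toolbox):

```python
import os
import multiprocessing

print(multiprocessing.cpu_count())   # logical cores, the same number nproc reports
print(os.cpu_count())                # equivalent, via os
print(len(os.sched_getaffinity(0)))  # Linux/WSL only: cores this process may actually use
```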

Once you figure out the number of logical processors (which should match what you'd see in htop on a Linux distribution, for example), it's worth using top, htop, or a similar utility on MacOS or Windows to monitor your CPU usage. The question is whether CPU usage is so high that the processes spun up by the multi-process manager silently fail, causing the weird behavior you saw before. Also, if nproc returns an especially limited number of logical processors, such as 4, it makes sense that the default of 8 would not work for you (unless the code somehow spun up processes such that each one uses less than 100% of a logical processor, which shouldn't be possible in the context of pre-processing as far as I know).
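If you'd rather log this from a script while pre-processing runs, here's a minimal sketch using psutil (assuming psutil is installed; this is not part of the toolbox):

```python
import psutil

# Print per-core utilization once per second; run this alongside pre-processing
# and stop it with Ctrl+C.
try:
    while True:
        per_core = psutil.cpu_percent(interval=1, percpu=True)
        busy = sum(1 for p in per_core if p > 90)
        print(f"{busy}/{len(per_core)} cores above 90% | "
              + " ".join(f"{p:5.1f}" for p in per_core))
except KeyboardInterrupt:
    pass
```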

yahskapar commented 10 months ago

Also, in the near future, we will try to make it so that the number of logical processors is automatically detected and the default value of 8 is adjusted if 8 doesn't make sense for certain machines right off the bat.
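As a rough sketch of what that could look like (one possible approach, not a committed implementation):

```python
import os

def safe_multi_process_quota(requested=8):
    """Cap the requested worker count at the machine's logical core count,
    leaving one core free for the main process and the OS."""
    logical = os.cpu_count() or 1
    return max(1, min(requested, logical - 1))

print(safe_multi_process_quota(8))  # e.g. returns 3 on a 4-core machine
```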

umutbrkee commented 10 months ago

Thanks again for the guidance. I have 16 logical cores, and I'm using WSL (Ubuntu); I checked with nproc and it also reports 16. I will monitor my CPU cores as you suggested to see if I can observe any problem.