Closed BugMaker2002 closed 5 months ago
The problem is likely in either the inputs or the post-processing. Since you use the same weights, the model should be identical in your two tests. Please check whether the input frames are the same in both tests, and whether your post-processing (for example, filtering and heart rate calculation) is also the same.
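One way to check whether the input frames really are the same in both tests is to fingerprint each clip before it goes into the model and compare the digests between runs. This is only a minimal sketch: `array_fingerprint` is a hypothetical helper, not part of the repository, and it assumes the frames are available as NumPy arrays.

```python
import hashlib

import numpy as np


def array_fingerprint(arr):
    """Return a SHA-256 digest covering dtype, shape, and raw bytes of an array.

    Hypothetical helper: two arrays get the same digest only if they have
    the same dtype, the same shape, and identical contents.
    """
    arr = np.ascontiguousarray(arr)
    digest = hashlib.sha256()
    digest.update(str(arr.dtype).encode("utf-8"))
    digest.update(str(arr.shape).encode("utf-8"))
    digest.update(arr.tobytes())
    return digest.hexdigest()


# Synthetic stand-in for one video clip of shape (T, H, W, C).
frames_run_a = np.zeros((4, 8, 8, 3), dtype=np.uint8)
frames_run_b = frames_run_a.copy()
frames_run_b[0, 0, 0, 0] = 1  # a single changed pixel

print(array_fingerprint(frames_run_a) == array_fingerprint(frames_run_a.copy()))
print(array_fingerprint(frames_run_a) == array_fingerprint(frames_run_b))
```

Logging one digest per clip in each run makes it easy to see at which video, if any, the inputs start to differ.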
I set train_exp_num=2 here because train_exp_num=2 is the training run from a week ago that gave normal results, so the test set I used is also the split made for that run. The model weights I used are from the 29th epoch of that training run, which is also the epoch used in that earlier test. With the following code, the results are terrible and make no sense, as shown in the following figure (I can guarantee that my code for r, MAE, and RMSE has not changed):
ex = Experiment('model_pred', save_git_info=False)

@ex.config
def my_config():
    e = 29 # the model checkpoint at epoch e
    train_exp_num = 2 # the training experiment number
    train_exp_dir = './results/%d'%train_exp_num # training experiment directory
    # To accommodate the transformer architecture, I changed the time from 30s to 10s
    time_interval = 30 # get rppg for 30s video clips, too long clips might cause out of memory

    ex.observers.append(FileStorageObserver(train_exp_dir))

    if torch.cuda.is_available():
        device = torch.device('cuda')
        torch.backends.cudnn.enabled = True
        torch.backends.cudnn.benchmark = True
    else:
        device = torch.device('cpu')

@ex.automain
def my_main(_run, e, train_exp_dir, device, time_interval):
    mae_loss_func = nn.L1Loss().to(device)
    mse_loss_func = nn.MSELoss().to(device)

    # load test file paths
    test_list = list(np.load(train_exp_dir + '/test_list.npy'))
    pred_exp_dir = train_exp_dir + '/%d'%(int(_run._id)) # prediction experiment directory

    with open(train_exp_dir+'/config.json') as f:
        config_train = json.load(f)

    model = PhysNet(config_train['S'], config_train['in_ch']).to(device).eval()
    model.load_state_dict(torch.load(train_exp_dir+'/epoch%d.pt'%(e), map_location=device)) # load weights to the model

    @torch.no_grad()
    def dl_model(imgs_clip):
        # model inference
        img_batch = imgs_clip
        img_batch = img_batch.transpose((3,0,1,2))
        # add a batch dimension in front of img_batch (batch size 1)
        img_batch = img_batch[np.newaxis].astype('float32')
        img_batch = torch.tensor(img_batch).to(device)

        rppg = model(img_batch)[:,-1, :] # (1, 5, T) -> (1, T)
        rppg = rppg[0].detach().cpu().numpy()
        return rppg

    for h5_path in test_list:
        h5_path = str(h5_path)

        with h5py.File(h5_path, 'r') as f:
            imgs = f['imgs']
            subject_name = os.path.basename(h5_path)[:-3]
            bvp_path = f"/share2/data/zhouwenqing/UBFC_rPPG/dataset2/{subject_name}/ground_truth.txt"
            bvp = np.loadtxt(bvp_path).reshape((-1, 1))
            # bvppeak = f['bvp_peak']
            fs = config_train['fs']

            # duration is in seconds; fs is frames per second
            duration = np.min([imgs.shape[0], bvp.shape[0]]) / fs
            num_blocks = int(duration // time_interval)

            # cut num_blocks clips out of the whole video; the clips are contiguous
            # (in terms of how they are taken from the original video)
            rppg_list = []
            bvp_list = []
            # bvppeak_list = []

            for b in range(num_blocks):
                rppg_clip = dl_model(imgs[b*time_interval*fs:(b+1)*time_interval*fs])
                rppg_list.append(rppg_clip)
                bvp_list.append(bvp[b*time_interval*fs:(b+1)*time_interval*fs])
                # bvppeak_list.append(bvppeak[b*time_interval*fs:(b+1)*time_interval*fs])

            rppg_list = np.array(rppg_list)
            bvp_list = np.array(bvp_list)
            # bvppeak_list = np.array(bvppeak_list)
            # results = {'rppg_list': rppg_list, 'bvp_list': bvp_list, 'bvppeak_list':bvppeak_list}
            results = {'rppg_list': rppg_list, 'bvp_list': bvp_list}
            np.save(pred_exp_dir+'/'+h5_path.split('/')[-1][:-3], results)

            bvp_list = bvp_list.reshape(num_blocks, -1)
            hr_pred = torch.tensor(rppg_list)
            hr_gt = torch.tensor(bvp_list)
            mae_all = mae_loss_func(hr_pred, hr_gt)
            mse_all = mse_loss_func(hr_pred, hr_gt)
            rmse = np.sqrt(mse_all)
            correlation_coefficients = np.corrcoef(rppg_list, bvp_list)[0, 3]
            print("Evaluation Result\n MAE: {:.4f}; RMSE: {:.4f}; R: {:.4f};".format(
                mae_all, rmse, correlation_coefficients))
In addition, I tested using the subject.npy files under the directory results/2/5, which I saved a week ago when I ran test.py and which hold the predicted and true values for each subject. I evaluated r, MAE, RMSE, etc. for each subject.npy and found that the results are normal, so what is the problem?
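Since the script above saves a dict with the keys 'rppg_list' and 'bvp_list' for every subject, one quick diagnostic is to compare a file saved a week ago with the file produced by the new run. The sketch below assumes both files were written by `np.save` on such a dict; `load_results` and `compare_results` are hypothetical helper names, and the tolerance is an arbitrary choice.

```python
import numpy as np


def load_results(path):
    """Load a results dict that was written with np.save(path, dict)."""
    # np.save wraps a dict in a 0-d object array, so unpickling is required.
    return np.load(path, allow_pickle=True).item()


def compare_results(old, new, keys=("rppg_list", "bvp_list"), atol=1e-5):
    """Report, per key, whether two result dicts hold matching arrays."""
    report = {}
    for key in keys:
        a = np.asarray(old[key], dtype=np.float64)
        b = np.asarray(new[key], dtype=np.float64)
        if a.shape != b.shape:
            report[key] = ("shape mismatch", a.shape, b.shape)
            continue
        max_diff = float(np.max(np.abs(a - b))) if a.size else 0.0
        report[key] = ("ok" if max_diff <= atol else "differs", max_diff)
    return report


# Synthetic stand-ins for an old and a new results dict.
old = {"rppg_list": np.zeros((2, 4)), "bvp_list": np.ones((2, 4))}
new = {"rppg_list": np.zeros((2, 4)) + 0.5, "bvp_list": np.ones((2, 4))}
print(compare_results(old, new))
```

If 'bvp_list' differs between the two files, the ground truth is being loaded differently; if only 'rppg_list' differs, the input frames or the model are the more likely cause.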
Hi, it seems you compute an RMSE, MAE, and r for each video. This calculation is not correct. You should get gt_hr and pred_hr from all video clips (30 s each), and then calculate RMSE, MAE, and r over them. Also, you calculate the Pearson correlation between the BVP waveform and the rPPG waveform. The correct Pearson correlation is between gt_hr and pred_hr.
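The evaluation described here can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's own post-processing: `estimate_hr_fft` and `hr_metrics` are hypothetical names, the heart rate is taken as the strongest spectral peak between 40 and 250 bpm, and no band-pass filtering is applied. The signals in the demo are synthetic sinusoids.

```python
import numpy as np


def estimate_hr_fft(signal, fs, min_bpm=40.0, max_bpm=250.0):
    """Estimate heart rate (bpm) as the strongest spectral peak in a band."""
    signal = np.asarray(signal, dtype=np.float64).ravel()
    signal = signal - signal.mean()
    windowed = signal * np.hanning(len(signal))
    n_fft = 8 * len(signal)  # zero-padding for a finer frequency grid
    spectrum = np.abs(np.fft.rfft(windowed, n=n_fft))
    freqs_bpm = np.fft.rfftfreq(n_fft, d=1.0 / fs) * 60.0
    band = (freqs_bpm >= min_bpm) & (freqs_bpm <= max_bpm)
    return float(freqs_bpm[band][np.argmax(spectrum[band])])


def hr_metrics(pred_hr, gt_hr):
    """Return (MAE, RMSE, Pearson r) between two lists of heart rates."""
    pred_hr = np.asarray(pred_hr, dtype=np.float64)
    gt_hr = np.asarray(gt_hr, dtype=np.float64)
    err = pred_hr - gt_hr
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    r = float(np.corrcoef(pred_hr, gt_hr)[0, 1])
    return mae, rmse, r


# Demo: one heart rate per 30 s clip, collected over ALL clips of ALL videos,
# and only then one MAE, RMSE, and r for the whole test set.
fs = 30
t = np.arange(30 * fs) / fs
rng = np.random.default_rng(0)
pred_hr_all, gt_hr_all = [], []
for true_bpm in (60.0, 75.0, 90.0, 110.0):
    bvp_clip = np.sin(2 * np.pi * (true_bpm / 60.0) * t)
    rppg_clip = np.sin(2 * np.pi * ((true_bpm + 1.0) / 60.0) * t)
    rppg_clip = rppg_clip + 0.1 * rng.standard_normal(t.shape)
    gt_hr_all.append(estimate_hr_fft(bvp_clip, fs))
    pred_hr_all.append(estimate_hr_fft(rppg_clip, fs))

mae, rmse, r = hr_metrics(pred_hr_all, gt_hr_all)
print("MAE: {:.4f}; RMSE: {:.4f}; R: {:.4f}".format(mae, rmse, r))
```

The key design point is that the metrics compare heart rates rather than waveforms, and that they are computed once over the pooled clips instead of once per video.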
A week ago, I trained a model that worked well on the test set and saved the corresponding weights. Now, when I use the same weights for inference again (the test set has not changed), the results are poor. Why?
In addition, when I ran test.py a week ago, I saved the predicted and true values of each subject in a .npy file. When I now evaluate r, MAE, and RMSE directly on these .npy files, the results are normal (very good). But when I re-predict with the model weights I used at that time and then compute r, MAE, and RMSE, the results become very bad.
Moreover, I re-downloaded the code from the official repository and ran training and testing again from scratch, and the results were still very poor.
What is going on here?