spoonsso / dannce

Bad predict result after finetune #151

Open zeng-zr opened 9 months ago

zeng-zr commented 9 months ago

I’m using 100 labeled frames from my videos (4 cameras, with one view duplicated to match the 5-camera weights) to finetune a 5-camera DANNCE MAX model, following the advice in https://github.com/spoonsso/dannce/issues/62#issuecomment-904950551, but the prediction results are bad.
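To make the view duplication concrete, here is a minimal sketch of the idea, assuming the per-camera data is held in a plain dict keyed by camera name. This is not DANNCE's actual Label3D file layout; the function name, camera names, and frame lists are all hypothetical placeholders.

```python
from typing import Dict, List


def pad_views(views: Dict[str, List[str]], source: str, n_target: int) -> Dict[str, List[str]]:
    """Return a copy of `views` padded to `n_target` entries by repeating `source`.

    Hypothetical helper for illustration only; not part of DANNCE.
    """
    if source not in views:
        raise KeyError(f"unknown source view: {source}")
    if len(views) > n_target:
        raise ValueError("more views than the target count")
    padded = dict(views)
    copy_idx = 1
    while len(padded) < n_target:
        name = f"{source}_copy{copy_idx}"
        padded[name] = list(views[source])  # independent copy of the per-view data
        copy_idx += 1
    return padded


# Four real cameras (hypothetical names, placeholder frame lists).
views = {f"Camera{i}": [f"cam{i}_frame0.png"] for i in range(1, 5)}
padded = pad_views(views, source="Camera1", n_target=5)
print(list(padded))
```

The duplicated view adds no new information; it only makes the input shape match what the 5-camera weights expect.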

new_n_channels_out: 14
batch_size: 4
epochs: 600
net_type: AVG
train_mode: 'finetune'
#dannce_finetune_weights: ./DANNCE/weights/weights.rat.AVG.MONO/ # doesn't work due to layer mismatch? try duplicating views manually
dannce_finetune_weights: ./DANNCE/weights/weights.rat.AVG.MONO.5cams/
# During prediction, will look for the last epoch weights saved to ./DANNCE/train_results/. To load in a different weights file, add the path here
# Note that this must be a FULL MODEL file, not just weights.
dannce_predict_model: './DANNCE/train_results/AVG_5cams/fullmodel_weights/fullmodel_end.hdf5'

predict_mode: torch
exp:
    - label3d_file: '1_0223_5cams_dannce.mat'
      com_file: './COM/predict_results/train_3cams/com3d.mat' #for dannce training
    - label3d_file: '2_0221_5cams_dannce.mat' 
      com_file: './COM/predict_results/train_3cams/com3d.mat' # used a 9000-frame video

com_file: './COM/predict_results/train_3cams/com3d.mat' 
num_validation_per_exp: 0
augment_brightness: True
n_rand_views: None
gpu_id: "2"
n_views: 5
comthresh: 0.2
loss: mask_nan_l1_loss
crop_height: [0, 2048]
crop_width: [0, 2432]
vol_size: 150
nvox: 96
max_num_samples: 1500
mono: True
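For reference, the spatial resolution implied by `vol_size` and `nvox` in the config can be worked out directly. This is a small sketch, assuming `vol_size` is the side length of the cubic volume in mm and `nvox` is the number of voxels along each side; the helper name is hypothetical and not a DANNCE function.

```python
def voxel_size_mm(vol_size: float, nvox: int) -> float:
    """Edge length of one voxel, assuming a cube of side `vol_size` split into `nvox` bins."""
    if nvox <= 0:
        raise ValueError("nvox must be positive")
    return vol_size / nvox


# Values taken from the config above.
resolution = voxel_size_mm(vol_size=150, nvox=96)
print(f"{resolution} mm per voxel")
```

Under that assumption each voxel is about 1.56 mm on a side, and the whole animal must fit inside a 150 mm cube centered on the COM.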

[image]

training.csv

I labeled 1 frame out of every 30, and the picture above shows the prediction for the first frame of the video, which I also labeled. Should I shorten the labeling interval to get better results?
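To illustrate what the labeling interval means for coverage, here is a rough sketch comparing a fixed 30-frame step with the same 100 labels spread evenly over a whole video. It assumes the 9000-frame length mentioned in the config comment applies to the labeled video; both function names are hypothetical. Whether either scheme improves predictions is an open question here.

```python
from typing import List


def fixed_interval_indices(n_labels: int, step: int) -> List[int]:
    """Frame indices when labeling every `step`-th frame, starting at frame 0."""
    return [i * step for i in range(n_labels)]


def evenly_spaced_indices(n_labels: int, n_frames: int) -> List[int]:
    """Frame indices spreading `n_labels` labels across all `n_frames` frames."""
    if n_labels < 2:
        return [0]
    return [i * (n_frames - 1) // (n_labels - 1) for i in range(n_labels)]


fixed = fixed_interval_indices(n_labels=100, step=30)
spread = evenly_spaced_indices(n_labels=100, n_frames=9000)  # 9000 is an assumed video length

print("fixed interval covers frames", fixed[0], "to", fixed[-1])
print("even spread covers frames", spread[0], "to", spread[-1])
```

With a fixed step of 30, the 100 labels reach only frame 2970, so poses that appear later in the video are never seen during finetuning.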