Closed carlosedubarreto closed 1 year ago
From what I searched about this problem, it could be a problem with the learning rate. Where can I find this setting to change it? Thanks
This happens when optimization struggles to converge in general. You can change the learning rate in slahmr/confs/optim.yaml.
We also had a recent update in the pre-processing pipeline, so double check that all the preprocessing inputs (cameras, tracking) are in the right place and are being accessed.
@vye16 thanks a lot for the answer. I think it might be the learning rate, because I've already run it more than 7 times and only had problems with 2 videos.
BTW, I'm showing this result on Twitter and people are loving it. I'm so glad Georgious suggested this repo. It's amazing!!!!
I'll test changing the learning rate and report back on the result.
Oh, sorry to bother you with this, but can you suggest what I should change? I went to the optim file, but there are so many things that I don't know which to change.
Or should I just pick a new value for some option at random? (considering that one value might affect all the others)
Here is the file I was looking at:

```yaml
optim:
  options:
    robust_loss_type: "bisquare"
    robust_tuning_const: 4.6851
    joints2d_sigma: 100.0
    lr: 1.0
    lbfgs_max_iter: 20
    save_every: 20
    vis_every: -1
    max_chunk_steps: 20
    save_meshes: False
  root:
    num_iters: 30
  smpl:
    num_iters: 0
  smooth:
    opt_scale: False
    num_iters: 60
  motion_chunks:
    chunk_size: 10
    init_steps: 20
    chunk_steps: 20
    opt_cams: True
  loss_weights:
    joints2d: [0.001, 0.001, 0.001]
    bg2d: [0.0, 0.000, 0.000]
    cam_R_smooth: [0.0, 0.0, 0.0]
    cam_t_smooth: [0.0, 0.0, 0.0]
    # cam_R_smooth: [0.0, 1000.0, 1000.0]
    # cam_t_smooth: [0.0, 1000.0, 1000.0]
    joints3d: [0.0, 0.0, 0.0]
    joints3d_smooth: [1.0, 10.0, 0.0]
    joints3d_rollout: [0.0, 0.0, 0.0]
    verts3d: [0.0, 0.0, 0.0]
    points3d: [0.0, 0.0, 0.0]
    pose_prior: [0.04, 0.04, 0.04]
    shape_prior: [0.05, 0.05, 0.05]
    motion_prior: [0.0, 0.0, 0.075]
    init_motion_prior: [0.0, 0.0, 0.075]
    joint_consistency: [0.0, 0.0, 100.0]
    bone_length: [0.0, 0.0, 2000.0]
    contact_vel: [0.0, 0.0, 100.0]
    contact_height: [0.0, 0.0, 10.0]
    floor_reg: [0.0, 0.0, 0.0]
```
Hi, yes, I'd suggest changing the motion_chunks chunk_size (controls how many frames to successively optimize), init_steps (number of optimization steps to perform on the first chunk), and/or chunk_steps (number of optimization steps to perform per chunk). Reducing the chunk size and/or increasing the number of steps per chunk will make optimization slower, but will guide optimization toward a better part of the state space before adding more frames, so I'd suggest trying that. If it still diverges, could you attach the video you're trying to process?
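For example, the suggested change amounts to editing only the `motion_chunks` section of slahmr/confs/optim.yaml. The specific numbers below are illustrative, not recommended values; they just show the direction of the change (smaller chunks, more steps per chunk):

```yaml
motion_chunks:
  chunk_size: 7     # was 10; fewer frames added per chunk
  init_steps: 30    # was 20; more steps before any frames are added
  chunk_steps: 30   # was 20; more steps to settle each chunk
  opt_cams: True    # unchanged
```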
Wow, great! I knew it would be much simpler than I was thinking :)
Yep, I can show it, here is the video:
https://github.com/vye16/slahmr/assets/4061130/39f89c14-f35a-4823-a94a-04e4fdf65877
The error shows up at iteration 76, I think. I'll make some changes and try again.
@vye16, out of curiosity, is it possible to change the learning rate mid-run?
I was thinking it could check whether a step is about to produce an error, and if so, automatically reduce the learning rate so all the progress isn't lost.
Is that possible in machine learning in general?
I was thinking of trying to implement it, but if it's an absurd idea there's no reason for me to start.
And I have almost no experience with ML (coding it).
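(For context: yes, this general idea exists in ML, often called learning-rate backoff or adaptive LR. The sketch below is a toy illustration of the pattern, not SLAHMR's code; `train_with_lr_backoff` and `step` are hypothetical names, and a real implementation would checkpoint model parameters and optimizer state rather than a single float.)

```python
import math

def train_with_lr_backoff(step_fn, x0, init_lr, max_iters=50,
                          backoff=0.5, min_lr=1e-6):
    """If a step yields a non-finite loss, roll back to the last good
    state and retry with a smaller learning rate instead of crashing."""
    lr, x, checkpoint = init_lr, x0, x0
    for _ in range(max_iters):
        new_x, loss = step_fn(x, lr)
        if not math.isfinite(loss):
            if lr <= min_lr:
                raise RuntimeError("diverged even at the minimum learning rate")
            lr *= backoff        # shrink the step size
            x = checkpoint       # discard the bad update
        else:
            checkpoint = x = new_x
    return x, lr

# Toy objective f(x) = x^2, with large learning rates simulated as divergence.
def step(x, lr):
    if lr >= 1.0:                   # pretend big steps blow up to NaN
        return x, float("nan")
    new_x = x - lr * 2 * x          # one gradient-descent step on x^2
    return new_x, new_x * new_x

print(train_with_lr_backoff(step, 1.0, 8.0))  # lr backs off 8 -> 4 -> 2 -> 1 -> 0.5, then converges to (0.0, 0.5)
```

The key design choice is keeping a known-good checkpoint: simply lowering the LR after a NaN is not enough, because the parameters are already corrupted by the bad step.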
I reduced the chunk_size from 10 to 7 and it worked, thanks a lot!!!!
Just out of curiosity (I don't know if it was a coincidence), this result was one of the worst I've had from SLAHMR (the result from the video I sent).
I've run it several times without problems, but sometimes it gives this sort of error in the middle of processing:
```
ValueError: Expected value argument (Tensor of shape (1, 138)) to be within the support
(IndependentConstraint(Real(), 1)) of the distribution MixtureSameFamily(
    Categorical(probs: torch.Size([12]), logits: torch.Size([12])),
    MultivariateNormal(loc: torch.Size([12, 138]), covariance_matrix: torch.Size([12, 138, 138]))),
but found invalid values: tensor([[nan, nan, nan, ..., nan]], device='cuda:0', grad_fn=)
```
Here is a screenshot: