mbrossar / ai-imu-dr

AI-IMU Dead-Reckoning
MIT License

unmatched iekfnets.p from dropbox #82

Open enguang2 opened 1 year ago

enguang2 commented 1 year ago

Hello, the iekfnets.p file downloaded with wget "https://www.dropbox.com/s/77kq4s7ziyvsrmi/temp.zip" does not match the MesNet structure. The fourth layer has a different size, and iekfnets.p apparently contains more layers than the 8-layer MesNet defined in utils_torch_filter.py.

ccsmm78 commented 1 year ago

I've run into the same problem, as follows:

RuntimeError: Error(s) in loading state_dict for TORCHIEKF:
    Unexpected key(s) in state_dict: "mes_net.cov_net.8.weight", "mes_net.cov_net.8.bias", "mes_net.cov_net.12.weight", "mes_net.cov_net.12.bias", "mes_net.cov_net.16.weight", "mes_net.cov_net.16.bias".
    size mismatch for mes_net.cov_net.4.weight: copying a param with shape torch.Size([64, 32, 5]) from checkpoint, the shape in current model is torch.Size([32, 32, 5]).
    size mismatch for mes_net.cov_net.4.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).

/home/ccsmm/git_repository/ai-imu-dr/src/utils_torch_filter.py(466)load()
    465         mondict = torch.load(path_iekf)
--> 466         self.load_state_dict(mondict)
    467         cprint("IEKF nets loaded", 'green')

In the debugger, the checkpoint contains these keys:

ipdb> mondict.keys()
odict_keys(['initprocesscov_net.factor_initial_covariance.weight',
            'initprocesscov_net.factor_process_covariance.weight',
            'mes_net.cov_net.0.weight', 'mes_net.cov_net.0.bias',
            'mes_net.cov_net.4.weight', 'mes_net.cov_net.4.bias',
            'mes_net.cov_net.8.weight', 'mes_net.cov_net.8.bias',
            'mes_net.cov_net.12.weight', 'mes_net.cov_net.12.bias',
            'mes_net.cov_net.16.weight', 'mes_net.cov_net.16.bias',
            'mes_net.cov_lin.0.weight', 'mes_net.cov_lin.0.bias'])

But self, the model the checkpoint is being loaded into, does not have the same structure as the network stored in the checkpoint:

ipdb> self
TORCHIEKF(
  (initprocesscov_net): InitProcessCovNet(
    (factor_initial_covariance): Linear(in_features=1, out_features=6, bias=False)
    (factor_process_covariance): Linear(in_features=1, out_features=6, bias=False)
    (tanh): Tanh()
  )
  (mes_net): MesNet(
    (tanh): Tanh()
    (cov_net): Sequential(
      (0): Conv1d(6, 32, kernel_size=(5,), stride=(1,))
      (1): ReplicationPad1d((4, 4))
      (2): ReLU()
      (3): Dropout(p=0.5)
      (4): Conv1d(32, 32, kernel_size=(5,), stride=(1,), dilation=(3,))
      (5): ReplicationPad1d((4, 4))
      (6): ReLU()
      (7): Dropout(p=0.5)
    )
    (cov_lin): Sequential(
      (0): Linear(in_features=32, out_features=2, bias=True)
      (1): Tanh()
    )
  )
)
ipdb>
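The comparison that load_state_dict reports above can be reproduced by hand, which helps when checking a downloaded parameter file before loading it. The following is only a minimal sketch: `diff_state_dict_shapes` is a hypothetical helper, not part of ai-imu-dr or PyTorch, and it works on plain {name: shape} mappings so it runs without PyTorch. The example data covers only the shapes actually reported in this thread.

```python
def diff_state_dict_shapes(checkpoint_shapes, model_shapes):
    """Compare two {parameter_name: shape} mappings.

    Hypothetical helper that mirrors the three kinds of problems
    load_state_dict reports: unexpected keys, missing keys, size mismatches.
    """
    unexpected = sorted(k for k in checkpoint_shapes if k not in model_shapes)
    missing = sorted(k for k in model_shapes if k not in checkpoint_shapes)
    mismatched = {
        k: (checkpoint_shapes[k], model_shapes[k])
        for k in checkpoint_shapes
        if k in model_shapes and checkpoint_shapes[k] != model_shapes[k]
    }
    return unexpected, missing, mismatched


# Shapes copied from the error message in this thread.
# None marks a shape the thread does not report.
checkpoint_shapes = {
    "mes_net.cov_net.4.weight": (64, 32, 5),
    "mes_net.cov_net.4.bias": (64,),
    "mes_net.cov_net.8.weight": None,
    "mes_net.cov_net.8.bias": None,
}
model_shapes = {
    "mes_net.cov_net.4.weight": (32, 32, 5),
    "mes_net.cov_net.4.bias": (32,),
}

unexpected, missing, mismatched = diff_state_dict_shapes(checkpoint_shapes, model_shapes)
print("unexpected:", unexpected)
print("missing:", missing)
for name, (ckpt_shape, model_shape) in mismatched.items():
    print(f"size mismatch for {name}: checkpoint {ckpt_shape}, model {model_shape}")
```

With PyTorch installed, the two mappings could be built from the real objects with `{k: tuple(v.shape) for k, v in mondict.items()}` and `{k: tuple(v.shape) for k, v in self.state_dict().items()}`. As far as I know, passing `strict=False` to load_state_dict only relaxes the missing and unexpected key checks, so the size mismatch on `mes_net.cov_net.4` would still need the model definition and the checkpoint to agree.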

enguang2 commented 1 year ago

@ccsmm78 I would guess that the URL pointing to the parameter file hasn't been updated, and that the parameter file has actually been altered since it was first published. Nevertheless, I am trying to train the model locally, but training stops at an early epoch (epoch 7). Have you encountered the same thing? :)
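The key names in the checkpoint are enough to see how far its cov_net extends, without loading any weights. The sketch below groups the keys printed by `mondict.keys()` earlier in this thread by their Sequential container and lists the layer indices that carry parameters. `sequential_indices` is a hypothetical helper written for illustration, not something from the repository.

```python
import re
from collections import defaultdict


def sequential_indices(keys):
    """Group keys like 'a.b.4.weight' by container 'a.b' and collect the indices."""
    pattern = re.compile(r"^(?P<container>.+)\.(?P<index>\d+)\.(?:weight|bias)$")
    found = defaultdict(set)
    for key in keys:
        match = pattern.match(key)
        if match:  # keys without a numeric index (e.g. plain Linear layers) are skipped
            found[match.group("container")].add(int(match.group("index")))
    return {name: sorted(indices) for name, indices in found.items()}


# Keys copied from the mondict.keys() output in this thread.
checkpoint_keys = [
    "initprocesscov_net.factor_initial_covariance.weight",
    "initprocesscov_net.factor_process_covariance.weight",
    "mes_net.cov_net.0.weight", "mes_net.cov_net.0.bias",
    "mes_net.cov_net.4.weight", "mes_net.cov_net.4.bias",
    "mes_net.cov_net.8.weight", "mes_net.cov_net.8.bias",
    "mes_net.cov_net.12.weight", "mes_net.cov_net.12.bias",
    "mes_net.cov_net.16.weight", "mes_net.cov_net.16.bias",
    "mes_net.cov_lin.0.weight", "mes_net.cov_lin.0.bias",
]

layers = sequential_indices(checkpoint_keys)
print(layers)
```

The highest index under `mes_net.cov_net` is 16, so the network that produced the checkpoint had at least 17 modules in that Sequential, while the cov_net printed in the debugger session above stops at index 7.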

Mieczmik commented 1 year ago

@enguang2 I have the same problem. I will report back on my progress later.

Akudavale commented 8 months ago

Hello everyone,

I am facing the same size-mismatch issue. How can I resolve it?

TiloccaS commented 5 months ago

Hi @Mieczmik @enguang2, I have the same problem as you. Have you solved it somehow, or is there any news?

luopengting commented 2 months ago

I have fixed the problem; please see #88. I hope it helps. :D @Akudavale @TiloccaS