Open shauloron opened 3 months ago
Can you obtain correct results (metrics and visualizations) by running inference with the released checkpoint?
I can't seem to find any pretrained model, only pretrained resnet50 weights. Can you please point me to the pretrained weights you released? Thanks
I've tried using the checkpoint above and get different results from those in the tutorial notebook visualizations. To start, the anchor map doesn't look the same, and the detection results are only partial.
Any ideas?
Hi @linxuewu, I would really appreciate any ideas you may have on this one.
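One thing that may be worth ruling out when a checkpoint gives unexpected results is a silent key mismatch at load time. The sketch below is a generic check, not code from this repository: `compare_state_dict_keys` is a hypothetical helper, and the key names in the usage example are made up. It only assumes that checkpoint keys may carry the `module.` prefix that PyTorch's `DataParallel` wrappers add.

```python
def compare_state_dict_keys(model_keys, ckpt_keys, strip_prefix="module."):
    """Return (missing, unexpected) parameter names after normalizing prefixes.

    missing:    names the model expects but the checkpoint does not provide
    unexpected: names in the checkpoint that the model has no slot for
    """
    def normalize(key):
        # Drop a wrapper prefix such as "module." so both sides are comparable.
        if strip_prefix and key.startswith(strip_prefix):
            return key[len(strip_prefix):]
        return key

    model = {normalize(k) for k in model_keys}
    ckpt = {normalize(k) for k in ckpt_keys}
    return sorted(model - ckpt), sorted(ckpt - model)


# Hypothetical key names, for illustration only.
missing, unexpected = compare_state_dict_keys(
    ["img_backbone.conv1.weight", "head.cls.bias"],
    ["module.img_backbone.conv1.weight", "head.reg.bias"],
)
print("missing:", missing)
print("unexpected:", unexpected)
```

With a real model, the two inputs would be the keys of the model's `state_dict()` and of the loaded checkpoint's state dict. If either returned list is non-empty, part of the model is probably running with randomly initialized weights.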
Hi, I'm trying to train the model with the provided config (R50 256x704) with code pulled on July 10, 2024. I'm using 4 A100 GPUs with a total batch size of 48.
With the original LR of 6e-4, training diverges and grad_norm becomes NaN after ~20 epochs. When I lower the LR to 4e-4, the loss decreases and grad_norm looks fine, but the final model has AP=0 for all classes after 100 epochs. I have read through all the open and closed issues and checked that the resnet50 pretrained weights load successfully. I'm using the default aug_config:
data_aug_conf = {
    "resize_lim": (0.40, 0.47),
    "final_dim": input_shape[::-1],
    "bot_pct_lim": (0.0, 0.0),
    "rot_lim": (-5.4, 5.4),
    "H": 900,
    "W": 1600,
    "rand_flip": True,
    "rot3d_range": [-0.3925, 0.3925],
}
Any ideas why my training doesn't converge?
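As a quick sanity check on the augmentation config above, the sketch below works out the resized image size and crop offsets for a given resize factor. It assumes the common LSS/BEVDet-style sampling (resize the full image, then crop to `final_dim`), which may not match this repository's transform exactly. It also assumes `final_dim` is (256, 704), inferred from the "R50 256x704" config name. `resize_and_crop_geometry` is a hypothetical helper, not a function from the codebase.

```python
def resize_and_crop_geometry(resize, H=900, W=1600, final_dim=(256, 704), bot_pct=0.0):
    """Geometry of a resize-then-crop augmentation for one resize factor.

    Assumption: LSS/BEVDet-style sampling; the real pipeline may differ.
    """
    final_h, final_w = final_dim
    new_w, new_h = int(W * resize), int(H * resize)
    # Rows removed from the top so the crop keeps the bottom of the image.
    crop_top = int((1 - bot_pct) * new_h) - final_h
    # Largest horizontal offset a random crop could use (0 if the image is too narrow).
    max_crop_left = max(0, new_w - final_w)
    return {
        "resized_hw": (new_h, new_w),
        "crop_top": crop_top,
        "max_crop_left": max_crop_left,
        "covers_width": new_w >= final_w,
    }


# Evaluate both ends of resize_lim from the config.
for factor in (0.40, 0.47):
    print(factor, resize_and_crop_geometry(factor))
```

Under these assumptions, the lower limit of 0.40 gives a resized width of about 640 pixels, narrower than the 704-pixel crop, so some crops would extend past the image edge. How that region is handled (padding, for example) depends on the actual transform.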