Closed · dinvincible98 closed this issue 1 year ago
I recommend that you try the fp16 setting. Based on your logs, it appears that you are using a single GPU with a batch size of 1. The drop in performance could be caused by two potential reasons:
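To make the fp16 suggestion concrete: half precision halves memory per activation, which is what lets you raise the batch size, but it has a much narrower numeric range than fp32, which is why AMP pairs it with loss scaling. A minimal stdlib-only sketch (round-tripping values through IEEE 754 half precision via `struct`; not DSVT-specific):

```python
import struct

def to_fp16(x: float) -> float:
    """Round-trip a Python float through IEEE 754 half precision ('e' format)."""
    return struct.unpack('e', struct.pack('e', x))[0]

# fp16 stores 2 bytes per value vs 4 for fp32 -- half the activation memory.
print(struct.calcsize('e'), struct.calcsize('f'))  # 2 4

# But the largest finite fp16 value is 65504 ...
print(to_fp16(65504.0))  # 65504.0 -- exactly representable

# ... anything larger overflows, which is why large gradients need loss scaling:
try:
    to_fp16(70000.0)
except (OverflowError, struct.error):
    print("overflow: out of fp16 range")

# Tiny gradients underflow to zero -- the other reason AMP scales the loss:
print(to_fp16(1e-8))  # 0.0
```

The larger batch size that fp16 frees up matters more here than the precision itself, given the batch-size-1 issue below.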
Hi,
I trained a DSVT model on the entire Waymo dataset but with a smaller batch_size = 1 and lr_rate = 0.001. The performance on each category is 2-3% below your benchmark. Is this due to the smaller batch size and learning rate? I attached my log below: log_train_20230515-135943.txt
I checked your log and found that your total batch size = 1, so sync_bn has no effect at all. You should at least make sure the total batch size is > 8.
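To see why a batch size of 1 hurts regardless of sync_bn, here is a minimal pure-Python sketch of batch normalization (the formula BN layers use, without the learned affine): with a single sample, the batch mean equals the sample and the variance is zero, so every activation normalizes to exactly 0 and the layer discards the signal. sync_bn only helps by pooling statistics across multiple GPUs, which a single-GPU run cannot do.

```python
import math

def batch_norm(batch, eps=1e-5):
    """Normalize one feature over a batch: (x - mean) / sqrt(var + eps)."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n  # biased variance, as BN uses
    return [(x - mean) / math.sqrt(var + eps) for x in batch]

# With a reasonable batch, the statistics carry information:
print(batch_norm([0.5, 2.0, -1.0, 3.5]))

# With batch size 1, mean == x and var == 0, so the output is always 0:
print(batch_norm([2.7]))  # [0.0]
```

This is why the replies above push the total batch size above 8 before anything else.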
I see from your log that training took 16 days. If you are limited by resources, you can start with the 20% data, 12-epoch (12e) setting.
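On whether lr = 0.001 fits batch size 1: a common heuristic is the linear scaling rule, where the learning rate is scaled in proportion to the total batch size. A quick sketch (the reference batch size and learning rate below are illustrative placeholders, not DSVT's actual config):

```python
# Linear scaling rule: lr proportional to total batch size.
# Hypothetical reference numbers -- NOT taken from the DSVT configs.
REF_BATCH = 8     # assumed benchmark total batch size (placeholder)
REF_LR = 0.003    # assumed benchmark learning rate (placeholder)

def scaled_lr(batch_size, ref_batch=REF_BATCH, ref_lr=REF_LR):
    """Learning rate rescaled for a different total batch size."""
    return ref_lr * batch_size / ref_batch

print(scaled_lr(1))  # 0.000375
```

Under these placeholder numbers, batch size 1 would call for a learning rate well below 0.001, so the lr/batch mismatch could contribute to the 2-3% gap alongside the batch-norm issue.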