DerrickXuNu / OpenCOOD

[ICRA 2022] An opensource framework for cooperative detection. Official implementation for OPV2V.
https://mobility-lab.seas.ucla.edu/opv2v/

Questions about the shrink_header? #79

Closed by lubin202209 1 year ago

lubin202209 commented 1 year ago

Hello, I have a question about shrink_header. What is its purpose here, given that I can already use compression to compress the features?

DerrickXuNu commented 1 year ago

It is mainly used to shrink the spatial resolution to save GPU computation cost and training time.

lubin202209 commented 1 year ago

Because I don't want to use shrink_header or compression, I set both flags to false. I now want to train the pointpillar_v2xvit model, and I set feature_stride in the postprocess section of the yaml file to 2 [screenshot]. How should I set the parameters in the transformer section of the yaml file so that the tensor shapes match? [screenshot]

DerrickXuNu commented 1 year ago

You can still use the shrink header but change the stride from 2 to 1. It will then only modify the channel count so that the ViT can use the features directly, and it won't have a big impact on performance. If you insist on removing the whole head, you need to adjust the channel number in the ViT.
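As a rough illustration of the shape arithmetic, the sketch below applies the standard convolution output-size formula to a feature map. The 384 and 256 channel counts come from this thread. The 3x3 kernel, the padding of 1, the 96 x 352 map size, and the helper name `shrink_output_shape` are assumptions for illustration, not values taken from the OpenCOOD config.

```python
def conv_out_size(size, kernel, stride, padding):
    """Standard convolution output size: floor((size + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def shrink_output_shape(in_shape, out_channels, kernel=3, stride=1, padding=1):
    """Hypothetical helper: (C, H, W) after one conv layer of a shrink header."""
    _, h, w = in_shape
    return (
        out_channels,
        conv_out_size(h, kernel, stride, padding),
        conv_out_size(w, kernel, stride, padding),
    )


# Assumed feature map; 384 channels as mentioned in this thread.
in_shape = (384, 96, 352)

stride2 = shrink_output_shape(in_shape, out_channels=256, stride=2)
stride1 = shrink_output_shape(in_shape, out_channels=256, stride=1)

print(stride2)  # (256, 48, 176): channels reduced, H and W halved
print(stride1)  # (256, 96, 352): channels reduced, H and W unchanged
```

Under these assumptions, stride 2 leaves a quarter of the spatial cells, which is where the compute saving comes from, while stride 1 only remaps 384 channels to 256.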

lubin202209 commented 1 year ago

Yes, I want to remove the whole head. I changed "dim" in cav_att_config, "dim" in pwindow_att_config, and "mlp_dim" in feed_forward from the original 256 to 384 [screenshots]. However, tensor size mismatch errors are still reported during training. Could you explain in more detail how to adjust the channel number?

DerrickXuNu commented 1 year ago

Your heads * dim_head should be 384 as well.
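One way to catch this kind of mismatch before training is to check the attention settings up front. The sketch below assumes each attention config is a dict with `dim`, `heads`, and `dim_head` keys, as named in this thread. The helper `check_attention_dims` and the example values (8 or 12 heads, 32 channels per head) are hypothetical, not the repository's actual settings.

```python
def check_attention_dims(name, cfg):
    """Hypothetical check: raise if heads * dim_head differs from dim."""
    inner = cfg["heads"] * cfg["dim_head"]
    if inner != cfg["dim"]:
        raise ValueError(
            f"{name}: heads * dim_head = {inner}, but dim = {cfg['dim']}"
        )
    return inner


# dim raised to 384 while heads/dim_head keep assumed old values (8 * 32 = 256).
stale = {"dim": 384, "heads": 8, "dim_head": 32}
# One possible fix: 12 heads of 32 channels each (8 heads of 48 would also work).
fixed = {"dim": 384, "heads": 12, "dim_head": 32}

try:
    check_attention_dims("cav_att_config", stale)
except ValueError as err:
    print(err)  # reports the 256 vs 384 mismatch

print(check_attention_dims("cav_att_config", fixed))  # 384
```

The same check could be run on pwindow_att_config, since both configs were changed to 384 in the comment above.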