AICPS / hydrafusion

Model code for our paper titled "HydraFusion: Context-Aware Selective Sensor Fusion for Robust and Efficient Autonomous Vehicle Perception"
MIT License

How to use functions in gate #2

Closed Journery closed 2 years ago

Journery commented 2 years ago

Could you provide a demo showing how the functions in gate.py are meant to be used? They don't appear to be used in hydranet.py. Thanks.

malawada commented 2 years ago

Hello, thanks for your question. We trained and evaluated the gate model and the main hydranet model separately as described in the paper. If you want to integrate the gate model with the hydranet to run both of them together, you can add the gate as a module in the hydranet. Then, in the forward pass you can concatenate all of the stem model outputs (radar_output, l_camera_output, r_camera_output, lidar_output) along the channel dimension at line 142 in hydranet.py and pass this tensor as the input to the gate model. The output of the gate model can then be assigned to the variable branch_selection in hydranet.py and used from line 143 to select which branches to execute.
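
For concreteness, here is a minimal sketch of that integration (not the authors' code). It assumes the gate takes the concatenated stem feature maps and returns one score per branch; the actual class names and output format in gate.py may differ, and `top_k` is an illustrative parameter.

```python
import torch
import torch.nn as nn

# Sketch only: wiring a gate into the HydraNet forward pass, assuming the
# stem outputs mentioned above (radar_output, l_camera_output,
# r_camera_output, lidar_output) are available around line 142 of hydranet.py.
class GateIntegrationSketch(nn.Module):
    def __init__(self, gate: nn.Module, top_k: int = 3):
        super().__init__()
        self.gate = gate        # e.g. an instance of a gate class from gate.py
        self.top_k = top_k      # number of branches to keep active

    def select_branches(self, radar_output, l_camera_output,
                        r_camera_output, lidar_output):
        # Concatenate all stem feature maps along the channel dimension.
        fused = torch.cat(
            [radar_output, l_camera_output, r_camera_output, lidar_output],
            dim=1,
        )
        # The gate scores each branch; keep the top-k highest-scoring ones.
        branch_scores = self.gate(fused)              # shape: [B, num_branches]
        branch_selection = branch_scores.topk(self.top_k, dim=1).indices
        return branch_selection                       # used from line 143 onward
```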

rashrosha commented 2 years ago

Could you please show me where we can pass the dataset to the model? Thanks.

malawada commented 2 years ago

Hello, we used the RADIATE SDK (https://github.com/marcelsheeny/radiate_sdk) to load the dataset and did some preprocessing/reshaping to align the different sensor inputs to the ResNet-18 input shape. Unfortunately, we cannot provide our full code at this time due to internal obligations, but the model can be trained with the dataset using typical PyTorch conventions.
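
As an illustration only (this is not the released pipeline), a dataset wrapper along these lines could feed the model. It assumes you have already extracted per-sensor tensors from the RADIATE SDK into a list of samples; the field names and the 224x224 target size are assumptions, not the authors' exact preprocessing.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import Dataset

# Sketch: wrap pre-extracted RADIATE frames and resize each sensor tensor
# to the ResNet-18 input resolution before feeding the stems.
class RadiateFrames(Dataset):
    def __init__(self, samples, size=(224, 224)):
        self.samples = samples      # list of dicts with per-sensor tensors + targets
        self.size = size

    def __len__(self):
        return len(self.samples)

    def _resize(self, x):
        # x: [C, H, W] -> [C, 224, 224] to match the ResNet-18 stem input
        return F.interpolate(x.unsqueeze(0), size=self.size,
                             mode="bilinear", align_corners=False).squeeze(0)

    def __getitem__(self, idx):
        s = self.samples[idx]
        return (self._resize(s["radar"]),
                self._resize(s["camera_left"]),
                self._resize(s["camera_right"]),
                self._resize(s["lidar_bev"]),
                s["targets"])
```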

Vincent-ch99 commented 2 years ago

Hello, I added the gate module to my main network as a branch selector and used its output to decide which branch networks to execute. The forward pass works fine and the data flows through normally, but when I compute the loss and run backpropagation, the gate receives no corresponding loss, so its gradients are never updated and its weights do not change. I get the message "No inf checks were recorded for this optimizer."

Have you encountered a similar problem? If it is convenient, could you explain how to have the gate module take the stem outputs and guide the branch selection while still being able to update its weights? It seems the gate can only compute a loss from the final detection results and then backpropagate through the whole network. Could you describe how the loss and gradient updates are designed? I would be very grateful for an answer.

malawada commented 2 years ago

As mentioned in our paper, we trained the gate model independently from the main HydraFusion model. We first trained the HydraFusion model with all branches enabled on the dataset. Once the branches and stems were fully trained, we passed the dataset through the model again and collected the output of each stem as the input and the loss of each branch as the target. These inputs and targets were then used to train the gate with supervised learning.

This is what we mean when we say the gate is trained independently. Then, to perform evaluation, we combine this trained gate back with the original model to perform inference.
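
A rough sketch of that offline gate-training loop might look like the following (again, not the authors' code). It assumes `loader` yields batches of stem features and the per-branch losses recorded while running the frozen HydraFusion model, and that the gate regresses its per-branch scores onto those recorded losses; MSE is one reasonable choice of objective, not necessarily the one used in the paper.

```python
import torch
import torch.nn as nn

def train_gate(gate: nn.Module, loader, epochs: int = 10, lr: float = 1e-4):
    # `loader` yields (stem_features, branch_losses) batches collected offline
    # by running the dataset through the frozen, fully trained HydraFusion model.
    opt = torch.optim.Adam(gate.parameters(), lr=lr)
    criterion = nn.MSELoss()        # regress predicted branch scores onto recorded losses
    gate.train()
    for _ in range(epochs):
        for stem_features, branch_losses in loader:
            opt.zero_grad()
            pred = gate(stem_features)          # predicted per-branch loss/score
            loss = criterion(pred, branch_losses)
            loss.backward()
            opt.step()
    return gate
```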

We didn't try to train the gate at the same time as the rest of the hydrafusion model due to the complexities involved with training a dynamic architecture model. This is probably the source of your issues. Unfortunately, we can't provide much advice on how to address issues related to simultaneous training, but if you train the gate separately as we describe then you shouldn't have issues. Hope this helps!