Although I know it is late to open this PR, I still think it is important to point out the defects in the current system.
The learned policy is re-loaded every time the obstacle callback function is called. I am not sure why it was implemented this way, but deploying the trained model on every callback takes a lot of time, so I refactored the code to load the policy only once.
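A minimal sketch of the refactor, with the policy loaded once in the constructor rather than inside the callback. All names here (`ObstacleAvoider`, `_load_policy`, `obstacle_callback`) are illustrative placeholders, not the project's actual API, and the dummy policy stands in for the real model deserialization (e.g. a `torch.load` call):

```python
class ObstacleAvoider:
    """Sketch of a node that loads its learned policy exactly once."""

    def __init__(self, policy_path):
        self.load_count = 0
        # Load the trained policy once at startup, not on every callback.
        self.policy = self._load_policy(policy_path)

    def _load_policy(self, path):
        # Stand-in for the expensive model deployment step.
        self.load_count += 1
        return lambda obs: [0.0] * 4  # dummy policy: zero control commands

    def obstacle_callback(self, obs):
        # The callback now only runs inference; no reload happens here.
        return self.policy(obs)


avoider = ObstacleAvoider("model.pt")
for _ in range(100):
    cmd = avoider.obstacle_callback([1.0, 2.0, 3.0])
print(avoider.load_count)  # the policy was loaded exactly once
```

Even after 100 callback invocations, the expensive load runs a single time, which is the whole point of the refactor.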
I also increased the hover wait time after takeoff, which I believe is related to the same model-deployment delay: with a shorter wait, the drone would not start moving.