FP16 SSD implementation on TX1

weiliu89 / caffe

Caffe: a fast open framework for deep learning.

http://caffe.berkeleyvision.org/

FP16 SSD implementation on TX1 #206

Open AlexandreBriot opened 8 years ago

AlexandreBriot commented 8 years ago

I want to compare inference time on the TX1 between fp16 and fp32. Many layers used in SSD are not implemented in the nvcaffe experimental/fp16 branch. Has anyone managed to run the SSD forward pass on the TX1 with fp16?
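For the timing comparison itself, a minimal sketch of a measurement harness could look like the following. It assumes the forward pass is exposed as a blocking Python callable (for example a pycaffe `net.forward` bound to an already loaded network); `time_forward`, `warmup`, and `iterations` are illustrative names, not part of Caffe.

```python
import statistics
import time


def time_forward(forward, warmup=5, iterations=50):
    """Return (mean_ms, stdev_ms) over `iterations` calls of `forward()`.

    `forward` is assumed to block until the result is available;
    otherwise the measured time only covers the launch of the work.
    """
    # Warm-up calls are excluded so one-time setup costs do not skew the mean.
    for _ in range(warmup):
        forward()

    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        forward()
        samples.append((time.perf_counter() - start) * 1000.0)

    # statistics.stdev needs at least two samples.
    stdev_ms = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return statistics.mean(samples), stdev_ms


if __name__ == "__main__":
    # Stand-in workload (hypothetical); replace with the real forward call.
    def fake_forward():
        sum(i * i for i in range(10000))

    mean_ms, stdev_ms = time_forward(fake_forward)
    print("forward: %.3f ms +/- %.3f ms" % (mean_ms, stdev_ms))
```

Running the same harness once against an fp32 build and once against an fp16 build, with the same input and iteration count, would give two directly comparable numbers.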

aurotripathy commented 8 years ago

I am also interested in the forward mode. Can you highlight which SSD layers are not implemented in the nvcaffe experimental/fp16 branch?

AlexandreBriot commented 8 years ago

Many layers are not supported by nvcaffe fp16. Some of them, like dilated convolution, are supported by official Caffe; others, like prior_box_layer, are specific to the weiliu89 SSD branch.
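One way to get the concrete list for a given model is to collect the layer types from its prototxt and diff them against the types a branch implements. The sketch below is only an illustration: `SUPPORTED_TYPES` is a hypothetical placeholder, not the actual layer list of the nvcaffe fp16 branch, and the parser assumes the usual one-field-per-line prototxt layout with `layer { ... }` blocks.

```python
import re

# Hypothetical placeholder: fill in with the layer types the target
# branch really implements. This is NOT the nvcaffe fp16 layer list.
SUPPORTED_TYPES = {"Convolution", "ReLU", "Pooling", "Concat", "Softmax"}

LAYER_OPEN_RE = re.compile(r"layer\s*\{")
TYPE_RE = re.compile(r'type\s*:\s*"([^"]+)"')


def layer_types(prototxt_text):
    """Return the set of layer types declared directly inside `layer {}` blocks.

    Nested `type:` fields (e.g. weight_filler { type: "xavier" }) are
    skipped by tracking brace depth.
    """
    types = set()
    depth = 0
    in_layer = False
    for raw in prototxt_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if depth == 0 and LAYER_OPEN_RE.match(line):
            in_layer = True
        if in_layer and depth == 1:
            match = TYPE_RE.match(line)
            if match:
                types.add(match.group(1))
        depth += line.count("{") - line.count("}")
        if depth == 0:
            in_layer = False
    return types


def unsupported_types(prototxt_text, supported=SUPPORTED_TYPES):
    """Return the sorted layer types that are not in `supported`."""
    return sorted(layer_types(prototxt_text) - set(supported))
```

Calling `unsupported_types(open("deploy.prototxt").read())` on an SSD deploy file (the file name is an assumption) would list the layer types that still need an fp16 implementation under whatever allow-list is supplied.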

aurotripathy commented 8 years ago

@AlexandreBriot, Thanks

wangsssky commented 8 years ago

Is there any documentation on how to convert the official Caffe files to an fp16 version? Thanks.

AlexandreBriot commented 8 years ago

As far as I know, fp16 is not implemented in official BVLC Caffe. There is an experimental fp16 version of nvcaffe at https://github.com/NVIDIA/caffe/tree/experimental/fp16, but not all layers are currently supported, so it will not work with SSD, for instance.

loulansuiye commented 7 years ago

@AlexandreBriot @maybepossible @aurotripathy Have you implemented fp16 with full functionality?