Why does training use 2500+ MB when the model is only 5 MB?

forresti / SqueezeNet

SqueezeNet: AlexNet-level accuracy with 50x fewer parameters

BSD 2-Clause "Simplified" License

Closed: aceimnorstuvwxz closed this issue 7 years ago

aceimnorstuvwxz commented 7 years ago

Is most of that the training image data? Thank you!

aceimnorstuvwxz commented 7 years ago

I mean GPU memory.

austingg commented 7 years ago

The saved model only stores the parameters. During training, most of the memory goes to the intermediate feature maps (i.e. the convolution outputs), which have to be kept for every image in the batch.
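To make the gap concrete, here is a rough back-of-the-envelope sketch. The layer list, input size, and batch size are made-up illustrative values, not SqueezeNet's actual architecture or the training settings from this repo, and the helper names (`conv_out`, `estimate_bytes`) are hypothetical. It only counts float32 weights and forward activations of plain convolution layers; a real framework also holds gradients, solver state, and workspace buffers, so actual usage is typically higher.

```python
def conv_out(size, kernel, stride, pad=0):
    """Spatial output size of a square convolution."""
    return (size + 2 * pad - kernel) // stride + 1


def estimate_bytes(layers, in_ch, in_size, batch, bytes_per_value=4):
    """Return (parameter_bytes, activation_bytes) for a stack of conv layers.

    layers: list of (out_channels, kernel, stride, pad) tuples (hypothetical).
    """
    params = 0
    activations = 0
    ch, size = in_ch, in_size
    for out_ch, kernel, stride, pad in layers:
        # Weights plus one bias per output channel; independent of batch size.
        params += out_ch * ch * kernel * kernel + out_ch
        size = conv_out(size, kernel, stride, pad)
        # One feature map per image in the batch, kept for the backward pass.
        activations += batch * out_ch * size * size
        ch = out_ch
    return params * bytes_per_value, activations * bytes_per_value


# Assumed toy network: three conv layers on 3x227x227 inputs, batch of 32.
toy_layers = [(64, 3, 2, 0), (128, 3, 1, 1), (256, 3, 1, 1)]
param_bytes, act_bytes = estimate_bytes(toy_layers, in_ch=3, in_size=227, batch=32)

print(f"parameters:  {param_bytes / 2**20:.1f} MiB")
print(f"activations: {act_bytes / 2**20:.1f} MiB")
```

Even in this toy setup the activations are hundreds of times larger than the weights, and they scale linearly with batch size, so a smaller batch is usually the quickest way to cut GPU memory.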