ucbrise / actnn

ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training
MIT License

what is pipeline_threshold used for? #8

Closed Jack47 closed 3 years ago

Jack47 commented 3 years ago

I want to figure out what pipeline and pipeline_threshold mean in actnn. I didn't find examples in the tests or the readme, so could you give some examples or explain it a bit? Thanks.

PS: I'm currently reading the actnn source code and have learned a lot from it. My Chinese notes are here.

merrymercy commented 3 years ago

For large tensors, we compute the forward pass and quantize the activations one micro-batch at a time, which is why we call this "pipeline". This reduces the temporary workspace memory and reduces memory fragmentation.

If the tensor size is larger than pipeline_threshold, we will apply this optimization.
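
To make the idea concrete, here is a minimal pure-Python sketch of threshold-gated micro-batching. It is not ActNN's actual code: the function names, the `micro_batch_size` parameter, the ReLU stand-in for the forward pass, the per-row min-max quantizer, and measuring the threshold in number of elements are all assumptions made for illustration.

```python
def forward(rows):
    # Stand-in for a layer's forward computation (here: ReLU). Hypothetical.
    return [[max(x, 0.0) for x in row] for row in rows]


def quantize_2bit(rows):
    # Per-row min-max quantization to integer codes 0..3 (2 bits per value).
    out = []
    for row in rows:
        lo, hi = min(row), max(row)
        scale = (hi - lo) / 3 or 1.0  # avoid a zero scale for constant rows
        codes = [round((x - lo) / scale) for x in row]
        out.append((codes, lo, scale))
    return out


def forward_and_quantize(batch, pipeline_threshold, micro_batch_size):
    # Assumption: the threshold is compared against the number of elements.
    numel = sum(len(row) for row in batch)
    if numel <= pipeline_threshold:
        chunks = [batch]  # small tensor: process everything at once
    else:
        # Large tensor: split along the batch dimension into micro-batches.
        chunks = [batch[i:i + micro_batch_size]
                  for i in range(0, len(batch), micro_batch_size)]

    saved = []          # compressed activations kept for the backward pass
    peak_workspace = 0  # largest full-precision temporary alive at one time
    for chunk in chunks:
        act = forward(chunk)  # full-precision temporary for this chunk only
        peak_workspace = max(peak_workspace, sum(len(r) for r in act))
        saved.extend(quantize_2bit(act))
    return saved, peak_workspace


# Usage: 8 rows of 4 values each, i.e. 32 elements in total.
batch = [[float(i * j % 7 - 3) for j in range(4)] for i in range(8)]
whole, peak_whole = forward_and_quantize(batch, 100, 2)  # below threshold
piped, peak_piped = forward_and_quantize(batch, 16, 2)   # above threshold
```

In this sketch both calls produce the same compressed result because the quantizer works row by row, but the pipelined call only ever holds one micro-batch of full-precision values at a time (8 elements instead of 32).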

Jack47 commented 3 years ago

Sounds interesting. Thanks for the clear explanation, it helps a lot.