Open jduki opened 4 years ago
As I read the paper, the efficient d0 size should be 3.8MB. Why does the d0 checkpoint weight included in this repo have a size of 40MB?
I guess 3.8M, which is 3.9M in the paper, is the number of parameters (see Table 2, #Params), not the exact storage space on disk.
Then, shouldn't it be around 16MB since the weights are in float32?
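The ~16MB figure follows directly from the parameter count. A quick back-of-the-envelope check, assuming the ~3.9M parameters from Table 2 are each stored as a 4-byte float32:

```python
# Rough checkpoint-size estimate from the parameter count alone
# (ignores key names and serialization overhead).
num_params = 3_900_000      # ~3.9M params, per Table 2 of the paper
bytes_per_float32 = 4
size_mb = num_params * bytes_per_float32 / 1e6  # decimal megabytes
print(f"{size_mb:.1f} MB")  # 15.6 MB, i.e. roughly 16MB
```

So weights alone should be about 15.6MB, well short of the 40MB checkpoint.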
Yes, you're right. The weight file is an OrderedDict containing the parameter names (the OrderedDict keys) and their values, both weights and biases, which add up to around 3.9M parameters.
It is 40MB because the checkpoint also stores the optimizer state from training in the same file. Save only the model's state dict to disk and you will get the expected weight size.
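A minimal PyTorch sketch of the point above, using a stand-in linear model rather than the repo's actual EfficientDet-D0 code: with Adam, the optimizer keeps two extra float32 buffers per parameter, so a checkpoint bundling model plus optimizer state is roughly 3x the size of the weights alone.

```python
import io
import torch

model = torch.nn.Linear(1000, 1000)          # ~1M params as a stand-in model
optimizer = torch.optim.Adam(model.parameters())

# One training step so Adam allocates its per-parameter
# momentum (exp_avg) and variance (exp_avg_sq) buffers.
loss = model(torch.randn(8, 1000)).sum()
loss.backward()
optimizer.step()

def saved_size(obj):
    """Serialize with torch.save into memory and return the byte count."""
    buf = io.BytesIO()
    torch.save(obj, buf)
    return buf.tell()

weights_only = saved_size(model.state_dict())
full_ckpt = saved_size({"model": model.state_dict(),
                        "optimizer": optimizer.state_dict()})
print(weights_only, full_ckpt)  # full checkpoint is roughly 3x larger with Adam
```

Saving `model.state_dict()` by itself gives the weights-only file; bundling `optimizer.state_dict()` alongside it is what inflates checkpoints like the 40MB one here.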
3.8M is clearly the number of parameters.