Zeitzmz / nips-micronet

MIT License
Official Review #1

Open micronet-challenge-submissions opened 5 years ago

micronet-challenge-submissions commented 5 years ago

Hello! Thanks so much for your entry!

I'm having some difficulty building your entry. What is the exact environment you're using?

Trevor

micronet-challenge-submissions commented 5 years ago

We've spent a good amount of time trying to build your code in a couple of different environments without success. Would it be possible for you to provide a Dockerfile that we can use?

hujie-frank commented 5 years ago

@micronet-challenge-submissions

Sorry for the inconvenience. We have updated the Dockerfile and added more detail to the README.

Best regards, Jie

micronet-challenge-submissions commented 5 years ago

Thanks for the updates! I've been able to successfully build your entry and evaluated it on a V100 but I got a top-1 accuracy of 74.98%. What GPU are you evaluating on?

Also, I have the following questions about your scoring:

  1. For your "HardSwish" operation, you're counting it as 3 operations in the precision of the input data, but it should be 5:

https://github.com/Zeitzmz/micronet-caffe/blob/70ef2c8e2fa8b69a61ced21828a7afd374ca1c62/tools/scoring.cpp#L206

     y0 = x + 3       (1 operation)
     y1 = max(y0, 0)  (1 operation)
     y2 = min(y1, 6)  (1 operation)
     y3 = x * y2      (1 operation)
     y4 = y3 / 6      (1 operation)

  2. For average pooling, you're inserting a "fake quantization" layer before it, but is the layer's Dtype set to half? If it is set to float, then these operations should be counted as FP32, since the repeated additions in the kernel are all FP32 additions.

  3. For Sigmoid, ReLU, and HardSwish, it doesn't look like you're rounding the inputs, but you are counting these operations as being performed in reduced precision. Unless I'm missing something, these should be counted as full 32-bit operations.

  4. Could you point me to where you set the accumulation type to 16-bit for the convolutional layers? I see you are setting engine=CAFFE, but I'd like to know exactly what kernel is being executed. If you could also point me at the pooling kernel that's being run, that would be helpful.
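The five-step HardSwish breakdown above can be checked with a small sketch. This is only an illustration in Python, not the code in scoring.cpp; the function name `hard_swish_with_op_count` is hypothetical.

```python
def hard_swish_with_op_count(x):
    """Evaluate HardSwish(x) = x * min(max(x + 3, 0), 6) / 6.

    Returns the result together with the number of primitive
    operations performed, one per intermediate value.
    """
    ops = 0
    y0 = x + 3.0        # addition
    ops += 1
    y1 = max(y0, 0.0)   # lower clamp
    ops += 1
    y2 = min(y1, 6.0)   # upper clamp
    ops += 1
    y3 = x * y2         # multiplication
    ops += 1
    y4 = y3 / 6.0       # division
    ops += 1
    return y4, ops


value, ops = hard_swish_with_op_count(1.0)
print(value, ops)
```

Each intermediate value costs one operation, which is where the count of 5 per element comes from.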

Thanks! Trevor

hujie-frank commented 5 years ago

@micronet-challenge-submissions

We evaluated our model on a GeForce 1080 and a TitanV inside the Docker container, and both achieve 75.0762% top-1 accuracy.

Thank you for your careful examination. I'm very sorry that there were several mistakes in the scoring.

  1. For the "HardSwish" operation, we miscounted the number of operations and accept that 5 operations is correct.

  2. For average pooling, it should indeed be counted as full 32-bit operations.

  3. 1) ReLU is linear for inputs >= 0, so it can be fused with the preceding linear operation (e.g. convolution). Applying symmetric uniform quantization before it or after it gives the same feature responses in practice, so it is reasonable to count ReLU at reduced precision, as we discussed in the forum. 2) Sigmoid and HardSwish are not linear functions, so quantizing uniformly before them gives a different result from quantizing after them. In our setting, these should indeed be counted as 32-bit operations.

  4. We implement 16-bit convolution in base_conv_layer with the 'fp16setup' flag, based on the original im2col + GEMM:
     1) Lines 278-307: allocate memory for the temporary buffer in __half type.
     2) Lines 397-405: divide the input fp32 activations (after fake quantization) by the corresponding quantization step to obtain an integer representation (real quantization), stored in an fp16 vector.
     3) Lines 23-33: apply real quantization (integer weights) to the convolutional weights if fp16_accumulation is set.
     4) Lines 408-412: execute the GEMM in fp16 mode with cublasGemmEx, whose input type and compute type are specified as CUDA_R_16F in the math function.
     5) Lines 421-435: scale the GEMM output by the corresponding input quantization step and weight quantization step to recover the fake-quantization state.
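The ReLU versus Sigmoid argument in point 3 can be illustrated with a minimal Python sketch. The `quantize` helper, the step of 0.25, and the 8-bit width are assumptions chosen for the example; this is not the fake-quantization layer from the repository.

```python
import math


def quantize(x, step, bits=8):
    """Symmetric uniform fake quantization (hypothetical helper):
    round to the nearest grid point, clamp, and scale back."""
    qmax = 2 ** (bits - 1) - 1
    q = max(-qmax, min(qmax, round(x / step)))
    return q * step


def relu(x):
    return max(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


step = 0.25                                   # illustrative step size
xs = [i / 10.0 - 2.0 for i in range(41)]      # samples in [-2, 2]

# ReLU: quantizing before or after gives identical outputs on every sample.
relu_commutes = all(relu(quantize(x, step)) == quantize(relu(x), step) for x in xs)

# Sigmoid: the two orders disagree, e.g. at x = 0.3.
sigmoid_before = sigmoid(quantize(0.3, step))
sigmoid_after = quantize(sigmoid(0.3), step)

print(relu_commutes, sigmoid_before, sigmoid_after)
```

ReLU either passes a grid point through unchanged or maps it to zero, which is itself a grid point, so the order of the two steps does not matter. Sigmoid maps grid points to values off the grid, so the order does matter.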

Finally, we have corrected these mistakes in the scoring script, and the self-reported score is now 0.188706.

Best regards, Jie

micronet-challenge-submissions commented 5 years ago

Thanks for the fixes! I evaluated your model on a P100 and got 75.0762% as well.

Everything about your entry checks out now! Would you mind updating your first submission as well? Thanks!

Trevor

micronet-challenge-submissions commented 5 years ago

It looks like it may work as is. I checked out the submission-1 branch, left the submodule the same, and found a score of 0.205348. Could you please verify that this is correct? What are the differences between these two entries?

Thanks, Trevor

micronet-challenge-submissions commented 5 years ago

However, when I try to evaluate the older branch with the newer caffe version I see top-1 accuracy of 0% for the first 20 steps, so it does appear that something is no longer supported in the code.

micronet-challenge-submissions commented 5 years ago

Ok, I updated the submodule for eval and was able to reproduce your reported accuracy. If you can confirm that this score matches what you see with your updated scoring script, we should be all set verifying your entry! Thanks!

Trevor

hujie-frank commented 5 years ago

In the submission-1 branch, both weights and activations are quantized to 8 bits, and the implementation of fp16_accumulation is not reasonable: we simply take the fake-quantized values (floating-point values that have been scaled by the quantization step and are significantly smaller than the real integer values) and then run the FP16 GEMM on them. Under this condition, all the arithmetic operations in the network should be regarded as 16-bit, but the storage of weights is still counted as 8-bit.

In the main branch, activations are represented in 8 bits and weights in 6 bits. In addition, we corrected the fp16_accumulation implementation to use the real quantization results (integer values as GEMM input) when performing the GEMM in FP16 mode.
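The difference between the two ways of feeding the FP16 GEMM can be sketched in Python. This is a rough emulation under stated assumptions, not the cublasGemmEx path used in the repository: FP16 rounding is emulated with the standard `struct` half-precision format, and the quantization steps and integer values are made up for illustration.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def fp16_dot(a, b):
    """Dot product with inputs, products, and partial sums rounded to FP16
    (a simplified emulation of FP16 accumulation)."""
    acc = 0.0
    for u, v in zip(a, b):
        prod = to_fp16(to_fp16(float(u)) * to_fp16(float(v)))
        acc = to_fp16(acc + prod)
    return acc


act_step, w_step = 0.02, 0.005    # hypothetical quantization steps
act_q = [12, 100, 37, 5]          # hypothetical 8-bit activation integers
w_q = [-3, 7, 15, -20]            # hypothetical 6-bit weight integers

# Real quantization: integers go into the GEMM, one rescale at the end.
real = fp16_dot(act_q, w_q) * (act_step * w_step)

# Fake quantization: already-scaled floating-point values go into the GEMM.
fake = fp16_dot([q * act_step for q in act_q], [q * w_step for q in w_q])

# Reference computed without any FP16 rounding.
exact = sum(a * w for a, w in zip(act_q, w_q)) * (act_step * w_step)

print(real, fake, exact)
```

In this tiny example every integer product and partial sum stays below 2048 in magnitude, so each is exactly representable in FP16 and the real-quantization path matches the reference. Longer accumulations can exceed that range, so this does not show that the integer path is exact in general.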

Jie

micronet-challenge-submissions commented 5 years ago

Ah I see, so is the calculated score for the submission-1 branch correct?

Trevor

hujie-frank commented 5 years ago

Yes, it is correct!

Jie

micronet-challenge-submissions commented 5 years ago

Ok great, thanks!

micronet-challenge-submissions commented 4 years ago

Hello,

Could you share the separate FLOP & parameter counts for your two entries? Unfortunately, we have already torn down our environment for evaluating your entry and did not record them separately.

Thanks, Trevor

micronet-challenge-submissions commented 4 years ago

Ping. If you could share this information with us it would be greatly appreciated:)

Trevor

hujie-frank commented 4 years ago

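The separate FLOP & parameter counts are logged at the end of the score.log https://github.com/Zeitzmz/nips-micronet/blob/master/submission/log/score.log and also recorded on the last page of the submission.pdf https://github.com/Zeitzmz/nips-micronet/blob/master/submission.pdf.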

micronet-challenge-submissions commented 4 years ago

Ah excellent, thanks! Would you mind updating the score.log https://github.com/Zeitzmz/nips-micronet/blob/submission-1/submission/log/score.log for submission_1 as well?

Trevor

micronet-challenge-submissions commented 4 years ago

Ping. Let us know if you're not able to do this and we can set up our environment again. Thanks!

Trevor

micronet-challenge-submissions commented 4 years ago

We were able to retrieve the log from an old VM. Thanks for your help!

Trevor