snwagh / falcon-public

Implementation of protocols in Falcon

Problems of AlexNet on ImageNet. #16

Closed kaih70 closed 3 years ago

kaih70 commented 3 years ago

I ran AlexNet on ImageNet in the LAN setting. Here is what I got:

P0:

./Falcon.out 0 files/IP_LAN files/keyA files/keyAB files/keyAC AlexNet ImageNet Semi-honest
Loading data done.....
Forward     0 completed...
...
Forward     18 completed...
Delta last layer completed.
Delta       18 completed...
...
Delta       1 completed...
Update Eq.  18 completed...
...
Update Eq.  1 completed...
First layer update Eq. completed.
----------------------------------------------
Wall Clock time for AlexNet train: 7155.25 sec
CPU time for AlexNet train: 7042.8 sec
----------------------------------------------
----------------------------------------------
Total communication: 1777.36MB (sent) and 1777.36MB (recv)
Total calls: 2495 (sends) and 2495 (recvs)
----------------------------------------------
----------------------------------------------
Communication, AlexNet train, P0: 3171.3MB (sent) 3171.3MB (recv)
Rounds, AlexNet train, P0: 808(sends) 808(recvs)
----------------------------------------------
----------------------------------------------
Run details: 3PC (P0), 1 iterations, batch size 128
Running Semi-honest AlexNet train on ImageNet dataset
----------------------------------------------

----------------------------------------------
(1) CNN Layer         56 x 56 x 3
              7 x 7     (Filter Size)
              1 , 3     (Stride, padding)
              128       (Batch Size)
              56 x 56 x 64  (Output)
----------------------------------------------
(2) CNN Layer         56 x 56 x 64
              5 x 5     (Filter Size)
              1 , 2     (Stride, padding)
              128       (Batch Size)
              56 x 56 x 64  (Output)
----------------------------------------------
(3) Maxpool Layer     56 x 56 x 64
              2         (Pooling Size)
              2         (Stride)
              128       (Batch Size)
----------------------------------------------
(4) ReLU Layer        50176 x 128
----------------------------------------------
(5) BN Layer          50176 x 128
----------------------------------------------
(6) CNN Layer         28 x 28 x 64
              5 x 5     (Filter Size)
              1 , 2     (Stride, padding)
              128       (Batch Size)
              28 x 28 x 128     (Output)
----------------------------------------------
(7) Maxpool Layer     28 x 28 x 128
              2         (Pooling Size)
              2         (Stride)
              128       (Batch Size)
----------------------------------------------
(8) ReLU Layer        25088 x 128
----------------------------------------------
(9) BN Layer          25088 x 128
----------------------------------------------
(10) CNN Layer        14 x 14 x 128
              3 x 3     (Filter Size)
              1 , 1     (Stride, padding)
              128       (Batch Size)
              14 x 14 x 256     (Output)
----------------------------------------------
(11) CNN Layer        14 x 14 x 256
              3 x 3     (Filter Size)
              1 , 1     (Stride, padding)
              128       (Batch Size)
              14 x 14 x 256     (Output)
----------------------------------------------
(12) Maxpool Layer    14 x 14 x 256
              2         (Pooling Size)
              2         (Stride)
              128       (Batch Size)
----------------------------------------------
(13) ReLU Layer       12544 x 128
----------------------------------------------
(14) FC Layer         12544 x 1024
              128        (Batch Size)
----------------------------------------------
(15) ReLU Layer       1024 x 128
----------------------------------------------
(16) FC Layer         1024 x 1024
              128        (Batch Size)
----------------------------------------------
(17) ReLU Layer       1024 x 128
----------------------------------------------
(18) FC Layer         1024 x 200
              128        (Batch Size)
----------------------------------------------
(19) ReLU Layer       200 x 128
----------------------------------------------

P1:

./Falcon.out 1 files/IP_LAN files/keyB files/keyBC files/keyAB AlexNet ImageNet Semi-honest
Loading data done.....
Forward     0 completed...
...
Forward     18 completed...
Delta last layer completed.
Delta       18 completed...
...
Delta       1 completed...
Update Eq.  18 completed...
...
Update Eq.  1 completed...
First layer update Eq. completed.
----------------------------------------------
Wall Clock time for AlexNet train: 7155.26 sec
CPU time for AlexNet train: 6951.5 sec
----------------------------------------------
----------------------------------------------
Communication, AlexNet train, P1: 3171.3MB (sent) 3171.3MB (recv)
Rounds, AlexNet train, P1: 808(sends) 808(recvs)
----------------------------------------------
----------------------------------------------
Run details: 3PC (P1), 1 iterations, batch size 128
Running Semi-honest AlexNet train on ImageNet dataset
----------------------------------------------

----------------------------------------------
(1) CNN Layer         56 x 56 x 3
              7 x 7     (Filter Size)
              1 , 3     (Stride, padding)
              128       (Batch Size)
              56 x 56 x 64  (Output)
----------------------------------------------
(2) CNN Layer         56 x 56 x 64
              5 x 5     (Filter Size)
              1 , 2     (Stride, padding)
              128       (Batch Size)
              56 x 56 x 64  (Output)
----------------------------------------------
(3) Maxpool Layer     56 x 56 x 64
              2         (Pooling Size)
              2         (Stride)
              128       (Batch Size)
----------------------------------------------
(4) ReLU Layer        50176 x 128
----------------------------------------------
(5) BN Layer          50176 x 128
----------------------------------------------
(6) CNN Layer         28 x 28 x 64
              5 x 5     (Filter Size)
              1 , 2     (Stride, padding)
              128       (Batch Size)
              28 x 28 x 128     (Output)
----------------------------------------------
(7) Maxpool Layer     28 x 28 x 128
              2         (Pooling Size)
              2         (Stride)
              128       (Batch Size)
----------------------------------------------
(8) ReLU Layer        25088 x 128
----------------------------------------------
(9) BN Layer          25088 x 128
----------------------------------------------
(10) CNN Layer        14 x 14 x 128
              3 x 3     (Filter Size)
              1 , 1     (Stride, padding)
              128       (Batch Size)
              14 x 14 x 256     (Output)
----------------------------------------------
(11) CNN Layer        14 x 14 x 256
              3 x 3     (Filter Size)
              1 , 1     (Stride, padding)
              128       (Batch Size)
              14 x 14 x 256     (Output)
----------------------------------------------
(12) Maxpool Layer    14 x 14 x 256
              2         (Pooling Size)
              2         (Stride)
              128       (Batch Size)
----------------------------------------------
(13) ReLU Layer       12544 x 128
----------------------------------------------
(14) FC Layer         12544 x 1024
              128        (Batch Size)
----------------------------------------------
(15) ReLU Layer       1024 x 128
----------------------------------------------
(16) FC Layer         1024 x 1024
              128        (Batch Size)
----------------------------------------------
(17) ReLU Layer       1024 x 128
----------------------------------------------
(18) FC Layer         1024 x 200
              128        (Batch Size)
----------------------------------------------
(19) ReLU Layer       200 x 128
----------------------------------------------

P2:

./Falcon.out 2 files/IP_LAN files/keyC files/keyAC files/keyBC AlexNet ImageNet Semi-honest
Loading data done.....
Forward     0 completed...
...
Forward     18 completed...
Delta last layer completed.
Delta       18 completed...
...
Delta       1 completed...
Update Eq.  18 completed...
...
Update Eq.  1 completed...
First layer update Eq. completed.
----------------------------------------------
Wall Clock time for AlexNet train: 7155.25 sec
CPU time for AlexNet train: 6733.12 sec
----------------------------------------------
----------------------------------------------
Communication, AlexNet train, P2: 4024.69MB (sent) 4024.69MB (recv)
Rounds, AlexNet train, P2: 879(sends) 879(recvs)
----------------------------------------------
----------------------------------------------
Run details: 3PC (P2), 1 iterations, batch size 128
Running Semi-honest AlexNet train on ImageNet dataset
----------------------------------------------

----------------------------------------------
(1) CNN Layer         56 x 56 x 3
              7 x 7     (Filter Size)
              1 , 3     (Stride, padding)
              128       (Batch Size)
              56 x 56 x 64  (Output)
----------------------------------------------
(2) CNN Layer         56 x 56 x 64
              5 x 5     (Filter Size)
              1 , 2     (Stride, padding)
              128       (Batch Size)
              56 x 56 x 64  (Output)
----------------------------------------------
(3) Maxpool Layer     56 x 56 x 64
              2         (Pooling Size)
              2         (Stride)
              128       (Batch Size)
----------------------------------------------
(4) ReLU Layer        50176 x 128
----------------------------------------------
(5) BN Layer          50176 x 128
----------------------------------------------
(6) CNN Layer         28 x 28 x 64
              5 x 5     (Filter Size)
              1 , 2     (Stride, padding)
              128       (Batch Size)
              28 x 28 x 128     (Output)
----------------------------------------------
(7) Maxpool Layer     28 x 28 x 128
              2         (Pooling Size)
              2         (Stride)
              128       (Batch Size)
----------------------------------------------
(8) ReLU Layer        25088 x 128
----------------------------------------------
(9) BN Layer          25088 x 128
----------------------------------------------
(10) CNN Layer        14 x 14 x 128
              3 x 3     (Filter Size)
              1 , 1     (Stride, padding)
              128       (Batch Size)
              14 x 14 x 256     (Output)
----------------------------------------------
(11) CNN Layer        14 x 14 x 256
              3 x 3     (Filter Size)
              1 , 1     (Stride, padding)
              128       (Batch Size)
              14 x 14 x 256     (Output)
----------------------------------------------
(12) Maxpool Layer    14 x 14 x 256
              2         (Pooling Size)
              2         (Stride)
              128       (Batch Size)
----------------------------------------------
(13) ReLU Layer       12544 x 128
----------------------------------------------
(14) FC Layer         12544 x 1024
              128        (Batch Size)
----------------------------------------------
(15) ReLU Layer       1024 x 128
----------------------------------------------
(16) FC Layer         1024 x 1024
              128        (Batch Size)
----------------------------------------------
(17) ReLU Layer       1024 x 128
----------------------------------------------
(18) FC Layer         1024 x 200
              128        (Batch Size)
----------------------------------------------
(19) ReLU Layer       200 x 128
----------------------------------------------

Firstly, the "Total communication" reported by P0 (1777.36MB) does not equal the sum of the communication reported by the individual parties (3171.3MB, 3171.3MB and 4024.69MB).

Secondly, it was much more time consuming than I expected.

Btw, I also ran AlexNet on CIFAR10. It took 77.6502 sec, and the total communication was 665.514MB (sent).

Any clue?
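The mismatch in the first point can be checked directly from the summary lines of the three logs. The following is a minimal Python sketch, not part of Falcon: the regular expressions and the `summarize` helper are hypothetical and assume the log format shown above, where only P0 prints a "Total communication" line.

```python
import re

# Summary lines copied verbatim from the P0, P1 and P2 logs above.
LOG_LINES = [
    "Total communication: 1777.36MB (sent) and 1777.36MB (recv)",
    "Communication, AlexNet train, P0: 3171.3MB (sent) 3171.3MB (recv)",
    "Communication, AlexNet train, P1: 3171.3MB (sent) 3171.3MB (recv)",
    "Communication, AlexNet train, P2: 4024.69MB (sent) 4024.69MB (recv)",
]

# Hypothetical patterns matching the two kinds of summary line.
TOTAL_RE = re.compile(r"Total communication: ([\d.]+)MB \(sent\)")
PARTY_RE = re.compile(r"Communication, .*?, P(\d): ([\d.]+)MB \(sent\)")


def summarize(lines):
    """Return (reported_total_mb, {party: sent_mb}) parsed from log lines."""
    total = None
    per_party = {}
    for line in lines:
        m = TOTAL_RE.search(line)
        if m:
            total = float(m.group(1))
            continue
        m = PARTY_RE.search(line)
        if m:
            per_party[int(m.group(1))] = float(m.group(2))
    return total, per_party


total, per_party = summarize(LOG_LINES)
party_sum = sum(per_party.values())
print(f"reported total:   {total:.2f} MB")
print(f"sum over parties: {party_sum:.2f} MB")
print(f"difference:       {party_sum - total:.2f} MB")
```

With the numbers above, the per-party values add up to 10367.29MB sent, far more than the 1777.36MB that P0 reports as the total.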

snwagh commented 3 years ago

There does seem to be an issue with the total communication. The semi-honest protocol is asymmetric, but I agree that doesn't account for all of the discrepancy. Maybe look into the communication wrapper (particularly for parties 1 and 2). It might also have to do with multi-threaded communication in some parts of the code.
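To illustrate the multi-threading hypothesis: if several threads update a shared byte counter without synchronization, updates can be lost and the reported total comes out too low. The sketch below is a generic Python illustration of lock-protected accounting, not Falcon's C++ communication wrapper. The `ByteCounter` class and `send_many` function are hypothetical names, and whether Falcon's counters are affected in this way is not confirmed in this thread.

```python
import threading


class ByteCounter:
    """Hypothetical thread-safe tally of bytes sent (not Falcon's wrapper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._sent = 0

    def add_sent(self, nbytes):
        # Guard the read-modify-write so concurrent senders cannot lose updates.
        with self._lock:
            self._sent += nbytes

    @property
    def sent(self):
        with self._lock:
            return self._sent


def send_many(counter, n_messages, nbytes):
    # Stand-in for a sender thread: only the accounting is modelled, no sockets.
    for _ in range(n_messages):
        counter.add_sent(nbytes)


counter = ByteCounter()
threads = [
    threading.Thread(target=send_many, args=(counter, 10_000, 8))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Four threads, 10,000 messages each, 8 bytes per message.
print(counter.sent)
```

Comparing a counter like this against the expected number of bytes for a known workload is one way to tell an accounting bug apart from a genuine difference in traffic.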

About the time, the code is not well optimized, so the CPU time is quite high when running it over Tiny ImageNet sizes. Is there a reason you expected a lower number?

About AlexNet on CIFAR10, can you tell me what the issue with the numbers is?

kaih70 commented 3 years ago

Actually, I am reproducing the experiments from CryptGPU (https://arxiv.org/pdf/2104.10949.pdf). I got numbers similar to those in Table IV, except for AlexNet on Tiny ImageNet.

In the other experiments, the total communication and the sum of the parties' communication were equal.

snwagh commented 3 years ago

So CryptGPU's numbers for Falcon are lower than the actual numbers? I see some fine print in the caption of Table IV saying that the Falcon numbers reported there are without batch norm layers (which would explain why the numbers are lower).

About the communication, it is not clear why this is the case. It might be a bug that needs fixing.

kaih70 commented 3 years ago

I removed BN layers and got a lower number. Thank you!