facebookresearch / CrypTen

A framework for Privacy Preserving Machine Learning
MIT License

Running on AWS with TTP #506

Open jimouris opened 4 months ago

jimouris commented 4 months ago

I'm using the aws_launcher.py script to run the functional benchmarks on AWS. I've adapted benchmarks.py to follow the structure of the [mpc_cifar](https://github.com/facebookresearch/CrypTen/tree/main/examples/mpc_cifar) example, which can be run with aws_launcher.py.

  1. First, I get different errors depending on whether I use a TTP (trusted third party) or a TFP (trusted first party). The errors with the TTP are significantly higher.
  2. I've also observed that there are four different ways to run mpc_cifar with aws_launcher.py:
     1. TFP without the multiprocessing flag: this runs fine.
     2. TFP with the multiprocessing flag: this runs, but it prints everything twice.
     3. TTP with the multiprocessing flag: same, this runs, but it prints everything twice.
     4. TTP without the multiprocessing flag: this does not run. It gets stuck at:
    Run command: export WORLD_SIZE=2; export RENDEZVOUS=env://; export MASTER_ADDR=172.31.9.57; export MASTER_PORT=29500; export RANK=0; cd aws-launcher-tmp-21096b1c-16ab-11ef-9c02-767dd5f880da ;  ./launcher.py --tensor_size 10,10
    Run command: export WORLD_SIZE=2; export RENDEZVOUS=env://; export MASTER_ADDR=172.31.9.57; export MASTER_PORT=29500; export RANK=1; cd aws-launcher-tmp-21096b1c-16ab-11ef-9c02-767dd5f880da ;  ./launcher.py --tensor_size 10,10
    [i-0c003b1628efc5567 STDOUT] INFO:root:Using LUTs Config:
    [i-0c003b1628efc5567 STDOUT] INFO:root:Tensor size '(10, 10)'
    [i-0c003b1628efc5567 STDOUT] INFO:root:==================
    [i-0c003b1628efc5567 STDOUT] INFO:root:DistributedCommunicator with rank 0
    [i-0c003b1628efc5567 STDOUT] INFO:root:==================
    ^C
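
For reference, here is a minimal sketch of how the two per-rank commands in the log above are composed, so the hanging case can be reproduced by hand on each instance. `build_run_command` is a hypothetical helper written for this issue, not the actual aws_launcher.py code. The values are copied from the log. One thing that may be worth checking (a guess on my part, not confirmed) is whether a TTP run expects one more process than the two ranks launched here.

```python
import shlex


def build_run_command(rank, world_size, master_addr, master_port, workdir, script_args):
    """Compose the shell line for one rank, mirroring the format in the log.

    Hypothetical helper for illustration only; not taken from aws_launcher.py.
    """
    env = {
        "WORLD_SIZE": world_size,
        "RENDEZVOUS": "env://",
        "MASTER_ADDR": master_addr,
        "MASTER_PORT": master_port,
        "RANK": rank,
    }
    exports = "; ".join(f"export {key}={value}" for key, value in env.items())
    args = " ".join(shlex.quote(arg) for arg in script_args)
    return f"{exports}; cd {shlex.quote(workdir)}; ./launcher.py {args}"


# Values copied from the log output in this issue.
commands = [
    build_run_command(
        rank=rank,
        world_size=2,
        master_addr="172.31.9.57",
        master_port=29500,
        workdir="aws-launcher-tmp-21096b1c-16ab-11ef-9c02-767dd5f880da",
        script_args=["--tensor_size", "10,10"],
    )
    for rank in range(2)
]

for command in commands:
    print("Run command:", command)
```

The only difference between the two commands is the `RANK` value. Both ranks share the same `MASTER_ADDR` and `MASTER_PORT`.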