Open HideLakitu opened 2 weeks ago
Hi,
Yes. You should follow the instructions in the README.md. Generally speaking, the "resumed_model" parameter points to a checkpoint from which you can start poisoning right away, without repeating the earlier training. You can first run the program without poisoning and save a checkpoint at whatever round you want to start from. Then set "resumed_model" to the checkpoint you saved to start poisoning.
Thanks for the reply. I just set resumed_model to False in the file being run, and then it worked.
By the way, I checked the corresponding .yaml file and found that the total number of training rounds is 1900 (# server training setting start_round: 1 end_round: 1900), correct? Do you remember roughly how long a full run took you? On Google with TPU-G4, it took me nearly 3 hours to complete fewer than 350 rounds. That feels rather time-consuming, since I hit the computing resource limit within a day and I didn't pay for more.
Yes, and you can change end_round in the .yaml to suit different configurations. As for the time needed, fewer than 350 rounds in 3 hours sounds reasonable to me.
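For a rough idea of what a full run would cost, the numbers above can be extrapolated linearly. This is only a back-of-the-envelope sketch: it assumes every round takes about the same time, and `estimate_hours` is a hypothetical helper, not part of the repository.

```python
def estimate_hours(total_rounds: int, rounds_done: int, hours_elapsed: float) -> float:
    """Linearly extrapolate total run time from the progress observed so far."""
    if rounds_done <= 0:
        raise ValueError("rounds_done must be positive")
    # Assumes a constant time per round, which is only an approximation.
    return total_rounds * hours_elapsed / rounds_done


# 350 rounds in 3 hours, scaled to end_round: 1900
print(f"{estimate_hours(1900, 350, 3.0):.1f} hours")  # prints "16.3 hours"
```

Since fewer than 350 rounds were completed in that time, the real figure would be somewhat higher. Lowering end_round in the .yaml shortens the run proportionally under the same assumption.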
Well, I tried running the command
!python main.py --GPU_id "0" --params utils/yamls/indicator/params_vanilla_Indicator.yaml
on a T4 GPU, and it said: FileNotFoundError: [Errno 2] No such file or directory: 'saved_models/saved_model_global_model1200.pt.tar'. But isn't this .tar file generated automatically during training? Or should I edit
resumed_model: "Jun.05_06.09.03/saved_model_global_model_1200.pt.tar"
to something else, as you mentioned in the README file?