smilesun opened 1 year ago
To reuse the TensorBoard code, we only need to change line 85, which is `self.set_scheduler(scheduler=HyperSchedulerFeedback)`, in the file `domainlab/algos/trainers/train_fbopt_b.py`, and replace `HyperSchedulerFeedback` with `HyperSchedulerWarmup` or `HyperSchedulerWarmupExponential`.
The two classes can be imported from `domainlab/algos/trainers/hyper_scheduler.py`.
https://github.com/marrlab/DomainLab/pull/624/files
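For reference, a minimal sketch of the swap, assuming the `set_scheduler(scheduler=...)` signature shown above; the helper function `patch_trainer` is hypothetical, not repository code:

```python
from domainlab.algos.trainers.hyper_scheduler import (
    HyperSchedulerWarmup,
    HyperSchedulerWarmupExponential,
)

def patch_trainer(trainer, exponential=False):
    """Replace the feedback scheduler with a feedforward warm-up scheduler.

    Hypothetical helper: the repository instead changes line 85 of
    train_fbopt_b.py in place, i.e.
        self.set_scheduler(scheduler=HyperSchedulerFeedback)
    becomes
        self.set_scheduler(scheduler=HyperSchedulerWarmup)
    """
    cls = HyperSchedulerWarmupExponential if exponential else HyperSchedulerWarmup
    trainer.set_scheduler(scheduler=cls)
```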
@agisga, could you compare the phase portraits of the two settings?
sh run_fbopt_mnist_feedforward.sh
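By phase portrait we mean the trajectory of the two losses plotted against each other over epochs; a minimal plotting sketch with hypothetical CSV paths and column names (DomainLab's actual logging format may differ):

```python
import matplotlib.pyplot as plt
import pandas as pd

# hypothetical per-epoch loss logs, one file per setting
for label, path in [("feedback", "fbopt_losses.csv"),
                    ("warmup", "warmup_losses.csv")]:
    df = pd.read_csv(path)  # assumes columns "ell" and "R"
    plt.plot(df["ell"], df["R"], marker=".", label=label)

plt.xlabel("task loss ell")
plt.ylabel("regularization loss R")
plt.title("phase portrait of (ell, R) over training")
plt.legend()
plt.show()
```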
@agisga it is not working yet, I will see if I can fix it.
File "/home/playtime/domainlab/main_out.py", line 17, in
sh run_fbopt_mnist_feedforward.sh
no algorithm conf specified, going to use default
/home/sunxd/domainlab/domainlab/arg_parser.py:252: UserWarning: no algorithm conf specified, going to use default
warnings.warn("no algorithm conf specified, going to use default")
using device: cuda
/home/sunxd/anaconda3/lib/python3.9/site-packages/torch/nn/_reduction.py:42: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
warnings.warn(warning.format(ret))
/home/sunxd/anaconda3/lib/python3.9/site-packages/torchmetrics/utilities/prints.py:36: UserWarning: Metric `AUROC` will save all targets and predictions in buffer. For large datasets this may lead to large memory footprint.
warnings.warn(*args, **kwargs)
b'cdf0c565'
model name: mnistcolor10_te_rgb_31_119_180_jigen_bcdf0c565_2023md_11md_07_15_43_03_seed_0
Experiment start at: 2023-11-07 15:43:03.725952
Traceback (most recent call last):
  File "/home/sunxd/domainlab/main_out.py", line 17, in <module>
    exp.execute()
  File "/home/sunxd/domainlab/domainlab/compos/exp/exp_main.py", line 68, in execute
    self.trainer.before_tr()
  File "/home/sunxd/domainlab/domainlab/algos/trainers/train_fbopt_b.py", line 92, in before_tr
    self.set_model_with_mu()  # very small value
  File "/home/sunxd/domainlab/domainlab/algos/trainers/train_fbopt_b.py", line 121, in set_model_with_mu
    self.model.hyper_update(epoch=None, fun_scheduler=self.hyper_scheduler)
  File "/home/sunxd/domainlab/domainlab/models/model_dann.py", line 70, in hyper_update
    dict_rst = fun_scheduler(epoch)  # the __call__ method of hyperparameter scheduler
  File "/home/sunxd/domainlab/domainlab/algos/trainers/hyper_scheduler.py", line 39, in __call__
    dict_rst[key] = self.warmup(val_setpoint, epoch)
  File "/home/sunxd/domainlab/domainlab/algos/trainers/hyper_scheduler.py", line 31, in warmup
    ratio = ((epoch+1) * 1.) / self.total_steps
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'
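The TypeError happens because `set_model_with_mu` passes `epoch=None` into `warmup`, which computes `(epoch + 1)`. A minimal sketch of a guard, assuming the `warmup` signature from the traceback; treating `epoch=None` as the first step is an assumption, not the repository's actual fix:

```python
from domainlab.algos.trainers.hyper_scheduler import HyperSchedulerWarmup

class HyperSchedulerWarmupSafe(HyperSchedulerWarmup):
    """Hypothetical subclass that tolerates epoch=None from the fbopt trainer."""

    def warmup(self, val_setpoint, epoch):
        # before_tr() calls hyper_update(epoch=None, ...); map that to
        # step 0 instead of raising TypeError (assumption)
        epoch = 0 if epoch is None else epoch
        ratio = ((epoch + 1) * 1.0) / self.total_steps
        # cap at the setpoint so the ramp never overshoots it
        return min(val_setpoint, val_setpoint * ratio)
```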
Experiment start at: 2023-11-07 15:49:52.197946
Traceback (most recent call last):
  File "/home/sunxd/domainlab/main_out.py", line 17, in <module>
    exp.execute()
  File "/home/sunxd/domainlab/domainlab/compos/exp/exp_main.py", line 68, in execute
    self.trainer.before_tr()
  File "/home/sunxd/domainlab/domainlab/algos/trainers/train_fbopt_b.py", line 99, in before_tr
    self.hyper_scheduler.set_setpoint(
AttributeError: 'HyperSchedulerWarmup' object has no attribute 'set_setpoint'
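This second failure is an interface mismatch: `before_tr` in `train_fbopt_b.py` calls `set_setpoint`, which apparently only the feedback scheduler implements. A hypothetical no-op shim could bridge the gap; since a feedforward warm-up ramp is purely time-based, ignoring the setpoints is an assumption:

```python
from domainlab.algos.trainers.hyper_scheduler import HyperSchedulerWarmup

class HyperSchedulerWarmupShim(HyperSchedulerWarmup):
    """Hypothetical adapter so the fbopt trainer's set_setpoint call succeeds."""

    def set_setpoint(self, *args, **kwargs):
        # setpoints only steer the feedback controller; the warm-up ramp
        # ignores them, so we accept the call and do nothing (assumption)
        pass
```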
We want to compare against the feedforward mu controller (exponential warm-up) and against a fixed mu for the objective ell() + mu * R(); see the YAML file below:
https://github.com/marrlab/DomainLab/blob/fbopt/examples/benchmark/benchmark_fbopt_mnist_jigen.yaml
It would be interesting to see how they behave in TensorBoard with respect to the ell loss and the R loss.
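For reference, the fixed-mu baseline in that comparison simply optimizes the penalized objective with a constant multiplier; a minimal sketch with hypothetical names, not DomainLab's API:

```python
import torch

def penalized_loss(loss_ell: torch.Tensor, loss_r: torch.Tensor,
                   mu: float) -> torch.Tensor:
    """Fixed-mu objective: ell() + mu * R(), with mu held constant over training."""
    return loss_ell + mu * loss_r
```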