facebookresearch / higher

higher is a PyTorch library that lets users obtain higher-order gradients over losses spanning entire training loops, rather than individual training steps.
Apache License 2.0

Does higher work with huggingface (hugging face, HF) models? e.g. ViT? #139

Open brando90 opened 1 year ago

brando90 commented 1 year ago

Current error:

Traceback (most recent call last):
  File "/lfs/ampere3/0/brando9/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/main_maml_torchmeta.py", line 509, in <module>
    main()
  File "/lfs/ampere3/0/brando9/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/main_maml_torchmeta.py", line 443, in main
    train(rank=-1, args=args)
  File "/lfs/ampere3/0/brando9/diversity-for-predictive-success-of-meta-learning/div_src/diversity_src/experiment_mains/main_maml_torchmeta.py", line 485, in train
    meta_train_fixed_iterations(args, args.agent, args.dataloaders, args.opt, args.scheduler)
  File "/afs/cs.stanford.edu/u/brando9/ultimate-utils/ultimate-utils-proj-src/uutils/torch_uu/training/meta_training.py", line 104, in meta_train_fixed_iterations
    log_zeroth_step(args, meta_learner)
  File "/afs/cs.stanford.edu/u/brando9/ultimate-utils/ultimate-utils-proj-src/uutils/logging_uu/wandb_logging/supervised_learning.py", line 170, in log_zeroth_step
    train_loss, train_acc = model(batch, training=training)
  File "/lfs/ampere3/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/afs/cs.stanford.edu/u/brando9/ultimate-utils/ultimate-utils-proj-src/uutils/torch_uu/meta_learners/maml_meta_learner.py", line 66, in forward
    meta_loss, meta_loss_ci, meta_acc, meta_acc_ci = meta_learner_forward_adapt_batch_of_tasks(self, spt_x, spt_y,
  File "/afs/cs.stanford.edu/u/brando9/ultimate-utils/ultimate-utils-proj-src/uutils/torch_uu/meta_learners/maml_differentiable_optimizer.py", line 473, in meta_learner_forward_adapt_batch_of_tasks
    meta_losses, meta_accs = get_lists_losses_accs_meta_learner_forward_adapt_batch_of_tasks(meta_learner,
  File "/afs/cs.stanford.edu/u/brando9/ultimate-utils/ultimate-utils-proj-src/uutils/torch_uu/meta_learners/maml_differentiable_optimizer.py", line 511, in get_lists_losses_accs_meta_learner_forward_adapt_batch_of_tasks
    fmodel: FuncModel = get_maml_adapted_model_with_higher_one_task(meta_learner.base_model,
  File "/afs/cs.stanford.edu/u/brando9/ultimate-utils/ultimate-utils-proj-src/uutils/torch_uu/meta_learners/maml_differentiable_optimizer.py", line 195, in get_maml_adapted_model_with_higher_one_task
    diffopt.step(inner_loss, grad_callback=lambda grads: [g.detach() for g in grads])
  File "/lfs/ampere3/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/higher/optim.py", line 237, in step
    all_grads = grad_callback(all_grads)
  File "/afs/cs.stanford.edu/u/brando9/ultimate-utils/ultimate-utils-proj-src/uutils/torch_uu/meta_learners/maml_differentiable_optimizer.py", line 195, in <lambda>
    diffopt.step(inner_loss, grad_callback=lambda grads: [g.detach() for g in grads])
  File "/afs/cs.stanford.edu/u/brando9/ultimate-utils/ultimate-utils-proj-src/uutils/torch_uu/meta_learners/maml_differentiable_optimizer.py", line 195, in <listcomp>
    diffopt.step(inner_loss, grad_callback=lambda grads: [g.detach() for g in grads])
AttributeError: 'NoneType' object has no attribute 'detach'
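The traceback ends inside the `grad_callback` lambda: `higher` hands the callback the list of inner-loop gradients, and with large pretrained models (such as a Hugging Face ViT) parameters that do not contribute to the inner loss come back as `None`, which has no `.detach()`. A minimal sketch of a None-safe callback follows; `detach_grads` is a name introduced here for illustration, and whether `higher`'s update then tolerates `None` gradients downstream is a separate question — this only addresses the immediate `AttributeError`:

```python
# Hypothetical None-safe replacement for the failing callback:
#   grad_callback=lambda grads: [g.detach() for g in grads]
# Gradients for unused parameters arrive as None, so pass them through
# unchanged instead of calling .detach() on them.

def detach_grads(grads):
    """Detach each gradient, leaving None entries untouched."""
    return [g if g is None else g.detach() for g in grads]

# Assumed usage at the call site from the traceback:
# diffopt.step(inner_loss, grad_callback=detach_grads)
```

The list comprehension is the only change needed relative to the original lambda; everything else in the inner-loop step stays the same.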

brando90 commented 1 year ago

https://github.com/facebookresearch/higher/issues/124

brando90 commented 1 year ago

https://stackoverflow.com/questions/75762721/does-higher-work-with-huggingface-hugging-face-hf-models-e-g-vit