mit-han-lab / offsite-tuning

Offsite-Tuning: Transfer Learning without Full Model
https://arxiv.org/abs/2302.04870
MIT License
365 stars 37 forks

NotImplementedError #6

Open vis-face opened 1 year ago

vis-face commented 1 year ago

offsite_tuning/run_image_classification.py

    def to_teacher(model, args):
        l = args.student_l_pad
        print(type(model.model))
        if isinstance(model, OPTForCausalLM):
            r = len(model.model.decoder.layers) - args.student_r_pad
            model.model.decoder.layers = model.model.decoder.layers[
                :l] + model.teacher + model.model.decoder.layers[r:]
        elif isinstance(model, GPT2LMHeadModel):
            r = len(model.transformer.h) - args.student_r_pad
            model.transformer.h = model.transformer.h[:l] + \
                model.teacher + model.transformer.h[r:]
        elif isinstance(model, BloomForCausalLM):
            r = len(model.transformer.h) - args.student_r_pad
            model.transformer.h = model.transformer.h[:l] + \
                model.teacher + model.transformer.h[r:]
        elif isinstance(model, ViTForImageClassification):
            r = len(model.vit.encoder.layer) - args.student_r_pad
            model.vit.encoder.layer = model.vit.encoder.layer[:l] + \
                model.teacher + model.vit.encoder.layer[r:]
        elif isinstance(model, CLIPViTForImageClassification):
            r = len(model.vit.encoder.layers) - args.student_r_pad
            model.vit.encoder.layers = model.vit.encoder.layers[:l] + \
                model.teacher + model.vit.encoder.layers[r:]
        elif isinstance(model, EVAViTForImageClassification):
            r = len(model.blocks) - args.student_r_pad
            model.blocks = model.blocks[:l] + \
                model.teacher + model.blocks[r:]
        else:
            raise NotImplementedError

    <class 'torch.nn.parallel.distributed.DistributedDataParallel'>
    Traceback (most recent call last):
      File "offsite_tuning/run_image_classification.py", line 564, in <module>
        main()
      File "offsite_tuning/run_image_classification.py", line 413, in main
        model = to_teacher(model, args)
      File "/root/paddlejob/workspace/env_run/offsite-tuning-main/offsite_tuning/utils.py", line 714, in to_teacher
        raise NotImplementedError
    NotImplementedError

bf-yang commented 1 year ago

I ran into the same problem:

    08/14/2023 19:46:09 - INFO - main - Total train batch size (w. parallel, distributed & accumulation) = 64
    08/14/2023 19:46:09 - INFO - main - Gradient Accumulation steps = 1
    08/14/2023 19:46:09 - INFO - main - Total optimization steps = 32
    Traceback (most recent call last):
      File "/home/bufang/offsite-tuning/offsite_tuning/run_image_classification.py", line 564, in <module>
        main()
      File "/home/bufang/offsite-tuning/offsite_tuning/run_image_classification.py", line 411, in main
        model = to_student(model, args)
                ^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/bufang/offsite-tuning/offsite_tuning/utils.py", line 743, in to_student
        raise NotImplementedError
    NotImplementedError
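
A possible reading of the first traceback, not confirmed by the maintainers: the printed type is `DistributedDataParallel`, which suggests the model was already wrapped for distributed training when `to_teacher` ran. A wrapper object is not an instance of `ViTForImageClassification` or any of the other listed classes, so every `isinstance` branch would be skipped and the final `else` would raise. The sketch below reproduces that behavior with stand-in classes only. `DistributedWrapper`, `unwrap`, and `describe` are hypothetical names for illustration and are not part of the repository. The stand-in `ViTForImageClassification` is not the real transformers class.

```python
class ViTForImageClassification:
    """Hypothetical stand-in for the real model class."""

    def __init__(self):
        self.layers = ["block0", "block1", "block2", "block3"]


class DistributedWrapper:
    """Hypothetical stand-in for DistributedDataParallel.

    The real wrapper keeps the wrapped model in its `.module` attribute.
    """

    def __init__(self, module):
        self.module = module


def unwrap(model):
    """Peel off wrapper layers until the underlying model is reached."""
    while isinstance(model, DistributedWrapper):
        model = model.module
    return model


def describe(model):
    """Dispatch on the model type, mirroring the isinstance chain above."""
    if isinstance(model, ViTForImageClassification):
        return "vit"
    raise NotImplementedError(type(model).__name__)


wrapped = DistributedWrapper(ViTForImageClassification())

# Dispatching on the wrapper falls through to the final raise.
try:
    describe(wrapped)
    failed = False
except NotImplementedError:
    failed = True

# Dispatching on the unwrapped model matches the expected branch.
kind = describe(unwrap(wrapped))
```

If this is indeed the cause, passing the unwrapped model (for a real `DistributedDataParallel` instance, its `.module` attribute) to `to_student` and `to_teacher` may avoid the error. Whether that matches the call order the script intends is an assumption to check against the repository.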