njucckevin / SeeClick

The model, data and code for the visual GUI Agent SeeClick
Apache License 2.0

How to continue fine-tuning SeeClick with LoRA on Mind2Web/AITW data? #35

Closed ZJULiHongxin closed 1 month ago

ZJULiHongxin commented 1 month ago

Thanks for the great work! @njucckevin

I tried reproducing SeeClick's performance on AITW and Mind2Web but ran into a problem.

After fine-tuning Qwen-VL on the 1M samples mentioned in your paper, I obtained a LoRA checkpoint. Now I want to continue fine-tuning from this LoRA checkpoint on the downstream Mind2Web training data.

When I set `--model_name_or_path` to the LoRA checkpoint folder named "checkpoint-5200", the fine-tuning program raised:

```
OSError: /data/reproduce_seeclick/checkpoint-5200 does not appear to have a file named config.json. Checkout 'https://huggingface.co//data/reproduce_seeclick/checkpoint-5200/tree/None' for available files.
```
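For context, a LoRA checkpoint folder contains only the adapter files (e.g. `adapter_config.json` and the adapter weights), not the full model's `config.json`, which is why Transformers cannot load it directly. Below is a minimal sketch of merging the adapter back into the base model with PEFT, assuming the checkpoint was saved in PEFT adapter format; the output directory name is made up:

```python
from peft import AutoPeftModelForCausalLM

# Load the base model together with the LoRA adapter (the adapter's
# adapter_config.json records the base model path), then fold the
# adapter weights into the base weights.
model = AutoPeftModelForCausalLM.from_pretrained(
    "/data/reproduce_seeclick/checkpoint-5200",  # LoRA adapter dir
    device_map="cpu",
    trust_remote_code=True,
)
merged = model.merge_and_unload()

# The result is a plain Qwen-VL-style checkpoint, config.json included,
# that can be passed as --model_name_or_path. Tokenizer files may need
# to be copied over separately from the base model.
merged.save_pretrained("/data/reproduce_seeclick/checkpoint-5200-merged")
```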

I also tried merging the LoRA weights into the Qwen-VL base model and using the merged model as `--model_name_or_path`, but the fine-tuning program printed these warnings:

```
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:11<00:00, 1.16s/it]
transformer.h.0.attn.c_attn not satisfy lora
transformer.h.0.attn.c_attn not satisfy lora
transformer.h.0.attn.c_attn not satisfy lora
transformer.h.0.attn.c_attn not satisfy lora
```
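The warning text suggests the training script inspects module names when applying LoRA. One way to debug, sketched below with the same placeholder path, is to check which modules the original adapter was trained on, which PEFT records in `adapter_config.json`:

```python
import json

# Inspect which modules the LoRA adapter targets (placeholder path).
with open("/data/reproduce_seeclick/checkpoint-5200/adapter_config.json") as f:
    adapter_cfg = json.load(f)

# For Qwen-VL-style LoRA fine-tuning this is typically something like
# ["c_attn", "attn.c_proj", "w1", "w2"], but treat that as an assumption.
print(adapter_cfg.get("target_modules"))
```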

Could you also clarify what

> pretrain-ckpt: base model for fine-tuning, e.g. SeeClick-pretrain or Qwen-VL

means here?

njucckevin commented 1 month ago

Hi, the `pretrain-ckpt` parameter should be set to the file path of the SeeClick-pretrain or Qwen-VL checkpoint on your machine. You can try using an absolute path to see if the problem still occurs.
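As a quick sanity check (the path below is a placeholder), whatever you pass as the base checkpoint should be a complete Hugging Face checkpoint containing a `config.json`, which a LoRA adapter directory does not have:

```python
import os

# Placeholder: absolute path to the SeeClick-pretrain or Qwen-VL checkpoint.
ckpt = "/abs/path/to/Qwen-VL"

# A full checkpoint has config.json; a LoRA adapter dir only has adapter files.
assert os.path.isfile(os.path.join(ckpt, "config.json")), \
    f"{ckpt} is not a full checkpoint (config.json missing)"
```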