Open solomonmanuelraj opened 8 months ago
@solomonmanuelraj We currently only support language model architectures like Llama (Yi, Deepseek, Qwen, etc.) and Mistral, so ViT-based models are not supported for now. The modules can obviously be brought over to vision transformers, but this will be for a future project.
thanks for your quick response.
@solomonmanuelraj If you're not already on our Discord https://discord.gg/u54VK8m8tk, there are some other people there who are also working on finetuning vision models :) Maybe you can all discuss vision models together :))
Hi team,
I would like to fine-tune vision foundation models (VFMs) such as OWL-ViT, which is mainly used for zero-shot object detection.
I would like to know whether Unsloth supports LoRA fine-tuning for this kind of VFM. Any reference would be a great help.
Thanks