autodistill / autodistill

Images to inference with no labeling (use foundation models to train supervised models).
https://docs.autodistill.com
Apache License 2.0

Support for finetuning the foundation model before distilling #83

Open samedii opened 8 months ago

samedii commented 8 months ago


Description

I couldn't find any information on support for finetuning the foundation model before distilling. Sorry if I missed it!

I think this is an extremely important feature since it can really help in cases where the foundation model performs very badly unless it gets to see a hundred or so examples of the unseen domain.

It will also allow the user to iterate on improving the foundation model with corrected data and gradually distill a better and better small model.

Use case

E.g. I have strange-looking images from point cloud renders that are close to what the foundation model should be able to handle, but the segmentations are bad enough that it's pointless to distill a smaller model until the foundation model gives better results.

Additional

I will try and see if I can do this manually by getting grads through an inference interface.
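A minimal sketch of that idea in plain PyTorch (the `InferenceWrapper` class and its `model` attribute here are hypothetical stand-ins, not autodistill's actual API): bypass the grad-free `predict()` path, call the underlying module directly so autograd tracks the forward pass, and step an optimizer on a loss computed from corrected labels.

```python
import torch

# Hypothetical stand-in for a foundation-model wrapper that normally
# only exposes an inference interface. A real wrapper may hide the
# torch module behind a different attribute name.
class InferenceWrapper:
    def __init__(self):
        self.model = torch.nn.Sequential(
            torch.nn.Linear(8, 16), torch.nn.ReLU(), torch.nn.Linear(16, 2)
        )

    @torch.no_grad()
    def predict(self, x):
        # the usual grad-free inference path
        return self.model(x)

wrapper = InferenceWrapper()

# Fine-tuning path: unfreeze the module and call it directly
# (NOT wrapper.predict) so gradients flow through the forward pass.
for p in wrapper.model.parameters():
    p.requires_grad_(True)
optimizer = torch.optim.AdamW(wrapper.model.parameters(), lr=1e-4)

x = torch.randn(4, 8)                # a few corrected examples
target = torch.tensor([0, 1, 0, 1])  # corrected labels
logits = wrapper.model(x)
loss = torch.nn.functional.cross_entropy(logits, target)
loss.backward()
optimizer.step()
```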

Are you willing to submit a PR?

summelon commented 8 months ago

+1. Unseen domains + hierarchical objects are a big challenge for current foundation models, e.g., SAM, DINO, etc. AFAIK, not only in this repo, fine-tuning for such tasks is not well studied for now.

capjamesg commented 8 months ago

Thank you for filing this Issue! We have not yet thought about fine-tuning foundation models as part of autodistill. I have taken a note of this idea and will consider how we can look at fine-tuning models in the future.

samedii commented 8 months ago

We can of course finetune models in our own codebases too if you think this is outside the intended scope.

I recommend having a look at PEFT if you haven't seen it: https://github.com/huggingface/peft :) It can be used as a utility library for lightweight finetuning.
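To illustrate why PEFT-style adapters are so cheap, here is a numpy sketch of the LoRA idea (the math only, not PEFT's API): instead of updating a frozen weight matrix `W`, you train a low-rank delta `B @ A` and add it to the base output.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 256, 256, 4  # r << d_in: the low-rank bottleneck

W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable: r x d_in
B = np.zeros((d_out, r))                # trainable: d_out x r, zero-initialised
alpha = 8.0                             # scaling hyperparameter

def adapted_forward(x):
    # y = (W + (alpha / r) * B @ A) @ x  -- base output plus low-rank update
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=(d_in,))
# With B zero-initialised the adapter is a no-op, so fine-tuning
# starts exactly at the pretrained behaviour.
assert np.allclose(adapted_forward(x), W @ x)

full_params = W.size
adapter_params = A.size + B.size
print(adapter_params / full_params)  # ~3% of the parameters are trainable
```

Only `A` and `B` get gradients during fine-tuning, which is what makes adapting a large foundation model on a hundred or so corrected examples tractable.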

hagonata commented 2 months ago

> +1. Unseen domains + hierarchical objects are a big challenge for current foundation models, e.g., SAM, DINO, etc. AFAIK, not only in this repo, fine-tuning for such tasks is not well studied for now.

At least for now, you can fine-tune Grounding DINO here: https://github.com/open-mmlab/mmdetection/blob/dev-3.x/configs/grounding_dino/README.md
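That fine-tune runs through mmdetection's standard training entry point; a sketch of the invocation (the config filename below is illustrative, pick the exact one from the linked README):

```shell
# Fine-tune Grounding DINO on a custom COCO-format dataset with mmdetection.
# Replace the config path with one listed in the grounding_dino README.
python tools/train.py \
    configs/grounding_dino/grounding_dino_swin-t_finetune_16xb2_1x_coco.py \
    --work-dir work_dirs/grounding_dino_finetune
```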