pharmapsychotic / clip-interrogator

Image to prompt with BLIP and CLIP
MIT License

Downloads BLIP Checkpoint every time #96

Open Jeffman112 opened 1 year ago

Jeffman112 commented 1 year ago

Right now, every time I run my script, it downloads the BLIP checkpoint from the URL (which uses a lot of VRAM and crashes my hosting). How can I make it load from a file? Also, I'm assuming the CLIP model loads from the /cache folder? (download_cache is set to False.)

Eaven21 commented 1 year ago
BLIP_MODELS = {
    'base': 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_base_caption_capfilt_large.pth',
    'large': 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_large_caption.pth'
}

class Interrogator():
    def __init__(self, config: Config):
        self.config = config
        self.device = config.device
        self.blip_offloaded = True
        self.clip_offloaded = True

        if config.blip_model is None:
            if not config.quiet:
                print("Loading BLIP model...")
            blip_path = os.path.dirname(inspect.getfile(blip_decoder))
            configs_path = os.path.join(os.path.dirname(blip_path), 'configs')
            med_config = os.path.join(configs_path, 'med_config.json')
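            # Note: `pretrained` below is always one of the hard-coded URLs
            # above, so a fresh environment re-downloads the checkpoint.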
            blip_model = blip_decoder(
                pretrained=BLIP_MODELS[config.blip_model_type],
                image_size=config.blip_image_eval_size, 
                vit=config.blip_model_type, 
                med_config=med_config
            )

In clip_interrogator.py the BLIP model path is hard-coded to one of the download URLs above, so the checkpoint has to be re-downloaded whenever no cached copy exists. One possible solution is to manually point the corresponding entry at a local model path to avoid re-downloading, as sketched below.
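A minimal sketch of that workaround, assuming the installed package still exposes the module-level BLIP_MODELS dict from clip_interrogator/clip_interrogator.py, that Config still has the blip_model_type field, and that BLIP's checkpoint loader accepts a local file path in place of a URL; the .pth path is a placeholder for a checkpoint you downloaded once by hand:

import clip_interrogator.clip_interrogator as ci

# Repoint the 'large' entry at a locally saved checkpoint so that
# blip_decoder(pretrained=...) loads from disk instead of downloading.
ci.BLIP_MODELS['large'] = '/models/model_large_caption.pth'  # placeholder path

from clip_interrogator import Config, Interrogator

# blip_model_type selects which BLIP_MODELS entry gets used
interrogator = Interrogator(Config(blip_model_type='large'))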

hotpot-killer commented 1 year ago

In the latest code there is no BLIP_MODELS dict any more; the mapping of 'base' and 'large' to the model_base_caption_capfilt_large.pth and model_large_caption.pth URLs shown above has been removed.
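For reference, newer releases appear to load the captioner through Hugging Face transformers instead, selected by Config's caption_model_name, so the weights land in the standard HF cache and are only fetched once. A rough sketch under that assumption (the cache directory is a placeholder):

import os

# Pin the Hugging Face cache so the checkpoint persists across runs;
# this must be set before clip_interrogator (and transformers) is imported.
os.environ.setdefault('HF_HOME', '/models/hf-cache')  # placeholder path

from clip_interrogator import Config, Interrogator

config = Config(
    caption_model_name='blip-large',   # transformers-backed BLIP captioner
    clip_model_name='ViT-L-14/openai',
)
interrogator = Interrogator(config)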