jacobmarks / zero-shot-prediction-plugin

Run zero-shot prediction models on your data

Support for loading an 8-bit quantized OWL-ViT model in FiftyOne #1

Closed solomonmanuelraj closed 4 months ago

solomonmanuelraj commented 8 months ago

Hi team,

I would like to load the 8-bit quantized OWL-ViT model in FiftyOne.

############################################################################################
import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F
from transformers import BitsAndBytesConfig

# The 80 COCO classes, used both to filter the dataset and as the model's classes
coco_classes = [
    'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck',
    'boat', 'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench',
    'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra',
    'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
    'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove',
    'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup',
    'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange',
    'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
    'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse',
    'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink',
    'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
    'hair drier', 'toothbrush',
]

quant_dataset = foz.load_zoo_dataset(
    "coco-2017",
    split="validation",
    label_types=["detections"],
    max_samples=200,
    classes=coco_classes,
    only_matching=True,
)

# Loading the 8 bits quantized model
bnb_config = BitsAndBytesConfig(load_in_8bit=True)

model_type = "zero-shot-detection-transformer-torch"
name_or_path = "google/owlvit-base-patch32"  # <- Owl-ViT

# load model
quant_model = foz.load_zoo_model(
    model_type,
    name_or_path=name_or_path,
    quantization_config=bnb_config,
    device_map="auto",
)

# can set classes at any time
quant_model.classes = coco_classes

quant_dataset.apply_model(quant_model, label_field="owlvit_quant")
############################################################################################

Is this supported in FiftyOne?

jacobmarks commented 8 months ago

Hey @solomonmanuelraj,

Thanks for your interest, and great question!

The functionality of loading zero-shot models from the zoo is actually part of the core FiftyOne library, not this plugin. This plugin just provides a simple and streamlined interface specifically for zero-shot tasks.

As for your specific question, I think it should be possible to load the model directly from Hugging Face transformers (with the 8-bit quantization), and use the function convert_transformers_model from here. You can then set the classes and apply the model to your data. Want to give this a try?
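A minimal, untested sketch of that suggestion might look like the following. It assumes a CUDA GPU with the `transformers`, `accelerate`, `bitsandbytes`, and `fiftyone` packages installed, and it assumes that `convert_transformers_model` lives in `fiftyone.utils.transformers` and infers the zero-shot detection task from the OWL-ViT model type. The helper names `load_quantized_owlvit` and `to_fiftyone_model` are hypothetical, introduced only for illustration. Whether the resulting wrapper handles 8-bit weights correctly is not confirmed in this thread.

```python
# Sketch only: assumes a CUDA GPU and that transformers, accelerate,
# bitsandbytes, and fiftyone are installed. Not verified end to end.

MODEL_NAME = "google/owlvit-base-patch32"


def load_quantized_owlvit(name_or_path=MODEL_NAME):
    """Load OWL-ViT directly from Hugging Face with 8-bit quantized weights."""
    # Imported lazily so that defining this helper does not need the GPU stack
    from transformers import BitsAndBytesConfig, OwlViTForObjectDetection

    bnb_config = BitsAndBytesConfig(load_in_8bit=True)
    return OwlViTForObjectDetection.from_pretrained(
        name_or_path,
        quantization_config=bnb_config,
        device_map="auto",  # let accelerate place the quantized weights
    )


def to_fiftyone_model(hf_model, classes):
    """Wrap a transformers model as a FiftyOne model and set its classes."""
    # Assumption: this module path and the automatic task inference
    import fiftyone.utils.transformers as fout

    fo_model = fout.convert_transformers_model(hf_model)
    fo_model.classes = classes  # "can set classes at any time", per the question
    return fo_model
```

With the dataset from the question, the converted model would then be applied as before: `quant_dataset.apply_model(to_fiftyone_model(load_quantized_owlvit(), classes), label_field="owlvit_quant")`, where `classes` is the list of COCO class names.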

jacobmarks commented 4 months ago

Closing this for now, feel free to reopen if you have further questions!