autodistill / autodistill-llava

LLaVA base model for use with Autodistill.
https://docs.autodistill.com
Apache License 2.0

NameError: name 'CaptionOntology' is not defined #7

Open shashi-netra opened 2 months ago

shashi-netra commented 2 months ago

I am trying to use this tool and ran a simple script to test it out, but I get this error. Am I missing an import or something? Sorry if this is an oversight on my part, but I could use some help.

from autodistill_llava import LLaVA #, CaptionOntology

ontology=CaptionOntology(
NameError: name 'CaptionOntology' is not defined
Samuel5106 commented 2 months ago

@shashi-netra Add this import to your code: from autodistill.detection import CaptionOntology

shashi-mit commented 2 months ago

It would be great to add this to the Quickstart. I am also now getting a new error: 'LLaVA' object is not callable. A refresh of the Quickstart and documentation would help, since these syntax problems keep coming up.

Samuel5106 commented 2 months ago

@shashi-mit To help me understand exactly where the error occurs, could you please share the code and a screenshot of the error?

shashi-netra commented 2 months ago

This is my code, and I get this error: 'LLaVA' object is not callable

from autodistill_llava import LLaVA#, CaptionOntology
import pprint as pp
from autodistill.detection import CaptionOntology
import requests
import av
import time

llava_model = LLaVA(
    ontology=CaptionOntology(
        {
            "people fighting or assaulting": "violence",
            "person smoking": "smoking",
            "person running": "running",
            "person falling": "falling",
            "person vandalizing": "vandalizing",
        }
    )
)

def read_video(video_url):
    frame_images = []
    with av.open(video_url) as container:
        stream = container.streams.video[0]
        for frame in container.decode(stream):
            frame_images.append(frame.to_image())
    return frame_images

def run_llava(video_file):
    frame_images = read_video(video_file)
    preds = llava_model(frame_images)
    return preds

def process_videos():
    video_files = [...]
    for video_file in video_files:
        try:
            print("[+] Processing video: ", video_file)
            start_time = time.time()
            # Run the model
            preds = run_llava(video_file)
            pp.pprint(preds)

            duration = time.time() - start_time
            print(f"Duration: {duration}")
        except Exception as e:
            print(e)
            continue
shashi-mit commented 1 month ago

@Samuel5106 Can you please help with this error?

Samuel5106 commented 1 month ago

The error 'LLaVA' object is not callable means your code is calling the llava_model object as if it were a function, as in llava_model(frame_images). This happens when you call the object itself instead of one of its methods, and the class does not support being called directly.
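To illustrate the error in isolation, here is a minimal sketch using a hypothetical stand-in class (ToyModel is an assumption for illustration, not part of autodistill). In Python, an instance is callable only if its class defines `__call__`, so calling the instance directly raises a TypeError, while calling one of its methods works.

```python
class ToyModel:
    """Hypothetical stand-in for a model class that has predict() but no __call__."""

    def predict(self, frames):
        # Pretend prediction: return one placeholder label per input frame.
        return ["label" for _ in frames]


model = ToyModel()

try:
    model(["frame1", "frame2"])  # calling the instance itself
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)  # 'ToyModel' object is not callable

# Calling a method on the instance works as expected.
result = model.predict(["frame1", "frame2"])
print(result)
```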

To fix this, check the LLaVA class's implementation or documentation for the method that runs predictions on frames. If a predict (or similar) method exists, call it instead of calling the object directly.
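If the documentation is unclear, one way to see which methods an object offers is to inspect it at runtime. The sketch below uses only the standard library; public_methods is a hypothetical helper name, and ToyModel with its predict and label methods is a stand-in for illustration, not the real LLaVA class.

```python
import inspect


def public_methods(obj):
    """Return the sorted names of public callable attributes on obj."""
    return sorted(
        name
        for name, _member in inspect.getmembers(obj, callable)
        if not name.startswith("_")
    )


class ToyModel:
    """Hypothetical stand-in class used only to demonstrate the helper."""

    def predict(self, frames):
        return []

    def label(self, folder):
        return None


methods = public_methods(ToyModel())
print(methods)  # ['label', 'predict']
```

Running the same helper on the real object, for example public_methods(llava_model), should list the names you can call on it, which you can then confirm against the class's source.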

The run_llava function can be changed as follows to invoke the appropriate method on the llava_model object:

from autodistill_llava import LLaVA
import pprint as pp
from autodistill.detection import CaptionOntology
import requests
import av
import time

llava_model = LLaVA(
    ontology=CaptionOntology(
        {
            "people fighting or assaulting": "violence",
            "person smoking": "smoking",
            "person running": "running",
            "person falling": "falling",
            "person vandalizing": "vandalizing",
        }
    )
)

def read_video(video_url):
    frame_images = []
    with av.open(video_url) as container:
        stream = container.streams.video[0]
        for frame in container.decode(stream):
            frame_images.append(frame.to_image())
    return frame_images

def run_llava(video_file):
    frame_images = read_video(video_file)
    # Use the correct method for making predictions, e.g., predict
    preds = llava_model.predict(frame_images)
    return preds

def process_videos():
    video_files = ["video1.mp4", "video2.mp4"]  # Replace with actual video file paths
    for video_file in video_files:
        try:
            print("[+] Processing video: ", video_file)
            start_time = time.time()
            # Run the model
            preds = run_llava(video_file)
            pp.pprint(preds)

            duration = time.time() - start_time
            print(f"Duration: {duration}")
        except Exception as e:
            print(e)
            continue