Closed SyedMuqtasidAli closed 3 months ago
Hi @SyedMuqtasidAli, this issue is not very clear and does not provide a minimal code example that we can use to reproduce it. See: https://stackoverflow.com/help/minimal-reproducible-example
@abidlabs here is the complete code and the full scenario: what I am doing and what I am facing. I created a Gradio endpoint using the CodeQwen1.5-7B model. Here is the code for it:
from transformers import AutoModelForCausalLM, AutoTokenizer
from gradio import Interface
import torch

device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/CodeQwen1.5-7B-Chat-AWQ",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/CodeQwen1.5-7B-Chat-AWQ")

def generate_code(prompt):
    """
    Generates Python code using the CodeQwen model based on a prompt.

    Args:
        prompt: A string containing the prompt describing the desired code.

    Returns:
        A string containing the generated Python code.
    """
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt}
    ]
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    model_inputs = tokenizer([text], return_tensors="pt").to(device)

    generated_ids = model.generate(
        model_inputs.input_ids,
        max_new_tokens=512
    )
    generated_ids = [
        output_ids[len(input_ids):]
        for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]

    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return response

interface = Interface(
    fn=generate_code,
    inputs="text",
    outputs="text",
    title="Generate Python Code with CodeQwen",
    description="Describe the Python code you need and let CodeQwen assist you with generating the code!"
)

interface.launch()
Next, I want to integrate this Gradio endpoint with my CrewAI agent through a custom class. Here is the code for it:

import requests
import logging
import time

class GradioModelProxy:
    def __init__(self, url):
        self.url = url
        self.default_params = {}  # Stores default parameters for the model
        logging.basicConfig(level=logging.DEBUG)

    def __call__(self, input_text, **kwargs):
        # This makes the instance callable
        return self.generate(input_text, **kwargs)

    def generate(self, input_text, **kwargs):
        # Ensure input_text is a string
        input_text = str(input_text)
        # Prepare the request payload with default parameters included
        data_to_send = {"data": [input_text], **self.default_params}
        # Log the data being sent
        logging.debug(f"Sending data to Gradio: {data_to_send}")
        retries = 3
        for attempt in range(retries):
            try:
                # Send the request to the Gradio endpoint with a 15-minute timeout
                logging.debug(f"Attempt {attempt + 1} of {retries}")
                start_time = time.time()  # Record start time
                response = requests.post(self.url, json=data_to_send, timeout=900)  # 900 seconds = 15 minutes
                end_time = time.time()  # Record end time
                # Calculate time taken for the request
                duration = end_time - start_time
                logging.debug(f"Request took {duration} seconds.")
                # Check the response status code
                if response.status_code == 200:
                    logging.debug("Received successful response from Gradio")
                    return response.json()['data'][0]
                else:
                    logging.error(f"Failed to get valid response. Status Code: {response.status_code}")
                    logging.debug("Retrying...")
                    time.sleep(5)  # Wait for 5 seconds before retrying
            except requests.exceptions.RequestException as e:
                logging.error(f"Request failed: {e}")
                logging.debug("Retrying...")
                time.sleep(5)  # Wait for 5 seconds before retrying
        raise Exception(f"Gradio model query failed after {retries} attempts")

    def bind(self, **kwargs):
        # Optionally implement this method if your system uses it to set up or configure models
        self.default_params.update(kwargs)
        logging.debug(f"Updated default parameters: {self.default_params}")
        return self
from crewai import Agent, Task, Crew

gradio_url = "https://f74e223631febfd885.gradio.live/api/predict"
llm = GradioModelProxy(gradio_url)

from crewai import Agent, Task, Crew, Process
from langchain_community.llms import HuggingFaceHub
import os
os.environ["HUGGINGFACEHUB_API_TOKEN"] = "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"

researcher = Agent(
    role='Senior Developer',
    goal='complete code generation',
    backstory=(
        "You are a Senior developer. "
        "Your expertise lies in generating code."
    ),
    verbose=True,
    allow_delegation=False,
    llm=llm
)

task1 = Task(
    description=(
        "write a code of BFS in python "
    ),
    expected_output='write code',
    agent=researcher,
    human_input=False,
)

crew = Crew(
    agents=[researcher],
    tasks=[task1],
    verbose=2
)

result = crew.kickoff()
Now, when I run inference through it, I get a 504 timeout error.
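For context, 504 is the HTTP Gateway Timeout status: it is produced by a gateway or proxy that did not get a response from the upstream server in time, so the client-side `timeout=900` presumably does not control it. Below is a minimal sketch of a retry loop that stops early on a 504 instead of repeating the same slow request. The names `call_with_retries` and `send` are hypothetical, and treating a 504 as not worth retrying is an assumption, not documented Gradio behavior.

```python
import time
from http import HTTPStatus
from typing import Callable, Tuple


def call_with_retries(
    send: Callable[[], Tuple[int, str]],
    retries: int = 3,
    delay_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call send() until it returns HTTP 200, or give up.

    send is a hypothetical zero-argument callable that performs one
    request and returns (status_code, body).
    """
    attempts = 0
    last_status = None
    for attempts in range(1, retries + 1):
        status, body = send()
        if status == HTTPStatus.OK:
            return body
        last_status = status
        if status == HTTPStatus.GATEWAY_TIMEOUT:
            # Assumption: an identical slow request would hit the same
            # gateway limit again, so stop instead of retrying.
            break
        if attempts < retries:
            sleep(delay_s)  # wait before the next attempt
    raise RuntimeError(
        f"request failed after {attempts} attempt(s); last status {last_status}"
    )
```

With `requests`, `send` could wrap the `requests.post(self.url, json=data_to_send, timeout=900)` call from the proxy above and return `(response.status_code, response.text)`, which also keeps the response body available for the error log.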
Hi @SyedMuqtasidAli, this code is not properly formatted, nor is it minimal, as it includes many other dependencies. Please provide a minimal example that our team can investigate, thanks.
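As an illustration of what a smaller repro could look like, the sketch below replaces the model with a stand-in function that only waits and echoes its input, which removes the `transformers` and CrewAI dependencies. The names `make_slow_fn` and `build_interface` and the 120-second default delay are hypothetical, chosen only for this example; whether such a stand-in triggers the same 504 is not known from the thread.

```python
import time
from typing import Callable


def make_slow_fn(delay_s: float) -> Callable[[str], str]:
    """Return a stand-in for generate_code that waits, then echoes the prompt."""
    def slow_fn(prompt: str) -> str:
        time.sleep(delay_s)  # simulate a long-running model call
        return f"echo: {prompt}"
    return slow_fn


def build_interface(delay_s: float = 120.0):
    """Build a Gradio Interface around the stand-in function.

    The 120-second default is an arbitrary value picked for illustration.
    """
    # Imported here so the stand-in above can be exercised without Gradio.
    from gradio import Interface
    return Interface(fn=make_slow_fn(delay_s), inputs="text", outputs="text")
```

Calling `build_interface().launch()` and then posting a single `{"data": ["hello"]}` payload to the endpoint with `requests.post`, as the proxy class does, would show whether the error depends on the model at all or only on how long the handler runs.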
Closing for lack of a suitable repro
Describe the bug
I created a Gradio endpoint using the model "CodeQwen1.5-7B", but when I run inference through the Gradio endpoint with my CrewAI agent, I get this error "
Can anyone please help me with this? I am using Google Colab (model running on GPU), and I am using CrewAI agents for inference with a Hugging Face model.
Have you searched existing issues? 🔎
Reproduction
Screenshot
Logs
No response
System Info
Severity
I can work around it