hpcaitech / EnergonAI

Large-scale model inference.
Apache License 2.0

Cannot run opt 125m examples with latest energonai docker images #186

Open zhanghaoie opened 1 year ago

zhanghaoie commented 1 year ago

When I run the example in my own Docker image, built with colossalai 0.2.0 as the base image, there is no response for a very long time.

When I run it in the latest energonai Docker image, the installed package does not match the example code, and the following error is raised:

[root@eb3f1650fdbf opt]# python3 opt_fastapi.py opt-125m
Traceback (most recent call last):
  File "/workspace/EnergonAI/examples/opt/opt_fastapi.py", line 7, in <module>
    from energonai import QueueFullError, launch_engine
ImportError: cannot import name 'QueueFullError' from 'energonai' (/opt/conda/lib/python3.9/site-packages/energonai/__init__.py)
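
The ImportError suggests that the energonai package installed under site-packages is older than the example checked out in /workspace/EnergonAI. A minimal sketch for confirming this from inside the container: it lists which of the names imported by the example are missing from the installed package. The helper name `missing_exports` is hypothetical and not part of EnergonAI.

```python
import importlib


def missing_exports(module_name, names):
    """Return the names that the installed module does not expose.

    `missing_exports` is a hypothetical helper for illustration only.
    """
    module = importlib.import_module(module_name)
    return [name for name in names if not hasattr(module, name)]


if __name__ == "__main__":
    # Names taken from line 7 of examples/opt/opt_fastapi.py in the traceback.
    try:
        print(missing_exports("energonai", ["QueueFullError", "launch_engine"]))
    except ImportError as exc:
        print(f"energonai is not importable: {exc}")
```

If the printed list is not empty, reinstalling energonai from the same checkout as the examples (as in a `pip install .` from the repository root) would be the likely next step.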

ver217 commented 1 year ago

Hi, here is a Dockerfile I've tested:

FROM hpcaitech/colossalai:0.1.10-torch1.11-cu11.3

COPY opt-125m /data/opt-125m

# install energonai
RUN git clone https://github.com/hpcaitech/EnergonAI.git && \
    cd EnergonAI && \
    pip install -r requirements.txt && \
    pip install . && \
    cd examples/opt && \
    pip install -r requirements.txt

WORKDIR /workspace/EnergonAI/examples/opt

Put your opt-125m pretrained weights folder at ./opt-125m before building the image.

darren-qiu commented 1 year ago

I also encountered the same problem.

Using the latest image, there is no response for a long time when I run:

python opt_fastapi.py opt-125m --max_batch_size=1

I also found that no system log is output, which makes it difficult to locate the problem.
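
When a Python process hangs without logging anything, one general-purpose option is the standard-library `faulthandler` module, which can periodically dump the stack of every thread. The sketch below assumes the helpers are called near the top of opt_fastapi.py; the helper names and the 60-second interval are hypothetical choices, and the dump only covers the process that calls it, not any worker processes it may spawn.

```python
import faulthandler
import sys


def enable_hang_dumps(interval_seconds=60.0, file=sys.stderr):
    """Periodically dump every thread's stack while the process is running.

    Hypothetical helper: call it once near the top of the server script.
    """
    faulthandler.dump_traceback_later(interval_seconds, repeat=True, file=file)


def disable_hang_dumps():
    """Stop the periodic dumps started by enable_hang_dumps()."""
    faulthandler.cancel_dump_traceback_later()
```

The dumped tracebacks show which call each thread is blocked in, which may help narrow down where the request is stuck.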

Below is my client code:


import requests
import json as js

url = "http://127.0.0.1:7070/generation"

headers = {
    'Content-Type': 'application/json; charset=utf-8'
}

data = {
    'max_tokens': 64,
    'prompt': 'Question: Where were the 2004 Olympics held? Answer:',
    'top_k': 1,
    'top_p': 0.7,
    'temperature': 0.99
}

result = requests.post(url, data=js.dumps(data))
#result = requests.post(url, data=data, headers=headers)

print(result.json())
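
A variant of the client that may make the hang easier to observe: it sets an explicit timeout so the call fails instead of blocking indefinitely, and it sends the Content-Type header together with the JSON body. The URL and payload fields are copied from the client code in this comment. It uses only the standard library so it can run in a minimal container; the 30-second timeout and the names `build_request` and `generate` are assumptions for illustration.

```python
import json
import socket
import urllib.error
import urllib.request

URL = "http://127.0.0.1:7070/generation"  # same endpoint as the original client


def build_request(prompt, max_tokens=64):
    """Build a POST request carrying the same fields as the original client."""
    payload = {
        "max_tokens": max_tokens,
        "prompt": prompt,
        "top_k": 1,
        "top_p": 0.7,
        "temperature": 0.99,
    }
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(URL, data=body, headers=headers)


def generate(prompt, timeout=30.0):
    """Send the request; return the decoded JSON, or None on failure."""
    try:
        with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, socket.timeout) as exc:
        # Covers refused connections, HTTP error statuses, and timeouts.
        print(f"request failed: {exc}")
        return None
```

Calling `generate('Question: Where were the 2004 Olympics held? Answer:')` would then either return the server's JSON response or print the failure reason after at most roughly the timeout, which helps distinguish a server that never answers from one that rejects the request.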