jina-ai / dalle-flow

🌊 A Human-in-the-Loop workflow for creating HD images from text
grpcs://dalle-flow.dev.jina.ai

Failed to run my own server #1

Closed MattiaRigi97 closed 2 years ago

MattiaRigi97 commented 2 years ago

Hi, I'm trying to run my own server following the steps in the repo, but I get the following error:

jinaerror

I'm on the AWS SageMaker console. I tried an ml.g4dn.16xlarge instance to rule out resource issues, since I thought the error was related to that.

Can you help me solve this issue?

windmaple commented 2 years ago

There is something weird with the JAX installation. Could you run 'pip install -U jax[cuda] flax' and then try again?

huanghaifeng1234 commented 2 years ago

Environment: Docker image with 'Linux 3b017f71c659 4.15.0-88-generic'. After running 'pip install -U jax[cuda] flax' and then 'jina flow --uses flow.yml', I get this error: [image] [image] [image]

hanxiao commented 2 years ago

DALL·E Flow is untested on CPU, and running it on CPU is not recommended: it is too computationally demanding.

Your error suggests you are trying to run JAX on a CPU machine.

huanghaifeng1234 commented 2 years ago

Actually, I'm running on a GPU machine. I'm inside a Docker container, but I started it with 'docker run --gpus all'.
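One way to settle whether JAX inside the container really sees the GPUs is to ask it which devices it has found. The following is a minimal sketch, not part of the dalle-flow repo: `summarize_platforms` and `visible_platforms` are hypothetical helper names, and the only JAX call used is `jax.devices()`. If the output reports only CPU devices even though the container was started with `--gpus all`, the installed jaxlib is probably a CPU-only build.

```python
import importlib.util


def summarize_platforms(platforms):
    """Classify a list of JAX device platform names, e.g. ["gpu", "gpu"]."""
    accelerators = [p for p in platforms if p.lower() != "cpu"]
    if accelerators:
        kinds = sorted(set(accelerators))
        return f"{len(accelerators)} accelerator device(s) visible: {kinds}"
    return "only CPU devices visible; JAX is using the CPU backend"


def visible_platforms():
    """Return the platform of every device JAX can see, or None if JAX is absent."""
    if importlib.util.find_spec("jax") is None:
        return None
    import jax  # imported lazily so the script still runs without JAX

    return [device.platform for device in jax.devices()]


if __name__ == "__main__":
    platforms = visible_platforms()
    if platforms is None:
        print("jax is not installed in this environment")
    else:
        print(summarize_platforms(platforms))
```

Run it with the same Python interpreter that launches `jina flow`, inside the container, so that it inspects the same installation.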

hanxiao commented 2 years ago

Maybe JAX is too old? Try 'pip install -U jax' inside the Docker container.

windmaple commented 2 years ago

CPU definitely doesn't work. I can attest to that :)

hanxiao commented 2 years ago

There are some caveats you may encounter when deploying your own server:

huanghaifeng1234 commented 2 years ago

Hi, I have 8 GPUs, and the files and folders are structured as described. After I ran 'pip install -U jax' and upgraded jaxlib to 0.3.7, I got this error: [image] I searched for this error on Stack Overflow and found this: isin error

I then ran 'pip install jax==0.3.4 jaxlib==0.3.2', but it didn't help; it produced this error: [image]

So I wonder which versions of jax, jaxlib, flax, and scipy you are using.
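For comparing environments, it helps to post the exact installed versions of these four packages. Here is a small sketch that uses only the standard library (`importlib.metadata`, Python 3.8+); `installed_versions` is a hypothetical helper name, not something from the repo.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    report = installed_versions(["jax", "jaxlib", "flax", "scipy"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

Mismatched jax and jaxlib versions are a common source of import errors, so listing both side by side makes it easier to see whether an upgrade touched only one of them.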

hanxiao commented 2 years ago

I see, let me work on an official Dockerfile today.

huanghaifeng1234 commented 2 years ago

Thanks for your help

wstrinz commented 2 years ago

I ran into just about the same sequence of errors trying to run this on a Debian 11 GCloud Compute Engine VM.

To get the PrecisionLike import working (after installing the jax/jaxlib versions mentioned in https://github.com/jina-ai/dalle-flow/issues/1#issuecomment-1120607507), I installed the latest flax version from GitHub with:

pip install --upgrade git+https://github.com/google/flax.git

And now the jina flow builds and starts successfully!

huanghaifeng1234 commented 2 years ago

Installing the latest flax and jax versions from GitHub, followed by 'pip install --upgrade jaxlib', worked for me, thanks!

huanghaifeng1234 commented 2 years ago

After deploying my own server, I changed server_url to 'grpc://0.0.0.0:51005' in my client.py, like this: [image]

The server received the request, but client.py hung indefinitely: [image]

After I pressed Ctrl+C to stop it, I got this traceback: [image]

I also found that the output image was not saved as expected. Is 'grpc://0.0.0.0:51005' the correct form for the URL?
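A cheap first thing to rule out with a hanging client is basic reachability of the address. The sketch below uses only the standard library and a hypothetical helper, `can_connect`, to test whether a plain TCP connection to the host and port in the URL succeeds. Note that 0.0.0.0 is normally a bind address meaning "all interfaces"; whether a client can connect to it depends on the operating system, so trying 127.0.0.1 or the machine's real IP is a reasonable comparison. A successful connection only shows that the port is open. It does not prove the gRPC flow behind it is healthy, and since the server here did receive the request, this check alone may not explain the hang.

```python
import socket
from urllib.parse import urlparse


def can_connect(server_url, timeout=3.0):
    """Return True if a plain TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(server_url)
    if parsed.hostname is None or parsed.port is None:
        raise ValueError(f"expected scheme://host:port, got {server_url!r}")
    try:
        with socket.create_connection((parsed.hostname, parsed.port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Port 51005 is the one quoted in this thread; adjust to your own flow.yml.
    for url in ("grpc://0.0.0.0:51005", "grpc://127.0.0.1:51005"):
        print(url, "reachable" if can_connect(url) else "not reachable")
```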

hanxiao commented 2 years ago

Guys, I'm almost ready with my Dockerfile; see #7. Still doing a last check.

hanxiao commented 2 years ago

yes! it works with CUDA 11.6 out of the box! I'm going to write some README and merge into master!

hanxiao commented 2 years ago

It's done! Give it a try, and open a new issue if you have problems.

huanghaifeng1234 commented 2 years ago

Nope, still need the official help

In reply to Jinke's comment of Wed, May 11, 2022:

@huanghaifeng1234 Hi, I am also having this issue. The server received and processed the request but the client seemed to get stuck. Was this solved?
