chenfei-wu / TaskMatrix


M1 Mac users: Working `requirements.txt` set of dependencies and porting this code to M1 Mac, Python 3.9 (and update to Langchain 0.0.106) #37

Closed swyxio closed 1 year ago

swyxio commented 1 year ago

Edit: all the explorations have been recapped below: https://github.com/microsoft/visual-chatgpt/issues/37#issuecomment-1465469089


I spent an hour fumbling around in dependency hell (e.g. #19) before giving up, deleting all deps, and reinstalling the latest versions of everything from scratch, so here's my requirements dump

(new one thanks to @focus000 below)

accelerate==0.17.0
addict==2.4.0
albumentations==1.3.0
basicsr==1.4.2
diffusers==0.14.0
einops==0.3.2
gradio==3.20.1
imageio==2.26.0
imageio-ffmpeg==0.4.8
invisible-watermark==0.1.5
kornia==0.6.10
langchain==0.0.106
numpy==1.23.4
omegaconf==2.3.0
openai==0.24.0
opencv-contrib-python==4.7.0.72
open-clip-torch==2.16.0
prettytable==3.6.0
pytorch-lightning==1.6.5
safetensors==0.3.0
streamlit==1.20.0
streamlit-drawable-canvas==0.9.2
test-tube==0.7.5
timm==0.6.12
--pre
--extra-index-url https://download.pytorch.org/whl/nightly/cpu
torch==2.1.0.dev20230311
torchmetrics==0.11.3
torchvision==0.14.1
transformers==4.26.1
webdataset==0.2.39
yapf==0.32.0

if someone could get the intersection of the source requirement.txt and the list above that would be greaaat

EDIT: thanks ChatGPT

```
albumentations==1.3.0
addict==2.4.0
basicsr==1.4.2
diffusers==0.14.0
einops==0.3.2
gradio==3.20.1
imageio==2.26.0
imageio-ffmpeg==0.4.8
kornia==0.6
langchain==0.0.101
numpy==1.23.1
omegaconf==2.1.1
opencv-contrib-python==4.4.0.46
open_clip_torch==2.0.2
pytorch-lightning==1.5.0
prettytable==3.6.0
safetensors==0.2.7
streamlit==1.12.1
streamlit-drawable-canvas==0.8.0
test-tube>=0.7.5
timm==0.6.12
torch==1.12.1
torchmetrics==0.6.0
torchvision==0.13.1
transformers==4.26.1
webdataset==0.2.5
yapf==0.32.0
```

i'm pretty sure some of these aren't necessary, since I see whisperx in there, but I don't know how to clean it up

make sure you run the bash download.sh script in the readme before you patch the requirements

also make sure to export your OPENAI_API_KEY and PYTORCH_ENABLE_MPS_FALLBACK=1 (a temporary fix for using Stable Diffusion on M1 Macs with this PyTorch build; see below)

swyxio commented 1 year ago

however now I get an `AssertionError: Torch not compiled with CUDA enabled` error

looks like these lines

https://github.com/microsoft/visual-chatgpt/blob/main/visual_chatgpt.py#L804-824

are all hardcoded to CUDA. Anyone know how to convert these to Mac equivalents?

swyxio commented 1 year ago

i think, according to https://github.com/remixer-dec/llama-mps/commit/9a8970299bf708a86b6dfa0c115e59ea2205c0e3, you can just make it .to('mps')

swyxio commented 1 year ago

yup it works lmao. https://twitter.com/swyxio/status/1634120614830284800

i could publish a fork if people are interested, but literally just follow the instructions above; that's all I did

Edit: welp, another problem: image2line() calls MLSDdetector(), which is CUDA-only

diving into the source ControlNet/annotator/mlsd/__init__.py, I think you have to change self.model = model.cuda().eval() to self.model = model.to('mps').eval(). (learned from this ControlNet PR; ChatGPT and Perplexity were useless for this)

this seems to work

then you also do it for:

  • /ControlNet/annotator/hed/__init__.py:103: self.netNetwork = Network(modelpath).to('mps').eval()

    • and image_hed = torch.from_numpy(input_image).float().to('mps')
  • /ControlNet/annotator/uniformer/__init__.py:18: self.model = init_segmentor(config_file, modelpath, 'mps')
  • /ControlNet/annotator/midas/__init__.py:11: self.model = MiDaSInference(model_type="dpt_hybrid").to('mps')

    • image_depth = torch.from_numpy(image_depth).float().to('mps')
  • /ControlNet/cldm/ddim_hacked.py, line 20: attr = attr.to(torch.device("mps"))

as well
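
if you want to avoid hand-editing every call site, a small helper like this works too (a sketch; `best_device` is a made-up name, not something in the repo):

```python
import torch

def best_device() -> torch.device:
    """Prefer CUDA, then Apple's MPS, then fall back to CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():  # Apple Silicon GPU
        return torch.device("mps")
    return torch.device("cpu")

# e.g. in ControlNet/annotator/mlsd/__init__.py:
#   self.model = model.to(best_device()).eval()
```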

whispy commented 1 year ago

@sw-yx, in visual_chatgpt.py, when you say "make it .to('mps')", do you mean like so:

        self.edit = ImageEditing.to('mps')
        self.i2t = ImageCaptioning.to('mps')
        self.t2i = T2I.to('mps')
        self.image2canny = image2canny()
        self.canny2image = canny2image.to('mps')
        self.image2line = image2line()
        self.line2image = line2image.to('mps')
        self.image2hed = image2hed()
        self.hed2image = hed2image.to('mps')
        self.image2scribble = image2scribble()
        self.scribble2image = scribble2image.to('mps')
        self.image2pose = image2pose()
        self.pose2image = pose2image.to('mps')
        self.BLIPVQA = BLIPVQA.to('mps')
        self.image2seg = image2seg()
        self.seg2image = seg2image.to('mps')
        self.image2depth = image2depth()
        self.depth2image = depth2image.to('mps')
        self.image2normal = image2normal()
        self.normal2image = normal2image.to('mps')
        self.pix2pix = Pix2Pix.to('mps')

Am I missing any lines? Is that how you have it?

swyxio commented 1 year ago

yeah. there are more steps after that that I'm still figuring out, but I'm making slow progress (see the edited comment above)

swyxio commented 1 year ago

ok, was able to eliminate all errors with that simple porting strategy... but it's a pretty heavy download...

[screenshot]

swyxio commented 1 year ago

[screenshot]

it now runs but hits ANOTHER runtime error lmao

[screenshot]

swyxio commented 1 year ago

there was a large LangChain memory refactor in v0.0.103; this project uses 0.0.101, I have 0.0.106. I'm trying to just update it, since LangChain will keep breaking stuff

for now I'm just commenting out the offending line

# self.agent.memory.buffer = cut_dialogue_history(self.agent.memory.buffer, keep_last_n_words=500)

and replacing the buffer edit with

        # self.agent.memory.buffer = self.agent.memory.buffer + Human_prompt + 'AI: ' + AI_prompt
        self.agent.memory.save_context({"input": Human_prompt}, {"output": AI_prompt})

(fixed thanks to @focus000 below)
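
for reference, a minimal sketch of what `save_context` does on a LangChain conversation buffer memory (the import path has moved around across 0.0.x releases, so adjust for your version):

```python
from langchain.memory import ConversationBufferMemory  # older releases: langchain.chains.conversation.memory

memory = ConversationBufferMemory()
# save_context takes the human input and the AI output as dicts
memory.save_context({"input": "generate a cat image"}, {"output": "Here you go."})
print(memory.buffer)
# Human: generate a cat image
# AI: Here you go.
```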

swyxio commented 1 year ago

i can now get the web app running, but basically every command I try results in an error.

[screenshot]

for stable diffusion

you have to say "generate _____" as your instruction or it will infer the wrong intent.

make sure to export PYTORCH_ENABLE_MPS_FALLBACK=1 to deal with the error below

  File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.9/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 997, in prepare_inputs_for_generation
    position_ids = attention_mask.long().cumsum(-1) - 1
NotImplementedError: The operator 'aten::cumsum.out' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
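
if you'd rather set the flag in code than in your shell, a sketch (assuming the flag is read when torch initializes, so it has to be set before the import):

```python
import os

# must be set before `import torch`, or the CPU fallback won't apply
os.environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")

import torch  # noqa: E402  (deliberately imported after setting the env var)
```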

however it runs into

 File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.9/site-packages/transformers/generation/logits_process.py", line 297, in __call__
    indices_to_remove = scores < torch.topk(scores, top_k)[0][..., -1, None]
RuntimeError: Currently topk on mps works only for k<=16
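
a hypothetical stopgap until the nightly fix mentioned further down: compute the top-k on the CPU and move the threshold back to MPS (this mirrors the logits_process.py line; `topk_threshold` is my own name, not a transformers API):

```python
import torch

def topk_threshold(scores: torch.Tensor, top_k: int) -> torch.Tensor:
    """Mask of entries below the k-th largest score, with topk done on CPU."""
    kth = torch.topk(scores.cpu(), top_k)[0][..., -1, None]
    return scores < kth.to(scores.device)
```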

for masking

currently facing this crap

[screenshot]

for pix2pix

"make it look like a painting"

seems to work tho not suuuper well

[screenshot]

image 2 text

works but sucks

[screenshot]

BLIPVQA

[screenshot]

edge detection

[screenshot]

pose2image

plong0723 commented 1 year ago

@sw-yx Excuse me, has the "size of a tensor must match" issue been solved?

focus000 commented 1 year ago

@sw-yx Hi, thanks for your work. I've got the web app running successfully on an M2 Max Mac.

For Stable Diffusion, according to this issue, just upgrading PyTorch to the latest nightly adds support for MPS topk > 16, and support for cumsum on int64 was added in macOS 13.3.
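
A quick way to sanity-check the nightly install before launching the app (plain torch calls, nothing project-specific):

```python
import torch

print(torch.__version__)                  # should be a 2.1.0.devYYYYMMDD build
print(torch.backends.mps.is_built())      # torch compiled with MPS support
print(torch.backends.mps.is_available())  # MPS usable on this machine
```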

BTW, I simplified your requirements.txt below; note that I replaced the torch version with a nightly build:

accelerate==0.17.0
addict==2.4.0
albumentations==1.3.0
basicsr==1.4.2
diffusers==0.14.0
einops==0.3.2
gradio==3.20.1
imageio==2.26.0
imageio-ffmpeg==0.4.8
invisible-watermark==0.1.5
kornia==0.6.10
langchain==0.0.106
numpy==1.23.4
omegaconf==2.3.0
openai==0.24.0
opencv-contrib-python==4.7.0.72
open-clip-torch==2.16.0
prettytable==3.6.0
pytorch-lightning==1.6.5
safetensors==0.3.0
streamlit==1.20.0
streamlit-drawable-canvas==0.9.2
test-tube==0.7.5
timm==0.6.12
--pre
--extra-index-url https://download.pytorch.org/whl/nightly/cpu
torch==2.1.0.dev20230311
torchmetrics==0.11.3
torchvision==0.14.1
transformers==4.26.1
webdataset==0.2.39
yapf==0.32.0
focus000 commented 1 year ago

there was a large LangChain memory refactor in v0.0.103; this project uses 0.0.101, I have 0.0.106. I'm trying to just update it, since LangChain will keep breaking stuff

for now I'm just commenting out the offending line

# self.agent.memory.buffer = cut_dialogue_history(self.agent.memory.buffer, keep_last_n_words=500)

and replacing the buffer edit with

        # self.agent.memory.buffer = self.agent.memory.buffer + Human_prompt + 'AI: ' + AI_prompt
        self.agent.memory.buffer.save_context({"input": Human_prompt}, {"output": AI_prompt})

shouldn't it be self.agent.memory.save_context?

quang-m-nguyen commented 1 year ago

@sw-yx @focus000 would you guys be able to post what you have to a branch, please? I'm running into countless dependency problems despite using the latest requirements.txt

c4rl0sm3nd3s commented 1 year ago

Which M1 MacBook are you using to run this code, and how much RAM does it have? I'm having issues running it on my Windows PC with an RTX 2060 12 GB, so I was wondering if my MacBook Air M1 with 8 GB will be able to handle it.

gongxh13 commented 1 year ago

[quotes @focus000's comment and simplified requirements list above]

Good

jacobmlloyd commented 1 year ago

[quotes @focus000's comment and simplified requirements list above]

Can I ask how you got it to work from there? Using the newest nightly install, I'm getting "RuntimeError: MPS does not support cumsum op with int64 input. Support has been added in macOS 13.3" on an M1 Pro.

gongxh13 commented 1 year ago

[quotes @focus000's comment and simplified requirements list above]

Why can't I find your torch version?

ERROR: Could not find a version that satisfies the requirement torch==2.1.0.dev20230311 (from versions: 1.9.0, 1.10.0, 1.10.1, 1.10.2, 1.11.0, 1.12.0, 1.12.1, 1.13.0, 1.13.1)
ERROR: No matching distribution found for torch==2.1.0.dev20230311
focus000 commented 1 year ago

@jacobmlloyd my apologies for that; it seems the MPS cumsum op still doesn't work on macOS 13.3. Lately I run T2I on the CPU, and it works. I modified the following code:

visual_chatgpt.py:806 : self.t2i = T2I(device="cpu")
visual_chatgpt.py:190 : self.pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float32)

and now the web app can generate images
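
put together, the T2I change amounts to something like this sketch (standard diffusers usage; float32 because half precision isn't supported on CPU):

```python
import torch
from diffusers import StableDiffusionPipeline

# full precision on CPU sidesteps the MPS cumsum problem, at the cost of speed
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float32,
).to("cpu")

image = pipe("a photo of a cat").images[0]
image.save("cat.png")
```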

[screenshot]
focus000 commented 1 year ago

@gongxh13 please use a newer nightly build

Designdocs commented 1 year ago

[quotes @focus000's comment and simplified requirements list above]

ERROR: Cannot install -r requirement.txt (line 1), -r requirement.txt (line 10), -r requirement.txt (line 11), -r requirement.txt (line 17), -r requirement.txt (line 19), -r requirement.txt (line 23), -r requirement.txt (line 24), -r requirement.txt (line 28), -r requirement.txt (line 29), -r requirement.txt (line 4) and torch==2.1.0.dev20230311 because these package versions have conflicting dependencies.

The conflict is caused by:
    The user requested torch==2.1.0.dev20230311
    accelerate 0.17.0 depends on torch>=1.4.0
    basicsr 1.4.2 depends on torch>=1.7
    invisible-watermark 0.1.5 depends on torch
    kornia 0.6.10 depends on torch>=1.9.1
    open-clip-torch 2.16.0 depends on torch>=1.9.0
    pytorch-lightning 1.6.5 depends on torch>=1.8.*
    test-tube 0.7.5 depends on torch>=1.1.0
    timm 0.6.12 depends on torch>=1.7
    torchmetrics 0.11.3 depends on torch>=1.8.1
    torchvision 0.14.1 depends on torch==1.13.1

To fix this you could try to:

  1. loosen the range of package versions you've specified
  2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

reverie-dev commented 1 year ago

[quotes @whispy's .to('mps') block above]

AttributeError: type object 'ImageEditing' has no attribute 'to'

jordank195 commented 1 year ago

@whispy Almost - I think what you actually need is

        self.edit = ImageEditing(device="mps")
        self.i2t = ImageCaptioning(device="mps")
        self.t2i = T2I(device="cpu")
        self.image2canny = image2canny()
        self.canny2image = canny2image(device="mps")
        self.image2line = image2line()
        self.line2image = line2image(device="mps")
        self.image2hed = image2hed()
        self.hed2image = hed2image(device="mps")
        self.image2scribble = image2scribble()
        self.scribble2image = scribble2image(device="mps")
        self.image2pose = image2pose()
        self.pose2image = pose2image(device="mps")
        self.BLIPVQA = BLIPVQA(device="mps")
        self.image2seg = image2seg()
        self.seg2image = seg2image(device="mps")
        self.image2depth = image2depth()
        self.depth2image = depth2image(device="mps")
        self.image2normal = image2normal()
        self.normal2image = normal2image(device="mps")
        self.pix2pix = Pix2Pix(device="mps")

Edit: as indicated by @focus000, you need to run T2I on the CPU. Make sure to change line 190 to float32.
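
(That also explains the AttributeError @reverie-dev hit: these are plain wrapper classes that take a device string in their constructor and move their internal model; they aren't torch modules with a .to() method. Roughly the pattern, paraphrased loosely from visual_chatgpt.py:)

```python
import torch
from diffusers import StableDiffusionPipeline

class T2I:
    """Loose paraphrase of the wrapper pattern in visual_chatgpt.py."""
    def __init__(self, device: str):
        self.device = device
        # the wrapper moves its *pipeline* to the device; the wrapper
        # itself is a plain object, so T2I.to('mps') raises AttributeError
        self.pipe = StableDiffusionPipeline.from_pretrained(
            "runwayml/stable-diffusion-v1-5",
            torch_dtype=torch.float32,
        ).to(device)
```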

jordank195 commented 1 year ago

Summary of changes you need to make it work:

1. Use @sw-yx 's version of the requirements

albumentations==1.3.0
addict==2.4.0
basicsr==1.4.2
diffusers==0.14.0
einops==0.3.2
gradio==3.20.1
imageio==2.26.0
imageio-ffmpeg==0.4.8
kornia==0.6
langchain==0.0.101
numpy==1.23.1
omegaconf==2.1.1
opencv-contrib-python==4.4.0.46
open_clip_torch==2.0.2
pytorch-lightning==1.5.0
prettytable==3.6.0
safetensors==0.2.7
streamlit==1.12.1
streamlit-drawable-canvas==0.8.0
test-tube>=0.7.5
timm==0.6.12
torch==1.12.1
torchmetrics==0.6.0
torchvision==0.13.1
transformers==4.26.1
webdataset==0.2.5
yapf==0.32.0

i'm pretty sure some of these aren't necessary, since I see whisperx in there, but I don't know how to clean it up

make sure you run the bash download.sh script in the readme before you patch the requirements

also make sure to export your OPENAI_API_KEY and PYTORCH_ENABLE_MPS_FALLBACK=1 (a temporary fix for using Stable Diffusion on M1 Macs with this PyTorch build; see below)

2. Make the changes in these files (Thanks again @sw-yx )

Edit: welp, another problem: image2line() calls MLSDdetector(), which is CUDA-only

diving into the source ControlNet/annotator/mlsd/__init__.py, I think you have to change self.model = model.cuda().eval() to self.model = model.to('mps').eval(). (learned from this ControlNet PR; ChatGPT and Perplexity were useless for this)

this seems to work

then you also do it for:

  • /ControlNet/annotator/hed/__init__.py:103: self.netNetwork = Network(modelpath).to('mps').eval()

    • and image_hed = torch.from_numpy(input_image).float().to('mps')
  • /ControlNet/annotator/uniformer/__init__.py:18: self.model = init_segmentor(config_file, modelpath, 'mps')
  • /ControlNet/annotator/midas/__init__.py:11: self.model = MiDaSInference(model_type="dpt_hybrid").to('mps')

    • image_depth = torch.from_numpy(image_depth).float().to('mps')
  • /ControlNet/cldm/ddim_hacked.py, line 20: attr = attr.to(torch.device("mps"))

as well

3. Change this block in lines 804-824 in visual_chatgpt.py

        self.edit = ImageEditing(device="mps")
        self.i2t = ImageCaptioning(device="mps")
        self.t2i = T2I(device="cpu")
        self.image2canny = image2canny()
        self.canny2image = canny2image(device="mps")
        self.image2line = image2line()
        self.line2image = line2image(device="mps")
        self.image2hed = image2hed()
        self.hed2image = hed2image(device="mps")
        self.image2scribble = image2scribble()
        self.scribble2image = scribble2image(device="mps")
        self.image2pose = image2pose()
        self.pose2image = pose2image(device="mps")
        self.BLIPVQA = BLIPVQA(device="mps")
        self.image2seg = image2seg()
        self.seg2image = seg2image(device="mps")
        self.image2depth = image2depth()
        self.depth2image = depth2image(device="mps")
        self.image2normal = image2normal()
        self.normal2image = normal2image(device="mps")
        self.pix2pix = Pix2Pix(device="mps")

Edit: as indicated by @focus000, you need to run T2I on the CPU. Make sure to change line 190 to float32.

4. Change line 190 in visual_chatgpt.py (thanks @focus000 )

it seems the MPS cumsum op still doesn't work on macOS 13.3. Lately I run T2I on the CPU, and it works. I modified the following code:

visual_chatgpt.py:806 : self.t2i = T2I(device="cpu")
visual_chatgpt.py:190 : self.pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float32)

and now the web app can generate images

5. Run the py file and question if it was worth all the effort lol

wastu01 commented 1 year ago

didn't work; I get the same problem as in https://github.com/microsoft/visual-chatgpt/issues/187

Omaraldarwish commented 1 year ago

what bash script are you guys talking about? I can't find it in the repo!

@sw-yx @jordankeyton

swyxio commented 1 year ago

it was removed https://github.com/microsoft/visual-chatgpt/commit/feb5dad2514f36b61a34fe805b39df4db7298410

i expect the instructions here to break further over time. I'm done playing with this repo; please file other issues as you need

taohaowei commented 1 year ago

[quotes @jordank195's summary above]

Warning: make sure timm isn't spelled timmm with an extra m; it should be timm==0.6.12

wastu01 commented 1 year ago

[quotes @swyxio's test-results comment above]

BTW, I use it for changing clothes:

https://hackmd.io/@DCT/Microsoft-Visual-ChatGPT

KabirArora76 commented 1 year ago

Is there a fork of this version that can run on an M1 Mac? I've been trying to get it to work, but I think I keep missing something.