[Bug]: Training a Textual Inversion embedding on a Mac M1 Pro fails with: failed assertion `destination datatype must be fp32'

jirayudech commented 1 year ago

Is there an existing issue for this?

[X] I have searched the existing issues and checked the recent builds/commits

What happened?

Using a MacBook with an M1 Pro and 16 GB of RAM.

Steps to reproduce the problem

Use a MacBook with an M1 Pro and 16 GB of RAM
Go to the Training tab
Train an embedding using the deterministic sampling method

What should have happened?

/AppleInternal/Library/BuildRoots/9941690d-bcf7-11ed-a645-863efbbaf80d/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSNDArray/Kernels/MPSNDArrayConvolutionA14.mm:4332: failed assertion `destination datatype must be fp32'

Commit where the problem happens

python: 3.10.9 • torch: 1.12.1 • xformers: N/A • gradio: 3.23.0 • commit: 22bcc7be • checkpoint: a4df55d292

What platforms do you use to access the UI ?

MacOS

What browsers do you use to access the UI ?

Microsoft Edge

Command Line Arguments

Arguments: ('task(woxvku9oqzv8ps7)', '', '0.003', 1, 1, '/Users/pwin/Downloads/Lisa/processed', 'textual_inversion', 512, 512, False, 1000, 'disabled', '0.1', True, 0, 'deterministic', False, 50, 50, 'style_filewords.txt', True, True, 'photo of Lisa Blackpink  <lora:iLalisa:1>, a woman as a movie star, modelshoot style, (extremely detailed CG unity 8k wallpaper), Intricate, High Detail, Sharp focus, dramatic,photorealistic painting art by midjourney and greg rutkowski , ((movie premiere gala)), ((standing on the red carpet)), ((paparazzi in the background)), (looking at viewer), (detailed pupils:1.2), (elegant black dress:1.3) dimly lit, ', 'NSFW, canvas frame, cartoon, 3d, ((disfigured)), ((bad art)), ((deformed)),((extra limbs)),((close up)),((b&w)), blurry, (((duplicate))), ((morbid)), ((mutilated)), [out of frame], extra fingers, mutated hands, ((poorly drawn hands)), ((poorly drawn face)), (((mutation))), (((deformed))), ((ugly)), blurry, ((bad anatomy)), (((bad proportions))), ((extra limbs)), cloned face, (((disfigured))), out of frame, ugly, extra limbs, (bad anatomy), gross proportions, (malformed limbs), ((missing arms)), ((missing legs)), (((extra arms))), (((extra legs))), mutated hands, (fused fingers), (too many fingers), (((long neck))), Photoshop, video game, ugly, tiling, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, mutation, mutated, extra limbs, extra legs, extra arms, disfigured, deformed, cross-eye, body out of frame, blurry, bad art, bad anatomy, 3d render, (knees), (full body), (((nsfw)))', 40, 0, 6, -1.0, 512, 504) {}
Traceback (most recent call last):
  File "/Users/pwin/Documents/gitrepo/stable-diffusion-webui/modules/call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "/Users/pwin/Documents/gitrepo/stable-diffusion-webui/modules/call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "/Users/pwin/Documents/gitrepo/stable-diffusion-webui/modules/textual_inversion/ui.py", line 33, in train_embedding
    embedding, filename = modules.textual_inversion.textual_inversion.train_embedding(*args)
  File "/Users/pwin/Documents/gitrepo/stable-diffusion-webui/modules/textual_inversion/textual_inversion.py", line 362, in train_embedding
    validate_train_inputs(embedding_name, learn_rate, batch_size, gradient_step, data_root, template_file, template_filename, steps, save_embedding_every, create_image_every, log_directory, name="embedding")
  File "/Users/pwin/Documents/gitrepo/stable-diffusion-webui/modules/textual_inversion/textual_inversion.py", line 335, in validate_train_inputs
    assert model_name, f"{name} not selected"
AssertionError: embedding not selected

Textual inversion embeddings loaded(9): bad_prompt_version2, easynegative, selenagomez, bad-hands-5, bad-artist-anime, bad-artist, dlalisa, iLalisa, bad_prompt
Training at rate of 0.003 until step 1000

List of extensions

Extension	URL	Version	Update
canvas-zoom	https://github.com/richrobber2/canvas-zoom.git	1f442bfc (Mon May 8 21:12:31 2023)	unknown
openpose-editor	https://github.com/fkunn1326/openpose-editor.git	d74fdd72 (Tue May 2 09:49:03 2023)	unknown
sd-webui-aspect-ratio-helper	https://github.com/thomasasfk/sd-webui-aspect-ratio-helper.git	b03cce20 (Mon Apr 10 21:17:00 2023)	unknown
sd-webui-controlnet	https://github.com/Mikubill/sd-webui-controlnet	c9c8ca6e (Sun May 7 17:15:00 2023)	unknown
sd-webui-photopea-embed	https://github.com/yankooliveira/sd-webui-photopea-embed	14d81ed9 (Sat May 6 09:25:11 2023)	unknown
sd-webui-text2video	https://github.com/deforum-art/sd-webui-text2video.git	26822483 (Thu May 4 18:26:12 2023)	unknown
LDSR	built-in
Lora	built-in
ScuNET	built-in
SwinIR	built-in
prompt-bracket-checker	built-in

Console logs

Arguments: ('task(woxvku9oqzv8ps7)', '', '0.003', 1, 1, '/Users/pwin/Downloads/Lisa/processed', 'textual_inversion', 512, 512, False, 1000, 'disabled', '0.1', True, 0, 'deterministic', False, 50, 50, 'style_filewords.txt', True, True, 'photo of Lisa Blackpink  <lora:iLalisa:1>, a woman as a movie star, modelshoot style, (extremely detailed CG unity 8k wallpaper), Intricate, High Detail, Sharp focus, dramatic,photorealistic painting art by midjourney and greg rutkowski , ((movie premiere gala)), ((standing on the red carpet)), ((paparazzi in the background)), (looking at viewer), (detailed pupils:1.2), (elegant black dress:1.3) dimly lit, ', 'NSFW, canvas frame, cartoon, 3d, ((disfigured)), ((bad art)), ((deformed)),((extra limbs)),((close up)),((b&w)), blurry, (((duplicate))), ((morbid)), ((mutilated)), [out of frame], extra fingers, mutated hands, ((poorly drawn hands)), ((poorly drawn face)), (((mutation))), (((deformed))), ((ugly)), blurry, ((bad anatomy)), (((bad proportions))), ((extra limbs)), cloned face, (((disfigured))), out of frame, ugly, extra limbs, (bad anatomy), gross proportions, (malformed limbs), ((missing arms)), ((missing legs)), (((extra arms))), (((extra legs))), mutated hands, (fused fingers), (too many fingers), (((long neck))), Photoshop, video game, ugly, tiling, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, mutation, mutated, extra limbs, extra legs, extra arms, disfigured, deformed, cross-eye, body out of frame, blurry, bad art, bad anatomy, 3d render, (knees), (full body), (((nsfw)))', 40, 0, 6, -1.0, 512, 504) {}
Traceback (most recent call last):
  File "/Users/pwin/Documents/gitrepo/stable-diffusion-webui/modules/call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "/Users/pwin/Documents/gitrepo/stable-diffusion-webui/modules/call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "/Users/pwin/Documents/gitrepo/stable-diffusion-webui/modules/textual_inversion/ui.py", line 33, in train_embedding
    embedding, filename = modules.textual_inversion.textual_inversion.train_embedding(*args)
  File "/Users/pwin/Documents/gitrepo/stable-diffusion-webui/modules/textual_inversion/textual_inversion.py", line 362, in train_embedding
    validate_train_inputs(embedding_name, learn_rate, batch_size, gradient_step, data_root, template_file, template_filename, steps, save_embedding_every, create_image_every, log_directory, name="embedding")
  File "/Users/pwin/Documents/gitrepo/stable-diffusion-webui/modules/textual_inversion/textual_inversion.py", line 335, in validate_train_inputs
    assert model_name, f"{name} not selected"
AssertionError: embedding not selected

Textual inversion embeddings loaded(9): bad_prompt_version2, easynegative, selenagomez, bad-hands-5, bad-artist-anime, bad-artist, dlalisa, iLalisa, bad_prompt
Training at rate of 0.003 until step 1000
Preparing dataset...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:04<00:00,  4.17it/s]
  0%|                                                                                                                         | 0/1000 [00:00<?, ?it/s]/AppleInternal/Library/BuildRoots/9941690d-bcf7-11ed-a645-863efbbaf80d/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSNDArray/Kernels/MPSNDArrayConvolutionA14.mm:4332: failed assertion `destination datatype must be fp32'
zsh: abort      ./webui.sh
(venv) (base) jirayudechdhammavanich@jirayudechs-MacBook-Pro stable-diffusion-webui % /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Additional information

No response

konstantin24121 commented 1 year ago

Same issue here, but after training starts I get:

/AppleInternal/Library/BuildRoots/2d7aff41-c4cb-11ed-a6bc-ae4f7fab34c4/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSNDArray/Kernels/MPSNDArrayConvolutionA14.mm:4332: failed assertion 'destination datatype must be fp32'
zsh: abort      bash webui.sh
(base)*****@MacBook-Pro-2 stable-diffusion-webui % /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

andrew-gil commented 1 year ago

I get the same error when training a hypernetwork on a MacBook Air M1 with 16 GB of RAM:

/AppleInternal/Library/BuildRoots/9941690d-bcf7-11ed-a645-863efbbaf80d/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSNDArray/Kernels/MPSNDArrayConvolutionA14.mm:4332: failed assertion `destination datatype must be fp32'
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
zsh: abort      ./webui.sh

ThinksFast commented 1 year ago

I also get the "There appear to be 1 leaked semaphore objects to clean up at shutdown" warning on an M1 MacBook.

I'm also unable to train a LoRA successfully. I trained an embedding for 26 hours and never got an error in the terminal, but after 100% of the steps had completed, the process hung and failed to output a model. Eventually I killed the process and got the semaphore warning. I'm not sure whether the two are related.

But training a LoRA on Apple silicon is painfully slow and error-prone 😢

novitae commented 1 year ago

I'm having the same error on an M2 Pro, and solved it by passing the --no-half argument at startup. However, something I can't identify is eating the memory, and performance is catastrophic: 100 hours for a 3000-step training run on 400 pictures on an M2 Pro with 16 GB.
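
For anyone wanting to try the --no-half workaround mentioned above, here is a minimal sketch of one way to pass the flag. It assumes the stock webui.sh launcher from this repository and the COMMANDLINE_ARGS variable that webui-user.sh normally exposes. Whether it avoids the fp32 assertion on a given machine is not guaranteed, and the slowdown described above may still apply.

```shell
# One-off launch with the flag (assumes you are in the stable-diffusion-webui folder):
#   ./webui.sh --no-half

# Persistent variant, e.g. as a line in webui-user.sh (assumption: that file is
# sourced by webui.sh before launch). Any arguments already set are kept and
# --no-half is appended, so the model weights stay in 32-bit floats.
export COMMANDLINE_ARGS="${COMMANDLINE_ARGS:-} --no-half"

# Show the arguments that would be handed to the launcher.
echo "COMMANDLINE_ARGS=$COMMANDLINE_ARGS"
```

Appending rather than overwriting is a deliberate choice here, so that any defaults already present in the environment are not silently dropped.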