AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0
143.12k stars 26.97k forks

[Bug]: OSError: Cannot find empty port in range: 7860-7860 with EC2 in Auto scaling group #16589

Open PiPyL opened 3 weeks ago

PiPyL commented 3 weeks ago


What happened?

When I deploy the source on a standalone EC2 instance, this error does not occur when the instance starts. But when I deploy the source on an EC2 instance in an Auto Scaling group, the error does occur. [Screenshot 2024-10-25 at 09 44 01]

Steps to reproduce the problem

  1. The Auto Scaling group scales out one new EC2 instance
  2. The EC2 instance is running
  3. SD starts => error
  4. SD restarts => success

What should have happened?

SD should start successfully on the first attempt, without hitting the port error and having to restart.

What browsers do you use to access the UI ?

No response

Sysinfo

I use an EC2 instance of type g6e.xlarge.

Console logs

Oct 25 01:35:30 ip-20-0-2-59.ec2.internal sh[843]: Launching launch.py...
Oct 25 01:35:30 ip-20-0-2-59.ec2.internal sh[843]: ################################################################
Oct 25 01:35:30 ip-20-0-2-59.ec2.internal sh[843]: glibc version is 2.34
Oct 25 01:35:30 ip-20-0-2-59.ec2.internal sh[843]: Cannot locate TCMalloc. Do you have tcmalloc or google-perftool installed on your system? (improves CPU memory usage)
Oct 25 01:35:49 ip-20-0-1-115.ec2.internal sh[949]: Python 3.11.9 (main, Apr 19 2024, 16:48:06) [GCC 11.2.0]
Oct 25 01:35:49 ip-20-0-1-115.ec2.internal sh[949]: Version: v1.6.0-1704-gc24ff95d
Oct 25 01:35:49 ip-20-0-1-115.ec2.internal sh[949]: Commit hash: c24ff95d305bf56e4afe5fdf76a5350481661c17
Oct 25 01:37:36 ip-20-0-1-115.ec2.internal sh[949]: CUDA 12.1
Oct 25 01:37:36 ip-20-0-1-115.ec2.internal sh[949]: Launching Web UI with arguments: --api --listen --cors-allow-origins '*' --port=7860
Oct 25 01:39:44 ip-20-0-1-115.ec2.internal sh[949]: no module 'xformers'. Processing without...
Oct 25 01:39:44 ip-20-0-1-115.ec2.internal sh[949]: no module 'xformers'. Processing without...
Oct 25 01:39:46 ip-20-0-1-115.ec2.internal sh[949]: No module 'xformers'. Proceeding without it.
Oct 25 01:40:08 ip-20-0-1-115.ec2.internal sh[949]: ControlNet preprocessor location: /home/ec2-user/stable-diffusion-webui/extensions/sd-webui-controlnet/annotator/downloads
Oct 25 01:40:24 ip-20-0-1-115.ec2.internal sh[949]: 2024-10-25 01:40:24,757 - ControlNet - INFO - ControlNet v1.1.455
Oct 25 01:40:38 ip-20-0-1-115.ec2.internal sh[949]: 01:40:38 - ReActor - STATUS - Running v0.7.1-b1 on Device: CUDA
Oct 25 01:40:38 ip-20-0-1-115.ec2.internal sh[949]: Loading weights [bc2f30f4ad] from /home/ec2-user/stable-diffusion-webui/models/Stable-diffusion/beautifulRealistic_v60.safetensors
Oct 25 01:40:41 ip-20-0-1-115.ec2.internal sh[949]: 2024-10-25 01:40:41,227 - ControlNet - INFO - ControlNet UI callback registered.
Oct 25 01:40:48 ip-20-0-1-115.ec2.internal sh[949]: Traceback (most recent call last):
Oct 25 01:40:48 ip-20-0-1-115.ec2.internal sh[949]:   File "/home/ec2-user/stable-diffusion-webui/launch.py", line 48, in <module>
Oct 25 01:40:49 ip-20-0-1-115.ec2.internal sh[949]:     main()
Oct 25 01:40:49 ip-20-0-1-115.ec2.internal sh[949]:   File "/home/ec2-user/stable-diffusion-webui/launch.py", line 44, in main
Oct 25 01:40:49 ip-20-0-1-115.ec2.internal sh[949]:     start()
Oct 25 01:40:49 ip-20-0-1-115.ec2.internal sh[949]:   File "/home/ec2-user/stable-diffusion-webui/modules/launch_utils.py", line 469, in start
Oct 25 01:40:49 ip-20-0-1-115.ec2.internal sh[949]:     webui.webui()
Oct 25 01:40:49 ip-20-0-1-115.ec2.internal sh[949]:   File "/home/ec2-user/stable-diffusion-webui/webui.py", line 79, in webui
Oct 25 01:40:49 ip-20-0-1-115.ec2.internal sh[949]:     app, local_url, share_url = shared.demo.launch(
Oct 25 01:40:50 ip-20-0-1-115.ec2.internal sh[949]:                                 ^^^^^^^^^^^^^^^^^^^
Oct 25 01:40:50 ip-20-0-1-115.ec2.internal sh[949]:   File "/home/ec2-user/stable-diffusion-webui/venv/lib/python3.11/site-packages/gradio/blocks.py", line 1896, in launch
Oct 25 01:40:51 ip-20-0-1-115.ec2.internal sh[949]:     ) = networking.start_server(
Oct 25 01:40:52 ip-20-0-1-115.ec2.internal sh[949]:         ^^^^^^^^^^^^^^^^^^^^^^^^
Oct 25 01:40:52 ip-20-0-1-115.ec2.internal sh[949]:   File "/home/ec2-user/stable-diffusion-webui/venv/lib/python3.11/site-packages/gradio/networking.py", line 169, in start_server
Oct 25 01:40:52 ip-20-0-1-115.ec2.internal sh[949]:     raise OSError(
Oct 25 01:40:52 ip-20-0-1-115.ec2.internal sh[949]: OSError: Cannot find empty port in range: 7860-7860. You can specify a different port by setting the GRADIO_SERVER_PORT environment variable or passing the `server_port` parameter to `launch()`.
Oct 25 01:40:54 ip-20-0-1-115.ec2.internal sh[949]: Creating model from config: /home/ec2-user/stable-diffusion-webui/configs/v1-inference.yaml
Oct 25 01:41:43 ip-20-0-1-115.ec2.internal sh[949]: Applying attention optimization: Doggettx... done.
Oct 25 01:41:53 ip-20-0-1-115.ec2.internal sh[949]: Model loaded in 74.7s (load weights from disk: 15.3s, create model: 1.1s, apply weights to model: 48.5s, load textual inversion embeddings: 1.5s, calculate empty prompt: 8.1s).
Oct 25 01:42:07 ip-20-0-1-115.ec2.internal systemd[1]: start-sdw.service: Deactivated successfully.
Oct 25 01:42:07 ip-20-0-1-115.ec2.internal systemd[1]: start-sdw.service: Consumed 17.646s CPU time.
Oct 25 01:42:27 ip-20-0-1-115.ec2.internal systemd[1]: start-sdw.service: Scheduled restart job, restart counter is at 1.
Oct 25 01:43:56 ip-20-0-1-115.ec2.internal systemd[1]: Stopped Run stable diffusion webui.
Oct 25 01:43:56 ip-20-0-1-115.ec2.internal systemd[1]: start-sdw.service: Consumed 17.646s CPU time.
Oct 25 01:43:56 ip-20-0-1-115.ec2.internal systemd[1]: Started Run stable diffusion webui.

Additional information

No response

w-e-w commented 3 weeks ago

I don't know what you're doing or what's on your instance, and I have practically zero concept of how to use EC2,

but if the issue is simply that those ports are occupied by other instances or other services, then for obvious reasons it will not work.

By default, webui uses the default port range set by gradio, which starts at 7860; it will try up to 100 subsequent ports if the port is not available.

This behavior is controlled by the environment variables GRADIO_SERVER_PORT=7860 and GRADIO_NUM_PORTS=100: GRADIO_SERVER_PORT is the starting port, and GRADIO_NUM_PORTS is the number of subsequent ports to try if it is unavailable, which gives you a port range of 7860~7960.
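
To make the port-range behavior concrete, here is a minimal Python sketch of a "try the next port" scan driven by those two environment variables. It is an illustration under stated assumptions, not gradio's actual code: `candidate_ports` and `first_free_port` are hypothetical helper names, the defaults 7860 and 100 are the values quoted in this comment, and the sketch treats GRADIO_NUM_PORTS as the total number of candidates, so the defaults cover 7860 through 7959 (whether the real upper bound is inclusive is not confirmed here).

```python
import os
import socket


def candidate_ports(env=None):
    """Build the range of ports to try from the two environment variables.

    Assumption: GRADIO_NUM_PORTS counts the total number of candidates,
    starting at GRADIO_SERVER_PORT.
    """
    env = os.environ if env is None else env
    start = int(env.get("GRADIO_SERVER_PORT", "7860"))
    count = int(env.get("GRADIO_NUM_PORTS", "100"))
    return range(start, start + count)


def first_free_port(ports, host="127.0.0.1"):
    """Return the first port in `ports` that can be bound on `host`."""
    ports = list(ports)
    if not ports:
        raise ValueError("no candidate ports given")
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                # Port is occupied (or not permitted); try the next one.
                continue
            return port
    raise OSError(f"Cannot find empty port in range: {ports[0]}-{ports[-1]}")
```

Calling `first_free_port(candidate_ports())` on the instance shows which port, if any, such a scan would settle on. Note that the port is released again as soon as the check finishes, so another process could still take it before the web UI starts.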

Another way of specifying a different port is to use the --port launch command line argument; see the wiki: https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Command-Line-Arguments-and-Settings. Note that --port will only use the specified port and will not try subsequent ports. If you want it to use a range of ports, you must use the environment variables.
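
The difference between a fixed port and a port range also explains the "7860-7860" in the error message, since the console log shows the web UI being launched with --port=7860. The following Python sketch shows the idea; `ports_to_try` and `describe_range` are hypothetical names used only for illustration, and the half-open range for the default case is an assumption rather than confirmed gradio behavior.

```python
def ports_to_try(fixed_port=None, start=7860, count=100):
    """Return the candidate ports for one launch attempt.

    With a fixed port (as with --port) there is exactly one candidate and
    no fallback; otherwise a run of `count` ports beginning at `start`.
    """
    if fixed_port is not None:
        return [fixed_port]
    return list(range(start, start + count))


def describe_range(ports):
    """Format the candidates the way the error message reports them."""
    return f"{min(ports)}-{max(ports)}"


print(describe_range(ports_to_try(fixed_port=7860)))  # prints "7860-7860"
print(describe_range(ports_to_try()))                 # prints "7860-7959"
```

With a single candidate, one busy port is enough to make the launch fail, which matches the first start failing and the later restart succeeding in the report.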