hummingbot / dashboard

Application that helps you create, backtest, deploy, and manage Hummingbot instances
Apache License 2.0

OSError: [Errno 24] Too many open files #87

Open BuddhaSource opened 8 months ago

BuddhaSource commented 8 months ago

Describe the bug

OSError: [Errno 24] Too many open files: '/home/ubuntu/dashboard/pages/bot_orchestration/README.md'
Traceback:
File "/home/ubuntu/miniconda3/envs/dashboard/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 541, in _run_script
File "/home/ubuntu/dashboard/pages/bot_orchestration/app.py", line 24, in <module>
File "/home/ubuntu/dashboard/utils/st_utils.py", line 23, in initialize_st_page
File "/home/ubuntu/miniconda3/envs/dashboard/lib/python3.10/pathlib.py", line 1134, in read_text
File "/home/ubuntu/miniconda3/envs/dashboard/lib/python3.10/pathlib.py", line 1119, in open

Hello everyone. I am running 2 bot instances, and every few hours I get this error and have to restart Streamlit. I tried rebooting the machine, too.

Steps to reproduce bug

  1. Wrote a simple custom strategy and compiled it from source (v1.20)
  2. Built and pushed a custom Docker image
  3. Set up Dashboard on a new instance
  4. My instance is on Google Cloud: Ubuntu with 24 GB RAM and an 8-core CPU
  5. Started bot instance from the custom docker image URL
  6. Ran sudo chmod 666 /var/run/docker.sock
  7. Access the docker instance to configure the bot the first time
  8. Start the bots from the dashboard
  9. All runs smooth
  10. After a few hours (~12 h), when I come back to the dashboard, I see the "Too many open files" error
  11. I have to restart Streamlit to fix it.
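Before restarting, it can help to confirm a descriptor leak by comparing the process's open-fd count against its soft limit. This is a Linux-only diagnostic sketch (not part of the dashboard code) using `/proc` and the stdlib `resource` module; it inspects its own PID, so to check Streamlit you would run it inside that process or substitute the Streamlit PID:

```python
import os
import resource

def open_fd_count(pid: int) -> int:
    """Count open file descriptors for a PID via /proc (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))

# the soft limit is what triggers OSError: [Errno 24] when exceeded
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open fds: {open_fd_count(os.getpid())} / soft limit: {soft}")
```

If the count climbs steadily toward the soft limit between page loads, something is leaking descriptors rather than the limit simply being too low.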
BuddhaSource commented 8 months ago

Got some recommendations from ChatGPT about the problem.

Here are some steps and considerations specific to Streamlit:

  1. State and Sessions: Streamlit is known to re-run the entire script every time an input changes. If you're creating new connections or opening files in your script without closing them, this could quickly exhaust your file descriptor limit. Consider if there's a way to manage resources more effectively given Streamlit's execution model.
  2. Optimize Resource Management: Make sure to close resources like files, database connections, or network connections when they are no longer needed. If you're frequently opening connections (like the MQTT connections indicated in your traceback), consider using a connection pool or a caching mechanism.
  3. Streamlit Caching: Streamlit provides a caching mechanism. If you're making expensive or resource-intensive calls, consider using @st.cache to cache the results. This can help reduce the number of times a resource-intensive function is called.
import streamlit as st

@st.cache
def expensive_function(path):
    # open the file in a context manager so the descriptor is
    # released as soon as the read finishes
    with open(path) as f:
        return f.read()
tonymontanov commented 8 months ago

I faced a similar problem. Sometimes refreshing the page helps. Please tell me, could this be due to the large number of instances created? The instances are not even running.

BuddhaSource commented 7 months ago

The problem gets solved once you restart Streamlit, so one guess is that we are not closing the MQTT connections before opening new ones. On Linux, new connections also count as open file descriptors.

We should review the architecture. For example, if the app opens a new MQTT client connection on each rerun without closing the previous ones, this could rapidly exhaust file descriptors.
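The fix pattern would be to cache one long-lived client and hand it out on every rerun instead of constructing a new one. A minimal sketch of that pattern, with hypothetical names and a dummy class standing in for the real MQTT client (which holds a socket, and therefore a file descriptor):

```python
import threading

class DummyClient:
    """Stand-in for an MQTT client; the real one owns a socket (an fd)."""
    instances = 0

    def __init__(self):
        DummyClient.instances += 1

    def close(self):
        pass  # the real client would close its socket here

_lock = threading.Lock()
_client = None

def get_client():
    """Return one shared connection instead of opening a new one per rerun."""
    global _client
    with _lock:
        if _client is None:
            _client = DummyClient()
        return _client
```

In a Streamlit app the same effect could be achieved by wrapping the client factory in Streamlit's resource cache, so reruns share one connection per process.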

MohanaRuban619 commented 3 months ago

Is there any fix for this bug? I am facing this error when more than 5 or 6 instances are running at a time.

AzothZephyr commented 4 weeks ago

the os limits the number of files that can be open, both system-wide and per user, at any given time. you can raise the per-user limit by editing /etc/security/limits.conf.

run:

sudo vim /etc/security/limits.conf

if these values already exist uncommented, modify the value in the fourth column; otherwise add the following lines to the file:

*    soft     nproc          65535
*    hard     nproc          65535
*    soft     nofile         65535
*    hard     nofile         65535

this changes the per-user limit on the number of open files on ubuntu and other systems using pam_limits; note it only takes effect for new login sessions. if you run into the same issue again, bump the value in the fourth column higher.
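One way to check that the raised limit actually applies to a given process is Python's stdlib `resource` module (a sketch; a process may also raise its own soft limit up to the hard limit, without touching limits.conf):

```python
import resource

# soft limit is what the process can actually use; hard limit is the ceiling
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"nofile: soft={soft} hard={hard}")

# raise our own soft limit to the hard limit (guard against an
# unlimited hard limit, which RLIMIT_NOFILE cannot be set to)
new_soft = hard if hard != resource.RLIM_INFINITY else 65535
resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
```

This only affects the calling process and its children, so it could be done at dashboard startup as a stopgap while the descriptor leak itself is investigated.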

AzothZephyr commented 4 weeks ago

it might also be worth reviewing the root cause here. i'm new to this repo, but i wonder whether streamlit is improperly handling file closure. will keep an eye out as i deploy more bots.