meeb / tubesync

Syncs YouTube channels and playlists to a locally hosted media server
GNU Affero General Public License v3.0

Constantly running into 502 errors #454

Open kamtschatka opened 6 months ago

kamtschatka commented 6 months ago

I have set up tubesync to connect to an external database to improve performance and yet, when I am adding/deleting larger channels, I always get "502 bad gateway" errors.

Changing a single parameter fixes this issue and makes it work just fine: https://github.com/meeb/tubesync/blob/b11b667affcefaeb5eea8c762ba78f8774a84a4c/tubesync/tubesync/gunicorn.py#L25

Can this be updated to a more sensible value?

meeb commented 6 months ago

The issue here is not the time limit for gunicorn request execution. It's that some actions which are slow on larger channels are processed in the active request and not offloaded to a background worker. Even if the timeout were raised to 300 seconds or more, there's still no assurance that absolutely massive channel update actions would complete in time. Efforts to rework this are already underway. In the interim you can use some command line helpers to properly perform actions like deleting a large channel:

$ docker exec -ti tubesync python3 /app/manage.py delete-source [uuid of source here]

Of course, if you want to increase the execution timeout in the interim as well, you're welcome to do so. Note that nginx has its own request timeout too.
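The difference between doing slow work inside the request and handing it to a background worker can be sketched roughly as below. This is only an illustration using the Python standard library, not tubesync's actual task system: `delete_source`, `handle_delete_request`, and the in-process queue are hypothetical names, and a real deployment would use a persistent task backend rather than a thread.

```python
import queue
import threading
import time

# Hypothetical in-process task queue (assumption: stands in for a real,
# persistent background task backend).
tasks = queue.Queue()
results = []


def delete_source(source_id):
    # Stand-in for slow work, such as removing every media row of a large channel.
    time.sleep(0.05)
    results.append(source_id)


def worker():
    # Pull queued jobs forever and run them outside of any web request.
    while True:
        func, arg = tasks.get()
        try:
            func(arg)
        finally:
            tasks.task_done()


def handle_delete_request(source_id):
    # The request handler only enqueues the job and returns immediately,
    # so its duration no longer depends on the size of the channel.
    tasks.put((delete_source, source_id))
    return "202 Accepted"


threading.Thread(target=worker, daemon=True).start()

status = handle_delete_request("example-source")
tasks.join()  # demo only: wait until the background work has finished
print(status, results)
```

With this shape, the request timeout only has to cover the enqueue step, which is why raising it is a stopgap rather than a fix.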

Makr91 commented 2 months ago

Sorry to necro this, but I can't run the command line tool for this:

python3 ./manage.py reset-tasks

does not work for me. Here is what I get:

root@local-tubesync:/app# python3 ./manage.py reset-tasks
2024-05-10 22:55:55,672 [tubesync/INFO] Resettings all tasks...
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/django/db/backends/sqlite3/base.py", line 423, in execute
    return Database.Cursor.execute(self, query, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.OperationalError: no such column: sync_source.delete_files_on_disk

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/./manage.py", line 18, in <module>
    main()
  File "/app/./manage.py", line 14, in main
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python3.11/dist-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python3.11/dist-packages/django/core/management/__init__.py", line 413, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python3.11/dist-packages/django/core/management/base.py", line 354, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/python3.11/dist-packages/django/core/management/base.py", line 398, in execute
    output = self.handle(*args, **options)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/sync/management/commands/reset-tasks.py", line 20, in handle
    for source in Source.objects.all():
  File "/usr/local/lib/python3.11/dist-packages/django/db/models/query.py", line 280, in __iter__
    self._fetch_all()
  File "/usr/local/lib/python3.11/dist-packages/django/db/models/query.py", line 1324, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/django/db/models/query.py", line 51, in __iter__
    results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/django/db/models/sql/compiler.py", line 1175, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/lib/python3.11/dist-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/django/db/backends/utils.py", line 79, in _execute
    with self.db.wrap_database_errors:
  File "/usr/local/lib/python3.11/dist-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/usr/local/lib/python3.11/dist-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/django/db/backends/sqlite3/base.py", line 423, in execute
    return Database.Cursor.execute(self, query, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
django.db.utils.OperationalError: no such column: sync_source.delete_files_on_disk

When I run Reset tasks from the web UI, it actually deletes them and then starts to add them again. I have tried to increase the timeout, both in nginx.conf:

    fastcgi_read_timeout 900;
    proxy_connect_timeout 900s;
    proxy_send_timeout 900s;
    proxy_read_timeout 900s;
    client_body_timeout 900s;
    client_header_timeout 900s;
    send_timeout 900s;
    keepalive_timeout 900s;

and in gunicorn.py:

timeout = 3000

I am a pretty patient person, so I am more than willing to wait for these processes to take a long time.

The problem I am having with the GUI is mentioned in many other issues, but I can't seem to get the command to run longer than 60 seconds, even with all the settings I just mentioned set to 15 minutes (900 seconds) or more.

I had 64 channels, added one, and tried to delete it. The source was deleted, but then all the subsequent tasks were "frozen". So I tried to reset the tasks via the GUI, and now I can't even do it via the CLI.

I think that, if possible, you should update the documentation so that people don't accidentally use the UI to reset tasks and then have to painstakingly recreate every single channel after curating their feed.

Let me know if I can provide any more information to help. I know there are plans to revamp this, so I am willing to help this along.

I am also not afraid of editing the SQL tables and columns. If it's as easy as adding a column to a table, I can do that, but I am not very familiar with Gunicorn and what exactly it's trying to read from and write to the tables for this project.

meeb commented 2 months ago

Somehow your database schema is out of date and you're missing at least one column. A schema update runs automatically on container start, so either you're not running tubesync in a container or you've done a database rollback or something similar. Try running:

python3 manage.py migrate

Then reset-tasks should work. Your error was:

sqlite3.OperationalError: no such column: sync_source.delete_files_on_disk which should be pretty self-explanatory.
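To see why a stale schema produces exactly this message, here is a minimal reproduction with Python's built-in `sqlite3` module. It is a sketch under assumptions: the in-memory `sync_source` table is a heavily simplified stand-in with made-up columns (`uuid`, `name`), and the `ALTER TABLE` statement, including the column type, only approximates what a migration adding the field does. Running `manage.py migrate` remains the supported fix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified stand-in for the table as it looked before the schema update
# (assumption: the real table has many more columns).
conn.execute("CREATE TABLE sync_source (uuid TEXT PRIMARY KEY, name TEXT)")


def column_names(table):
    # PRAGMA table_info returns one row per column; index 1 holds the name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


error = None
try:
    # The newer code asks for a column that the older schema does not have.
    conn.execute("SELECT sync_source.delete_files_on_disk FROM sync_source")
except sqlite3.OperationalError as exc:
    error = str(exc)

print(error)
print(column_names("sync_source"))

# Hand-written approximation of the schema change (column type is assumed).
conn.execute(
    "ALTER TABLE sync_source "
    "ADD COLUMN delete_files_on_disk bool NOT NULL DEFAULT 0"
)

# The same query now succeeds because the column exists.
conn.execute("SELECT sync_source.delete_files_on_disk FROM sync_source")
print(column_names("sync_source"))
```

Listing the columns before and after makes it easy to confirm whether the database has caught up with the code.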

Makr91 commented 2 months ago

I am actually using https://github.com/jdeath/homeassistant-addons repo (which includes tubesync) for Home Assistant, with my own customizations.

It is indeed running in a container. I had gone out of my way to add options to jdeath's repo so that it can connect to the MariaDB Home Assistant add-on, since I was having more issues with SQLite. My Home Assistant disk I/O was overloaded, and moving to an SQL server on another host, in a dedicated VM, solved the performance issues I was having. (I know I was asking too much of spinning rust with all the add-ons I had installed.)

But doing what you said appears to have worked. All video sources are still showing up (thank --insert your holy deity here-- I didn't have to recreate all of them like I did 4 months ago), and all of the tasks are now slowly whittling away again.

I just wasn't sure which table I needed to update, but since I don't have to get my hands dirty with SQL, you saved me some time.

Thank you

meeb commented 2 months ago

tubesync is built with Django, so you can use any Django database management command with tubesync. Django manages the schema, including upgrades, for you, and has extensive, easy-to-read documentation if you're inclined to do things yourself. You can always create a backup of your sources with manage.py dumpdata as well. With SQL errors like no such column: sync_source.delete_files_on_disk, the database table name is sync_source and the missing column name is delete_files_on_disk. Good to hear it was easy to fix for you.
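As a rough sketch of the backup suggestion, the snippet below only assembles a `dumpdata` command line and prints it for review. The app label `sync` is inferred from the `sync_source` table name and the `/app/sync/` path in the traceback; the output file name and the `dumpdata_command` helper are hypothetical, and `--indent` and `--output` are standard Django `dumpdata` options.

```python
import shlex


def dumpdata_command(app_label="sync", output="tubesync-backup.json",
                     manage_py="/app/manage.py"):
    # Build the argument list for Django's dumpdata management command.
    # Assumptions: app label, output file name, and manage.py location.
    return [
        "python3", manage_py, "dumpdata", app_label,
        "--indent", "2",
        "--output", output,
    ]


command = dumpdata_command()
# Print a shell-safe version of the command to run inside the container.
print(shlex.join(command))
```

The resulting JSON fixture can be restored later with Django's `loaddata` command, provided the schema matches.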