Once your YouTube video collection grows, it becomes hard to search and find a specific video. That's where Tube Archivist comes in: By indexing your video collection with metadata from YouTube, you can organize, search and enjoy your archived YouTube videos without hassle offline through a convenient web interface. This includes:
For minimal system requirements, the Tube Archivist stack needs around 2GB of available memory for a small testing setup and around 4GB of available memory for a mid to large sized installation. As a minimum, use a dual core CPU with 4 threads; a quad core or better is recommended. This project requires Docker. Ensure it is installed and running on your system.
The documentation has additional user-provided instructions for Unraid, Synology and Podman.
The instructions here should get you up and running quickly. For Docker beginners and a full explanation of each environment variable, see the docs.
Take a look at the example docker-compose.yml and configure the required environment variables.
TubeArchivist:

| Environment Var | Value | State |
|---|---|---|
| TA_HOST | Server IP or hostname | Required |
| TA_USERNAME | Initial username when logging into TA | Required |
| TA_PASSWORD | Initial password when logging into TA | Required |
| ELASTIC_PASSWORD | Password for ElasticSearch | Required |
| REDIS_HOST | Hostname for Redis | Required |
| TZ | Set your timezone for the scheduler | Required |
| TA_PORT | Overwrite Nginx port | Optional |
| TA_UWSGI_PORT | Overwrite container internal uwsgi port | Optional |
| TA_ENABLE_AUTH_PROXY | Enables support for forwarding auth in reverse proxies | Read more |
| TA_AUTH_PROXY_USERNAME_HEADER | Header containing username to log in | Optional |
| TA_AUTH_PROXY_LOGOUT_URL | Logout URL for forwarded auth | Optional |
| ES_URL | URL That ElasticSearch runs on | Optional |
| ES_DISABLE_VERIFY_SSL | Disable ElasticSearch SSL certificate verification | Optional |
| ES_SNAPSHOT_DIR | Custom path where elastic search stores snapshots for master/data nodes | Optional |
| HOST_GID | Allow TA to own the video files instead of container user | Optional |
| HOST_UID | Allow TA to own the video files instead of container user | Optional |
| ELASTIC_USER | Change the default ElasticSearch user | Optional |
| REDIS_PORT | Port that Redis runs on | Optional |
| TA_LDAP | Configure TA to use LDAP Authentication | Read more |
| ENABLE_CAST | Enable casting support | Read more |
| DJANGO_DEBUG | Return additional error messages, for debug only | |
ElasticSearch:

| Environment Var | Value | State |
|---|---|---|
| ELASTIC_PASSWORD | Matching password ELASTIC_PASSWORD from TubeArchivist | Required |
| http.port | Change the port ElasticSearch runs on | Optional |
Always use the latest (the default) or a named semantic version tag for the Docker images. The unstable tags are only for your testing environment; there might not be an update path for these testing builds.
You will see the current version number of Tube Archivist in the footer of the interface. There is a daily version check task querying tubearchivist.com, notifying you of any new releases in the footer. To update, you need to update the docker images, the method for which will depend on your platform. For example, if you're using `docker-compose`, run `docker-compose pull` and then restart with `docker-compose up -d`. After updating, check the footer to verify you are running the expected version.
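The two update commands mentioned above can be wrapped in a small shell function. This is only a sketch: the function name `update_stack` and the `COMPOSE` override variable are hypothetical helpers for illustration, not part of Tube Archivist.

```shell
# Sketch of the update steps described above.
# COMPOSE is a hypothetical override variable; it defaults to the
# docker-compose command named in the text.
update_stack() {
    compose="${COMPOSE:-docker-compose}"
    # Fetch the newest images for the tags used in docker-compose.yml.
    $compose pull || return 1
    # Recreate the containers in the background with the pulled images.
    $compose up -d
}
```

Calling `update_stack` from the directory that contains your docker-compose.yml runs both steps in order and stops if the pull fails.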
Use `bbilly1/tubearchivist-es` to automatically get the recommended version.
`bestvideo[vcodec*=avc1]+bestaudio[acodec*=mp4a]/mp4`
If you have a collision on port `8000`, the best solution is to use Docker's HOST_PORT and CONTAINER_PORT distinction. For example, to change the interface to port 9000, use `9000:8000` in your docker-compose file.
For more information on port collisions, check the docs.
Here is a list of common errors and their solutions.
vm.max_map_count
Elasticsearch in Docker requires the kernel setting `vm.max_map_count` of the host machine to be set to at least 262144.
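Before changing anything, it can help to check whether the host already meets this requirement. The following is a minimal sketch; the function names are hypothetical, and it assumes a Linux host where the current value is exposed at `/proc/sys/vm/max_map_count`.

```shell
# Minimum value named in the text above.
REQUIRED_MAX_MAP_COUNT=262144

# Succeeds when the given value satisfies the minimum.
max_map_count_ok() {
    [ "$1" -ge "$REQUIRED_MAX_MAP_COUNT" ]
}

# Prints the value the running kernel currently uses (Linux only).
current_max_map_count() {
    cat /proc/sys/vm/max_map_count
}
```

For example, `max_map_count_ok "$(current_max_map_count)" || echo "vm.max_map_count is too low"` reports when the setting still needs to be raised.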
To temporarily set the value, run:
sudo sysctl -w vm.max_map_count=262144
How to apply the change permanently depends on your host operating system: either add `vm.max_map_count = 262144` to the file `/etc/sysctl.conf`, or create a file `/etc/sysctl.d/max_map_count.conf` with the content `vm.max_map_count = 262144`.

If you see a message similar to `Unable to access 'path.repo' (/usr/share/elasticsearch/data/snapshot)` or `failed to obtain node locks, tried [/usr/share/elasticsearch/data]` and `maybe these locations are not writable` when initially starting Elasticsearch, that probably means the container is not allowed to write files to the volume.
To fix that issue, shut down the container and on your host machine run:
chown 1000:0 -R /path/to/mount/point
This will match the ownership with the UID and GID of the Elasticsearch process within the container and should fix the issue.
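To confirm that the ownership change took effect, you could compare the owner of the mount point against the expected `1000:0`. This is a sketch under assumptions: the function names are hypothetical, and `stat -c` is the GNU coreutils syntax found on most Linux hosts (BSD and macOS use different flags).

```shell
# UID:GID that the chown command above assigns to the volume.
EXPECTED_OWNER="1000:0"

# Prints "UID:GID" for a path (GNU stat syntax).
path_owner() {
    stat -c '%u:%g' "$1"
}

# Succeeds when the path is owned by the expected UID:GID.
# The second argument is optional and defaults to EXPECTED_OWNER.
owner_matches() {
    [ "$(path_owner "$1")" = "${2:-$EXPECTED_OWNER}" ]
}
```

For example, `owner_matches /path/to/mount/point || echo "ownership still differs"` flags a volume that Elasticsearch may not be able to write to.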
The Elasticsearch index will turn read-only if the disk usage of the container goes above 95%, and it stays that way until the usage drops below 90% again. You will see error messages like `disk usage exceeded flood-stage watermark`.
Similar to that, TubeArchivist will become all sorts of messed up when running out of disk space. There are some error messages in the logs when that happens, but it's best to make sure to have enough disk space before starting to download.
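A quick check of the available disk space before a large download can be sketched as follows. The function names are hypothetical; the 95% threshold is the one mentioned above, and the parsing assumes the POSIX `df -P` output format with a filesystem name that contains no spaces.

```shell
# Flood-stage threshold in percent, taken from the text above.
FLOOD_STAGE_PERCENT=95

# Prints the used percentage (digits only) of the filesystem holding a path.
disk_usage_percent() {
    df -P "$1" | awk 'NR==2 { gsub(/%/, "", $5); print $5 }'
}

# Succeeds when a usage percentage is above the flood-stage threshold.
above_flood_stage() {
    [ "$1" -gt "$FLOOD_STAGE_PERCENT" ]
}
```

For example, `above_flood_stage "$(disk_usage_percent /path/to/mount/point)" && echo "free up disk space first"` warns before Elasticsearch switches the index to read-only.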
error setting rlimit
If you are seeing errors like `failed to create shim: OCI runtime create failed` and `error during container init: error setting rlimits`, this means Docker can't set these limits, usually because they are set elsewhere or are incompatible for other reasons. The solution is to remove the `ulimits` key from the ES container in your docker compose file and start again.
This can happen if you have nested virtualization, e.g. LXC running Docker in Proxmox.
We have come far; nonetheless, we are not short of ideas on how to improve and extend this project. Issues waiting to be tackled by you, in no particular order:
Implemented:
This is a list of useful user scripts, generously created by folks like you to extend this project and its functionality. Make sure to check the respective repository links for detailed license information.
This is your time to shine: read this, then open a PR to add your script here.
`.info.json` files using ffmpeg collecting information from downloaded videos.
The best donation to Tube Archivist is your time. Take a look at the contribution page to get started.
The second best way to support the development is to provide for caffeinated beverages:
This is a selection of places where this project has been featured on Reddit, in the news, on blogs or in any other online media, newest on top.
Big thank you to DigitalOcean for generously donating credit for the tubearchivist.com VPS and build server.