Closed: LAMaglan closed this issue 3 months ago.
I do not know anything about podman. The volumes are listed at the end of the docker-compose.yml file here: https://github.com/digitalmethodsinitiative/4cat/blob/3ec9c6ea471bcdbe9fb1caad1e5fe1502a705444/docker-compose.yml#L56 The ${DOCKER_DB_VOL} references whatever is set to that variable in the .env file. Podman appears to be correctly getting their names, so it is likely reading it. Perhaps podman-compose cannot create the volumes automatically and you need to create them yourself. I'd start by looking at differences between podman and docker with regard to volumes.
If you look at the volumes section of each container in the docker-compose.yml file, it specifies where in the container data will be stored by 4CAT:
volumes:
  - 4cat_data:/usr/src/app/data/
  - 4cat_config:/usr/src/app/config/
  - 4cat_logs:/usr/src/app/logs/
In the container, your datasets will be stored in /usr/src/app/data/, config files in /usr/src/app/config/, etc. You can put those directories in mounts, volumes, or even not use volumes at all, leaving the files in the container itself (in Docker, we use volumes as those are kept even if a container is deleted, so your data will persist unless you specifically delete the volumes).
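To see where a named volume actually lives on the host (a generic check, useful when comparing docker and podman behaviour):
# List the named volumes and show where one is mounted on the host
podman volume ls
podman volume inspect 4cat_data   # look at the "Mountpoint" field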
You can let others know how you get 4CAT to work in this issue or if you cannot, but we do not directly support podman.
With some tweaks to the docker-compose.yml file, I managed to get the three containers running with podman-compose. However, when inspecting the container logs, it looks like the frontend is waiting for the backend, and the backend container log shows Invalid port #. To clarify, I have not edited anything in the .env file. I have also ensured that the server I am running this on allows incoming and outgoing traffic on all the ports in the .env. I have also tested and ensured communication between the three containers by running ping from within each to the others (if that makes sense). Any pointers as to what I should try next would be greatly appreciated (I fully understand that 4CAT does not directly support podman or podman-compose).
Could you provide context around the Invalid port #? I have not seen that specifically and am not sure what line of code is throwing that error. It could just be that something is already using one of the ports in the .env file; 80 might already be in use by something else on your server, for example.
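A quick way to check whether anything is already listening on those ports (a generic check, not 4CAT-specific):
# Show listening sockets and the processes holding them,
# filtered to the ports 4CAT wants by default
sudo ss -tlnp | grep -E ':(80|4444|5432|5000)\b'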
Unfortunately, Invalid port # is the only message (repeated) when inspecting the log of the 4cat_backend container. I can inspect which ports allow incoming/outgoing traffic (are open) with firewall-cmd --list-all:
public (active)
target: default
icmp-block-inversion: no
interfaces: ens192
sources:
services: cockpit dhcpv6-client ssh
ports: 3838/tcp 3838/udp 80/tcp 4444/tcp 5432/tcp 5000/tcp 443/tcp 150/tcp 80/udp 4444/udp 5432/udp 5000/udp 150/udp
protocols:
forward: no
masquerade: no
forward-ports:
source-ports:
icmp-blocks:
rich rules:
(I have manually enabled the ports you see above, except for 3838.) None of the ports (including 80) are being used by other processes. I tried switching PUBLIC_PORT in the .env file to a random port (150), but got the same error.
FYI: I am running on a Red Hat distro (RHEL 8.9).
It is unclear from that what is not receiving a correct port. You can try checking the actual logs in the containers; they are located in /usr/src/app/logs/. 4cat.stderr ought to tell you if and why the backend failed to start. The frontend will try to connect to the backend's API, which is running on 4444. Docker creates a network that all the containers use to talk to one another; if the backend is indeed running, then there is some issue where the frontend cannot connect to the backend's API port.
That is to say: it is not important that the ports are exposed on your host machine (though the public port defaulting to 80 needs to be if anyone is to see the frontend webpage outside the host machine), but the network podman creates must have those ports available to connect to amongst the containers.
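One way to test exactly that from inside the frontend container (a sketch; it assumes python3 is on the container's PATH, which it should be for a 4CAT image, and that the backend's compose hostname is backend):
# Try to open a TCP connection to the backend's API port from the
# frontend container; a timeout or refusal points at the network
podman exec 4cat_frontend python3 -c \
  "import socket; socket.create_connection(('backend', 4444), timeout=5); print('ok')"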
This is the code that is failing in the frontend:
from common.lib.helpers import call_api
call_api("worker-status")["response"]["running"]
It may tell you more specifically what is failing if you run it directly in the frontend. You can also check to see if you are actually passing the correct variables as API_HOST and PUBLIC_API_PORT:
from common.config_manager import config
print(config.get('API_HOST'))
print(config.get('API_PORT'))
Docker uses the .env file and passes those variables to the OS environment of the containers. Perhaps podman does not. (e.g. echo "$API_HOST" prints backend.) https://github.com/digitalmethodsinitiative/4cat/blob/59cb19a3c88f7f4a4ac02d0b7a891afde50ea069/docker-compose.yml#L25
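A quick way to dump the relevant environment variables from inside a running container (generic podman usage, not 4CAT-specific):
# Print the 4CAT-related environment variables the container actually sees
podman exec 4cat_backend env | grep -E 'API_HOST|API_PORT|POSTGRES'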
Docker also uses the container names as host names in the network it sets up. API_HOST is set to backend because the container is listed as backend in the docker-compose.yml file.
> You can try checking the actual logs in the containers. They are located in /usr/src/app/logs/. 4cat.stderr ought to tell you if and why the backend failed to start.
The /usr/src/app/logs/ directories in both the frontend and backend containers seem to be empty.
> It may tell you more specifically what is failing if you run it directly in the frontend. You can also check to see if you are actually passing the correct variables as API_HOST and PUBLIC_API_PORT.
> Docker uses the .env file and passes those variables to the OS environment of the containers. Perhaps podman does not.
When I try to run your python code in the frontend container, I get this error:
>>> from common.lib.helpers import call_api
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/src/app/common/lib/helpers.py", line 25, in <module>
from common.config_manager import config
File "/usr/src/app/common/config_manager.py", line 579, in <module>
config = ConfigManager()
File "/usr/src/app/common/config_manager.py", line 28, in __init__
self.load_core_settings()
File "/usr/src/app/common/config_manager.py", line 109, in load_core_settings
raise ConfigException("No config/config.ini file exists! Update and rename the config.ini-example file.")
common.lib.exceptions.ConfigException: No config/config.ini file exists! Update and rename the config.ini-example file.
When I rename config.ini-example to config.ini, it runs (maybe that should have happened automatically?).
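For anyone following along, the rename amounts to this, with the path taken from the traceback above:
# Inside the container: put the example config in place
mv /usr/src/app/config/config.ini-example /usr/src/app/config/config.ini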
When running the call_api function, I get an error:
>>> from common.lib.helpers import call_api
>>> call_api("worker-status")["response"]["running"]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/src/app/common/lib/helpers.py", line 371, in call_api
connection.connect((config.get('API_HOST'), config.get('API_PORT')))
ConnectionRefusedError: [Errno 111] Connection refused
I can verify that the .env variables are being passed to the OS environments of the containers, e.g. by running your python code (in the frontend container).
>>> from common.config_manager import config
>>> print(config.get('API_HOST'))
localhost
>>> print(config.get('API_PORT'))
4444
>>>
Ok, so because your logs are empty and no 4cat.stderr exists, we know that 4CAT itself never attempts to start.
You are correct: the config.ini file would normally be created from config.ini-example by the backend when it runs docker_setup.py (which is clearly not happening). Because you copied it yourself, we actually do not know if the .env variables are being passed to the OS. Basically, the .env variables are passed via Docker to the OS, then docker_setup.py uses them to create a config.ini which is then used by 4CAT. You could try echo $POSTGRES_HOST or any of the .env variables in the OS to see if they are passed. That would have been a better check had I thought of it earlier.
You did not mention this, but does the backend log start with Waiting for postgres...? If not, then it probably has nothing to do with the environment variables, and instead podman is failing to ever build or start the container. If it does, then I am guessing that the backend container starts and gets stuck waiting to connect to the database (I attempted to get the same Invalid port # you mentioned, but was unable to, so I did not originally think this was the problem) here: https://github.com/digitalmethodsinitiative/4cat/blob/50a4434a37d71af6a9470c7fc4a236b043cbfb4d/docker/docker-entrypoint.sh#L27 This could be that the environment variables are not passed properly, or that podman does not properly set up the network for the containers to connect to one another.
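For reference, wait-for-postgres loops in entrypoint scripts generally look something like the sketch below (an illustration, not the exact 4CAT code); if $POSTGRES_PORT arrived empty or malformed, a tool like nc would reject it on every iteration, which would match the repeated error:
# Sketch of a typical wait loop (illustrative, not 4CAT's actual script)
echo "Waiting for postgres..."
# if $POSTGRES_PORT is empty or not numeric, nc rejects it each time
until nc -z "$POSTGRES_HOST" "$POSTGRES_PORT"; do
  sleep 1
done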
> You could try echo $POSTGRES_HOST or any of the .env variables in the OS to see if they are passed.
The environment variables from .env do seem to get passed. Here from the backend container:
[luigiam-drift@itfds-shinyutv 4cat_first_try]$ podman exec -it 4cat_backend bash
root@0c3527fd7af0:/usr/src/app# echo $POSTGRES_HOST
db
root@0c3527fd7af0:/usr/src/app# echo $POSTGRES_DB
fourcat
> You did not mention this, but does the backend log start with Waiting for postgres...?
Actually, it does show Waiting for postgres... (see below):
Waiting for postgres...
invalid port #
invalid port #
invalid port #
You might be right that there is something wrong with the podman network rather than the environment variables, although I can ping each container (i.e. ping 4cat_backend from 4cat_db, and so on).
How do you know they can connect exactly? Because Docker sets host names for the containers as db, backend, and frontend per their keys in the docker-compose.yml, instead of the container names (no idea why, but hey, beggars cannot be choosers). If their host names in podman are actually 4cat_db, 4cat_backend, and 4cat_frontend, then that may be the issue!
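One way to check which names actually resolve on the compose network (a generic check; getent is usually present in Debian-based images, so verify it exists in yours):
# From inside the frontend container, see which hostnames resolve:
# the compose service names (db, backend) or the container names
podman exec 4cat_frontend getent hosts db backend 4cat_db 4cat_backend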
Now we're cooking with gas! It looks like that very well may be the issue.
If you update the .env and set POSTGRES_HOST=4cat_db and API_HOST=4cat_backend and then rebuild the containers, we might fix it. You'll probably need to delete the config.ini file you created, as docker_setup.py will see it and try updating it as opposed to creating it.
I have not played with changing those variables, so you may run into some other related issue, but I think everything will work fine. That's why we made them variables, after all.
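Concretely, that would look something like the following (a sketch; it assumes the variable names in your .env match the defaults):
# Point the services at podman's container names instead of the
# compose service names (assumes default variable names in .env)
sed -i 's/^POSTGRES_HOST=.*/POSTGRES_HOST=4cat_db/' .env
sed -i 's/^API_HOST=.*/API_HOST=4cat_backend/' .env
# Recreate the stack so the new values are picked up
podman-compose down
podman-compose up -d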
> How do you know they can connect exactly?
I am running ping based on the container names, e.g. from the 4cat_db container, I run:
ping 4cat_backend
> If their host names in podman are actually 4cat_db, 4cat_backend, and 4cat_frontend then that may be the issue!
After some troubleshooting and "research", it seems like the default behaviour of podman is to assign the container IDs as the hostnames:
[luigiam-drift@itfds-shinyutv 4cat_first_try]$ podman inspect -f '{{.Config.Hostname}}' 4cat_db
db6d6a226b44
[luigiam-drift@itfds-shinyutv 4cat_first_try]$ podman inspect -f '{{.Config.Hostname}}' 4cat_backend
18277dba4612
[luigiam-drift@itfds-shinyutv 4cat_first_try]$ podman inspect -f '{{.Config.Hostname}}' 4cat_frontend
910a2bf4d396
I will try your suggestion first (adjusting .env). If that does not work, I will try to figure out if I can set the hostnames before running podman-compose. Will keep you updated. Thanks for the pointers so far.
Let us know how it goes!
After trying out your adjustments, and what others have suggested, I still run into the same problem.
Looking a bit more closely at what exactly gets run in 4CAT, I think it is not compatible with podman. Since podman is daemonless (unlike docker), running https://github.com/digitalmethodsinitiative/4cat/blob/master/4cat-daemon.py does not work properly (which gets used in the backend?).
Is this part of the codebase (i.e. 4cat-daemon.py) essential for running 4CAT through containers? If it is, then most likely I will have to abandon trying to get it to run with podman.
Have you checked your logs since the changes? The 4cat logs?
Yes, that “is” 4CAT, but technically the last while loop runs continuously until receiving a SIGTERM/kill command, keeping the container from completing, so the daemonless aspect ought not to come into play here as I understand it.
The podman logs show the same invalid port # for the 4cat_backend container when running podman-compose after any of the following:
a) updating the .env file to use the hostnames that podman creates
b) explicitly setting the hostnames in the docker-compose.yml file to match what is in .env (this is a podman-specific thing)
c) setting --net=host in the docker-compose.yml and adjusting the .env accordingly (I was wondering if there was an issue with the container network)
FYI: I have separately managed to get 4CAT to run locally on my computer (Ubuntu 22.04) using docker and docker-compose (not podman or podman-compose). Also, I have just recently managed to get it running on the "server" (Red Hat 8.9) with a manual installation (i.e. not containers).
I will put this on hold for now, but will update it if I manage to get further
For full clarity, here is what I ended up with when trying podman(-compose):
docker-compose.yml
version: '3.6'

services:
  db:
    container_name: 4cat_db
    image: postgres:${POSTGRES_TAG}
    restart: unless-stopped
    hostname: db
    # Note: if network_mode host, need to adjust `.env` (relevant variables set to "localhost")
    # network_mode: host
    environment:
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=${POSTGRES_DB}
      - POSTGRES_HOST_AUTH_METHOD=${POSTGRES_HOST_AUTH_METHOD}
    volumes:
      - 4cat_db:/var/lib/postgresql/data/
    healthcheck:
      test: [ "CMD-SHELL", "pg_isready -U $${POSTGRES_USER}" ]
      interval: 5s
      timeout: 5s
      retries: 5

  backend:
    image: digitalmethodsinitiative/4cat:${DOCKER_TAG}
    container_name: 4cat_backend
    restart: unless-stopped
    hostname: backend
    # Note: if network_mode host, need to adjust `.env` (relevant variables set to "localhost")
    # network_mode: host
    env_file:
      - .env
    depends_on:
      db:
        condition: service_healthy
    ports:
      - ${PUBLIC_API_PORT}:4444
    volumes:
      - 4cat_data:/usr/src/app/data/
      - 4cat_config:/usr/src/app/config/
      - 4cat_logs:/usr/src/app/logs/
    entrypoint: ['docker/docker-entrypoint.sh']

  frontend:
    image: digitalmethodsinitiative/4cat:${DOCKER_TAG}
    container_name: 4cat_frontend
    restart: unless-stopped
    hostname: frontend
    # Note: if network_mode host, need to adjust `.env` (relevant variables set to "localhost")
    # network_mode: host
    env_file:
      - .env
    depends_on:
      - db
      - backend
    ports:
      - ${PUBLIC_PORT}:5000
      - ${TELEGRAM_PORT}:443
    volumes:
      - 4cat_data:/usr/src/app/data/
      - 4cat_config:/usr/src/app/config/
      - 4cat_logs:/usr/src/app/logs/
    command: ['docker/wait-for-backend.sh']

volumes:
  4cat_db:
    name: ${DOCKER_DB_VOL}
    # Note: additional path necessary for podman, but needs to be removed for docker
    path: /var/lib/postgresql/data/
  4cat_data:
    name: ${DOCKER_DATA_VOL}
    # Note: additional path necessary for podman, but needs to be removed for docker
    path: /usr/src/app/data/ # podman
  4cat_config:
    name: ${DOCKER_CONFIG_VOL}
    # Note: additional path necessary for podman, but needs to be removed for docker
    path: /usr/src/app/config/
  4cat_logs:
    name: ${DOCKER_LOGS_VOL}
    # Note: additional path necessary for podman, but needs to be removed for docker
    path: /usr/src/app/logs/ # podman
The main changes with respect to the original docker-compose.yml are marked with comments above (the hostname: entries and the additional path: keys under volumes). Additionally, in the .env file, I had to change any port numbers lower than 1024 to something higher than 1024 (i.e. PUBLIC_PORT and TELEGRAM_PORT); otherwise, podman complains about "permission" issues (one could alternatively allow traffic on these ports on the host that podman is running on).
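As a further alternative for rootless podman, many Linux systems let you lower the threshold below which ports count as privileged, so the original port 80 can be kept (a system-wide setting; use with care):
# Allow unprivileged processes to bind ports from 80 upward
sudo sysctl net.ipv4.ip_unprivileged_port_start=80
# To persist across reboots:
echo 'net.ipv4.ip_unprivileged_port_start=80' | sudo tee /etc/sysctl.d/99-podman-ports.conf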
This modified docker-compose.yml and .env file seem to work with docker too (although you have to get rid of the extra "path" entries in the volumes section); you just have to specify the port number when visiting the frontend, e.g. localhost:<port number from .env>.
I think the issue with podman is that it doesn't work well with 4cat-daemon.py, or it could be an issue with communication between the containers, but I'm really not sure (see the thread above for more details).
I finally got this working. I will update this with full details at some point.
TL;DR: With a colleague who is much more experienced with podman, we found out that part of the issue was 1) DNS and 2) it is better to use podman with docker-compose rather than podman-compose.
Setting port numbers higher than 1024 is only required if running podman rootless (if running as root, no changes need to be made).
Here are the general steps (the specific commands are for Red Hat):
1) Install podman-plugins for DNS resolution
dnf install podman-plugins
2) Start podman.service
sudo systemctl start podman.service
3) Install the podman-docker package to provide a compatibility layer between Docker and podman
sudo yum install podman-docker
4) Install the newest version of docker-compose
curl -L "https://github.com/docker/compose/releases/download/v[version]/docker-compose-linux-x86_64" -o docker-compose
sudo mv docker-compose /usr/local/bin && sudo chmod +x /usr/local/bin/docker-compose
sudo ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose
5) Sort out permissions for docker-compose
chmod +x docker-compose-linux-x86_64
6) Run docker-compose as sudo
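With podman-docker installed and podman.service running, docker-compose talks to podman through its Docker-compatible API socket; depending on the distro you may need to point docker-compose at that socket explicitly (the path below is the usual rootful default, so verify it on your system):
# Rootful podman exposes its Docker-compatible socket here by default
export DOCKER_HOST=unix:///run/podman/podman.sock
sudo -E docker-compose up -d   # -E keeps DOCKER_HOST in the sudo environment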
Some additional "security" steps can be taken with podman, such as creating a dedicated user (e.g. 4cat) that runs the containers (you would then need to map the UID/GID in docker-compose.yml).
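A sketch of that kind of setup, assuming a dedicated 4cat user (subordinate UID/GID ranges and lingering are standard requirements for rootless podman; the ranges below are examples, pick ones not already in use):
# Create a dedicated user and give it subordinate UID/GID ranges
sudo useradd -m 4cat
echo "4cat:100000:65536" | sudo tee -a /etc/subuid /etc/subgid
# Let the user's containers keep running without an active login session
sudo loginctl enable-linger 4cat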
Thanks for sharing this with us! Closing the issue now, as we currently do not intend to support Podman more directly, but with these notes people who want to use it nevertheless can hopefully do so successfully.
Describe the bug
I am trying to install 4CAT. Due to policy at my work institution, we have to use podman-compose (i.e. podman) instead of docker/docker-compose. The 4CAT image is pushed to our institution's image registry (harbor.uio.no). I ran podman-compose up -d (in the directory containing the docker-compose.yml and .env) after doing a git clone. The containers are up and running, as I can see from podman ps. However, there were some errors shown in the podman-compose logs, specifically [error output elided]. I can see that these are mentioned in the .env but cannot see them in the docker-compose.yml file. Is there something else I should have done when running podman-compose? Note: I get the same error when running the podman-compose command with sudo. See below for the full compose log.
Screenshots, links to datasets, and any additional context
Full podman-compose log