Closed DevOps-Youtube-Channel closed 1 day ago
Hi @DevOps-Youtube-Channel
Please attach etcd conf and log file
cat /etc/etcd/etcd.conf
sudo journalctl -u etcd | head -n 50
P.S. Do all VMs have unique hostnames?
yes you are right, all virtual machines have the same hostname: farrukh
One more question here: should I change the IP addresses and so on? In general, in which files should I change the values or something?
yes you are right, all virtual machines have the same hostname: farrukh
This is not the first time. I will add a pre-check so that the deployment does not start if the hostnames are the same.
One more question here: should I change the IP addresses and so on? In general, in which files should I change the values or something?
In inventory you describe the addresses of your servers and which components will be installed on which servers (e.q., etcd on a dedicated server or on database servers)
Try use the UI console if command-line control seems difficult for you. Doc: https://postgresql-cluster.org/docs/deployment/your-own-machines
for some reason the cluster is being created but there is nothing in operations, what is this connected with?
for some reason the cluster is being created but there is nothing in operations, what is this connected with?
Could you check for an entry in the operations table?
docker exec pg-console psql -U postgres -c "select count(*) from operations"
Please attach Console API log
docker exec pg-console cat /var/log/supervisor/pg-console-api-stdout.log | gzip > pg-console-api-stdout.log.gz
{"level":"error","app":"pg_console","version":"2.0.0","module":"docker_manager","cid":"c67b129f-a894-4fd2-b114-3031782192d3","error":"Post \"http://%2Fvar%2Frun%2Fdocker.sock/v1.45/images/create?fromImage=vitabaks%2Fpostgresql_cluster&tag=latest\": context canceled","docker_image":"vitabaks/postgresql_cluster:latest","time":"2024-11-18T07:56:45Z","message":"failed to pull docker image"} {"level":"error","app":"pg_console","version":"2.0.0","cid":"c67b129f-a894-4fd2-b114-3031782192d3","error":"context canceled","time":"2024-11-18T07:56:45Z","message":"failed to update cluster"}
failed to pull docker image
@DevOps-Youtube-Channel Perhaps there are problems with the Internet? The automation image did not load. Try to pull it manually:
docker pull vitabaks/postgresql_cluster:latest
P.S. I downloaded the log and deleted it from comments.
I use it too
yes it was a problem with the image since I deleted it then downloaded it back and everything worked
I just got one failed from 3 hosts
I also can’t find the haproxy and patroni services installed on the hosts
Judging by the log, the cluster deployment failed because the installation of packages was interrupted. Unfortunately, I don’t have the full log for a detailed error analysis, but I suspect there may be issues with repository access for downloading packages. Try to create the cluster again.
P.S. Please attach logs in text format in the future for easier analysis.
I did everything again from scratch, after line 444 I got this error: 192.168.217.135 this is my virtual machine that is running a docker container, most likely my computer simply does not have enough resources (I think RAM) they want for each VM I allocated 16 GB RAM, 4 CPU
Can you tell me which log file I should attach here?
There is an error at the time of creating a 4GB swap file. Do you have enough disk size on your VMs?
the log that you send as a screenshot does not allow us to see the entire text of the error. Just copy the text content of the log.
Everything was installed without failed, only one thing confuses me that I got it in one place, ignore is it critical? I'm attaching the error:
fatal: [192.168.217.134]: FAILED! => {"msg": "Timeout (62s) waiting for privilege escalation prompt: "} ...ignoring
Everything was installed without failed, only one thing confuses me that I got it in one place, ignore is it critical? I'm attaching the error:
fatal: [192.168.217.134]: FAILED! => {"msg": "Timeout (62s) waiting for privilege escalation prompt: "} ...ignoring
Everything was installed without failed, only one thing confuses me that I got it in one place, ignore is it critical?
The error occurred because Ansible timed out while waiting for a privilege escalation prompt on the host 192.168.217.134. However, the playbook is configured to ignore non-critical errors like this one (...ignoring) and continued the deployment.
The automation code is designed in such a way that any error critical to the functioning of the cluster will not be ignored. If the deployment completed successfully, then everything is fine.
Thanks a lot of
Hello During the installation of the etcd server I receive an error, I am attaching screenshots. Why does this happen?
VM-1-2-3 character: 16 GB RAM, 4 CPU, 20 GB SSD