Hi @amir-bialek, you raised several questions in this issue; if I didn't cover all of them, please let me know.
Disabling PostgreSQL - You should use PostgreSQL when running Artifactory on k8s. We do not support k8s deployments with the Derby database (the default database configured).
Autoscaling - You should not use it, as it only works when you have a valid Enterprise/Enterprise Plus license, which supports High Availability deployments.
If you decide to use the ingress method and disable Nginx, you should install an nginx-ingress controller in your cluster (see the values sketch below). https://jfrog.com/help/r/jfrog-installation-setup-documentation/run-ingress-behind-another-load-balancer
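To tie these points together, here is a minimal values sketch for the artifactory-cpp-ce chart. It is only an illustration: the postgresql.enabled and autoscaling.enabled keys and their placement under the top-level artifactory key are assumptions based on the upstream artifactory chart, and the className is a placeholder, so verify everything against the values.yaml of your chart version.

```yaml
# Sketch only - verify key names against your chart version's values.yaml.
artifactory:
  postgresql:
    enabled: true        # run with the bundled PostgreSQL instead of the default Derby
  autoscaling:
    enabled: false       # autoscaling needs an Enterprise/Enterprise Plus (HA) license
  nginx:
    enabled: false       # drop the bundled Nginx...
  ingress:
    enabled: true        # ...and expose Artifactory through your own ingress controller
    className: "nginx"   # placeholder: use the class of the nginx-ingress controller you installed
```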
Hey @gitta-jfrog
Thank you for the reply.
Understood, thank you. I am now deploying a new Artifactory with the new helm chart + PostgreSQL, and will try to import the data from the 'live' Artifactory into the 'new' Artifactory, then make the switch (system backup and restore). Can you advise why the default PostgreSQL PVC is 200GB but the Artifactory PVC is only 20GB? Shouldn't it be the opposite?
Understood, thank you.
An Nginx ingress controller is installed on the cluster and the ingress to the Conan svc is working well. Note that the specific call is happening without the ingress: it comes from another svc in k8s, so it calls Conan directly at artifactory.default.svc.cluster.local:8082.
Indeed, the defaults here should be tuned; you can change them according to your needs. Assuming you are storing your binaries on the PVC itself, the Artifactory filestore will definitely be bigger than the DB size, so something like the sketch below makes more sense.
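A minimal sketch of the flipped sizes, using the same persistence keys that appear later in this thread; the 200Gi/20Gi figures are just examples, so size the volumes to your actual binary and metadata footprint:

```yaml
artifactory:
  artifactory:
    persistence:
      size: 200Gi   # filestore: the binaries live here, so give it the larger volume
  postgresql:
    persistence:
      size: 20Gi    # database: metadata only, usually much smaller than the filestore
```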
I understand, so your client is reaching the Artifactory svc directly. I think the 503 errors you are seeing might be related to the resources allocated to the Artifactory service. What is the size of the node running the Artifactory pod? Can you see pod restarts? Is there anything in artifactory-service.log (/opt/jfrog/artifactory/var/log) that indicates resource exhaustion or crashing of the JVM? How many incoming requests are you running in parallel?
Hey, Artifactory is running on worker1, which has plenty of resources. kubectl top node gives me:

```
NAME      CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
master    828m         6%     5952Mi          37%
worker1   2286m        6%     17547Mi         30%
```
There are no pod restarts or any special errors, other than:

```
2024/07/07 11:47:04 httputil: ReverseProxy read error during body copy: stream error: stream ID 673843; CANCEL; received from peer
```

which I do see a lot.
Looking at the Grafana dashboard for pod resources, the CPU and memory are steady; I do not see any jump in the past 5 days. At the moment the pod has no resource requests and limits (default settings). I did try to add:
```yaml
artifactory:
  resources:
    requests:
      memory: "1Gi"
      cpu: "500m"
    limits:
      memory: "6Gi"
      cpu: "1"
  javaOpts:
    xms: "1g"
    xmx: "5g"
```
And I verified in the logs that it received the new Xmx, but it still reproduces the 503.
I do see a jump in Bandwidth and Packets in the Grafana graphs.
Hey, after bringing up the new Artifactory with the following values:
```yaml
artifactory:
  nginx:
    enabled: false
  ingress:
    enabled: true
    className: "my-class"
    hosts:
      - my-host1
      - my-host2
    annotations:
      nginx.ingress.kubernetes.io/proxy-body-size: "0"
    tls:
      - secretName: dev-cert
        hosts:
          - my-host1
          - my-host2
  nameOverride: artifactory
  fullnameOverride: artifactory
  artifactory:
    persistence:
      accessMode: ReadWriteOnce
      size: 200Gi
  postgresql:
    persistence:
      enabled: true
      size: 20Gi
```
I do not see the error, but please leave this case open for another 2-3 days so that I can verify the problem is solved by using PostgreSQL.
Great, I'm glad you managed to move to PostgreSQL. Running Artifactory with PostgreSQL allows multiple connections to the DB (compared to the single connection allowed when using Derby), and that should improve the system behavior. I'll keep this open for the next few days.
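If the parallel load ever grows to the point where the DB pool itself becomes the bottleneck, the connection pool size can also be tuned. The snippet below is only a sketch: the maxOpenConnections key and its placement come from the upstream artifactory chart and may differ in your artifactory-cpp-ce version, so check your chart's values.yaml before applying it.

```yaml
# Sketch only - verify the key against your chart version's values.yaml.
artifactory:
  artifactory:
    database:
      maxOpenConnections: 80   # example pool size; raise it if many clients hit Artifactory in parallel
```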
Thanks
Is this a request for help?:
Version of Helm and Kubernetes: 1.29
Which chart: artifactory-cpp-ce 107.77.8
Which product license (Enterprise/Pro/oss): Community
JFrog support reference (if already raised with support team):
What happened: Deployed the artifactory-cpp-ce helm chart on on-prem k8s with the following values:
Services are accessing Conan directly with:
artifactory.default.svc.cluster.local
And I am getting this error too often:
Recently it is happening too many times to ignore. In the logs I see:
I can try to add:
(by default it is set to false)
And / or add postgresql:
And / or add nginx:
And/or update the helm chart to version 107.84.16; currently it is on 107.77.8.
The thing is, the whole software team is using this DB, so every change will block them.
Also, can someone advise what the use of postgresql is in this chart?
If anyone can help, I would appreciate it.