wattsap opened 1 year ago
Based on various general googling, I have made the following changes to the config (`git config`) for the repo in /data/git/repositories/user/repo.git
I guess it doesn't help, right?
Prior to adding the [pack] section to the config, I was seeing "Error Signal 9" errors in the gitea server logs, which is what made me think it was OOM. After changing the config those messages are gone, which is encouraging, but on the client side the result is still the same during a clone.
I haven't tested this and have no idea how to fine-tune the config at the moment (sorry), just sharing some of my thoughts: it seems that the git process itself causes the OOM (otherwise the Gitea process would have been killed). Gitea executes the git command to provide the repository content when cloning; if the git process triggers OOM and gets killed, the client sees a broken connection / protocol error. Maybe Gitea also consumes some amount of the memory, so the free memory for git is not as much as before?
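If the kernel OOM killer really is terminating `git pack-objects`, it usually leaves a trace in the kernel log. A quick check, assuming you can read the kernel ring buffer from the k3s node (or from a privileged pod); the exact flags depend on which `dmesg` is installed:

```bash
# Look for OOM-killer activity on the node hosting the gitea pod.
dmesg -T | grep -iE 'out of memory|oom-kill|killed process'

# On systemd-based nodes the same information is usually in the journal:
journalctl -k | grep -iE 'oom|killed process'
```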
I wondered that also, but I was running `top` on the pod during the clone and it still had plenty of memory:
```
Mem: 16228700K used, 165240K free, 41992K shrd, 161464K buff, 13025436K cached
CPU:  52% usr   4% sys   0% nic   0% idle  42% io   0% irq   0% sirq
Load average: 1.77 0.68 0.30 4/484 122
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
  121   120 git      R    1513m   9%   1  43% /usr/libexec/git-core/git pack-objects --revs --thin --stdout --progress --delta-base-offset
   18    16 git      S     853m   5%   0   0% /usr/local/bin/gitea web
  120    18 git      S     5428   0%   1   0% /usr/bin/git -c protocol.version=2 -c credential.helper= -c filter.lfs.required= -c filter.lfs.smudge= -c filter.lfs.clean= upload-pack --stateless-rpc /data/git/repositories/user/repo.git
   17    15 root     S     4632   0%   1   0% sshd: /usr/sbin/sshd -D -e [listener] 0 of 10-100 startups
   75    68 root     S     2596   0%   0   0% bash
  111   104 root     S     2592   0%   1   0% bash
```
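In case it helps, a slightly more targeted way to watch just the pack process than eyeballing `top` is a small sampling loop like the one below. This is only a sketch; it assumes a procps-style `ps` is available in the container (the Alpine-based gitea image ships the more limited busybox `ps` by default):

```bash
# Sample memory use of any running git pack-objects process once per second
# while a clone is in progress. Assumes procps-style ps, not busybox ps.
while pid=$(pgrep -f 'git pack-objects' | head -n1); [ -n "$pid" ]; do
    ps -o pid,rss,vsz,args -p "$pid"
    sleep 1
done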
It doesn't seem like the container OS is running out of memory.
I see. Can you try changing the kernel setting `vm.overcommit_memory=1`?
I didn't set it that way, but it looks like it already is:
```
bash-5.1# cat /proc/sys/vm/overcommit_memory
1
```
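For reference, `vm.overcommit_memory` is a node-level kernel setting, so changing it from inside the container has no lasting effect; it would be set on the k3s host itself. A minimal sketch, assuming a typical Linux node with sysctl available:

```bash
# On the k3s node (not inside the pod): check and set the overcommit policy.
cat /proc/sys/vm/overcommit_memory
sudo sysctl -w vm.overcommit_memory=1

# Persist across reboots:
echo 'vm.overcommit_memory = 1' | sudo tee /etc/sysctl.d/90-overcommit.conf
sudo sysctl --system
```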
Description
Hello,
I am seeing an interesting issue with cloning a repo from gitea hosted on K3s. I recently migrated from a standalone docker VM as part of a larger migration.
The repo in question is 694 MiB, and after the migration, when attempting to clone the repo to another VM outside the K3s cluster, I see the error below:
```
/usr/bin/git clone --origin origin 'https://user:pass@gitea.company.net/user/repo.git' /var/lib/awx/projects/_8__homelab
Cloning into '/var/lib/awx/projects/_8__homelab'...
remote: Enumerating objects: 4997, done.
fatal: the remote end hung up unexpectedly
fatal: protocol error: bad pack header
```
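To see exactly where the transfer dies from the client side, git's standard tracing environment variables can be enabled for a single clone; the output is verbose, so redirecting stderr to a file is usually worthwhile:

```bash
# Verbose client-side tracing of the clone over HTTP(S).
GIT_TRACE=1 \
GIT_TRACE_PACKET=1 \
GIT_CURL_VERBOSE=1 \
git clone https://gitea.company.net/user/repo.git 2> clone-trace.log
```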
The pod's CPU and memory limits are pretty reasonable: `containers:`
and when the clone is attempted, I can see it is not trying to cross those thresholds (the green line indicates the resource requests, not the resource limits):
Thinking it might be something with the ingress-nginx controller, I tested from other pods in the cluster and got the same result hitting the IP of the ClusterIP service directly, bypassing the ingress/external routing.
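For completeness, this is roughly how the in-cluster test can be reproduced from a throwaway pod; the namespace, service name, port, and image below are placeholders for this environment, not values taken from the actual cluster:

```bash
# Clone directly against the ClusterIP service, bypassing the ingress.
# Namespace, service name, port, and image are placeholders.
kubectl -n gitea run git-clone-test --rm -it --image=alpine/git -- \
    clone http://gitea-http.gitea.svc.cluster.local:3000/user/repo.git /tmp/repo
```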
In the gitea pod logs, I see the below during the clone attempt:

```
2023/04/08 17:06:52 [64319e8c-24] router: slow POST /user/repo.git/git-upload-pack for 10.43.3.3:0, elapsed 3981.0ms @ repo/http.go:492(repo.ServiceUploadPack)
2023/04/08 17:06:59 [64319f33] router: completed GET / for 10.43.1.248:57104, 200 OK in 15.6ms @ web/home.go:33(web.Home)
2023/04/08 17:07:09 [64319f3d] router: completed GET / for 10.43.1.248:33618, 200 OK in 4.8ms @ web/home.go:33(web.Home)
2023/04/08 17:07:19 [64319f47] router: completed GET / for 10.43.1.248:57398, 200 OK in 54.5ms @ web/home.go:33(web.Home)
2023/04/08 17:07:29 [64319f51] router: completed GET / for 10.43.1.248:35500, 200 OK in 4.5ms @ web/home.go:33(web.Home)
2023/04/08 17:07:37 [64319f59] router: completed GET / for 10.43.3.3:0, 200 OK in 32.6ms @ web/home.go:33(web.Home)
2023/04/08 17:07:39 [64319f5b] router: completed GET / for 10.43.1.248:39398, 200 OK in 4.0ms @ web/home.go:33(web.Home)
2023/04/08 17:07:49 [64319f65] router: completed GET / for 10.43.1.248:60854, 200 OK in 4.9ms @ web/home.go:33(web.Home)
2023/04/08 17:07:50 [64319f28-3] router: completed POST /user/repo.git/git-upload-pack for 10.43.3.3:0, 200 OK in 61683.7ms @ repo/http.go:492(repo.ServiceUploadPack)
```
Based on various general googling, I have made the following changes to the config for the repo in /data/git/repositories/user/repo.git:

```
bash-5.1# cat config
[core]
	repositoryformatversion = 0
	filemode = true
	bare = true
	packedGitLimit = 256m
[pack]
	windowMemory = 100m
	packSizeLimit = 100m
	threads = "1"
[http]
	postBuffer = 200000000
bash-5.1#
```
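For anyone wanting to apply the same settings non-interactively, the equivalent `git config` invocations would look like the following (same values as above; whether these are the right values is a separate question):

```bash
# Apply the same memory-limiting settings to the bare repo on the server.
# Run as the git user so file ownership in the repo is preserved.
cd /data/git/repositories/user/repo.git
git config core.packedGitLimit 256m
git config pack.windowMemory 100m
git config pack.packSizeLimit 100m
git config pack.threads 1
git config http.postBuffer 200000000
```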
The repo itself appears to be healthy; running `git fsck` on the gitea pod for the repo comes back successfully:
```
bash-5.1# git fsck --full
Checking object directories: 100% (256/256), done.
Checking objects: 100% (3271/3271), done.
```
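One server-side experiment that sometimes helps in low-memory situations is repacking the repository ahead of time with bounded memory, so that `pack-objects` can mostly reuse existing deltas during a clone instead of recomputing them. A sketch, assuming it is run as the git user inside the pod:

```bash
# Repack the bare repo with bounded memory so clones can reuse the packs.
cd /data/git/repositories/user/repo.git
git repack -a -d --window-memory=100m --max-pack-size=100m

# Sanity check afterwards: loose objects, packs, and garbage counts.
git count-objects -v
```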
I'm not really sure where to go from here from a debugging perspective; I'm afraid I don't know enough about the git protocol in general. Is there a tunable I am missing in the gitea config?
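On the gitea side, app.ini does have a `[git.timeout]` section that controls how long Gitea allows the underlying git operations to run; if the pack step is merely slow (the upload-pack above took ~62s) rather than being killed, raising those limits might be relevant. A sketch only, with illustrative values rather than recommendations, using the default app.ini path of the gitea/gitea container (a restart is needed afterwards):

```bash
# Append illustrative git timeout settings (in seconds) to Gitea's app.ini.
cat >> /data/gitea/conf/app.ini <<'EOF'
[git.timeout]
DEFAULT = 360
CLONE   = 600
PULL    = 600
GC      = 120
EOF
```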
Thank you very much for your time.
Gitea Version
1.18.1
Can you reproduce the bug on the Gitea demo site?
No
Log Gist
No response
Screenshots
No response
Git Version
2.25.1
Operating System
Kubernetes
How are you running Gitea?
Bare metal k3s cluster, gitea docker image gitea/gitea:1.18.1, postgres as the backend DB. Gitea and postgres are running as separate StatefulSets, each with PVCs that are made from PVs mounting NFS shares.
Previous working configuration was single VM using docker-compose and the same gitea/postgres containers, mounting the same NFS shares as docker named volumes.
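Since the repositories live on NFS-backed PVs, it may also be worth confirming what mount options the pod actually sees, as NFS latency can make server-side pack generation noticeably slower than on local disk. A quick check; the namespace and pod name below are placeholders:

```bash
# Inspect NFS mount options and free space as seen from inside the gitea pod.
kubectl -n gitea exec gitea-0 -- sh -c 'mount | grep nfs; df -h /data'
```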
Database
PostgreSQL