lancachenet / monolithic

A monolithic lancache service capable of caching all CDNs in a single instance
https://hub.docker.com/r/lancachenet/monolithic

Steam downloads stuck at 99% #175

Open Caennanu opened 11 months ago

Caennanu commented 11 months ago

Describe the issue you are having

When using LanCache (between the client and PiHole), most Steam downloads get stuck at 99% (small updates get stuck at a random percentage). Pointing DNS directly at PiHole lets the downloads finish.

How are you running the container(s)

Monolithic:
/usr/local/emhttp/plugins/dynamix.docker.manager/scripts/docker create --name='Lancache' --net='br0.30' --ip='192.168.30.202' -e TZ="Europe/Berlin" -e HOST_OS="Unraid" -e HOST_HOSTNAME="Bigboii" -e HOST_CONTAINERNAME="Lancache" -e 'TCP_PORT_80'='80' -e 'TCP_PORT_443'='443' -l net.unraid.docker.managed=dockerman -l net.unraid.docker.icon='https://raw.githubusercontent.com/redvex2460/docker-templates/master/redvex2460/images/lancache.png' -v '/mnt/user/Lancache2/logs':'/data/logs':'rw' -v '/mnt/user/Lancache2/cache':'/data/cache/':'rw' --memory=8G 'lancachenet/monolithic:latest' 

DNS Configuration

/usr/local/emhttp/plugins/dynamix.docker.manager/scripts/docker create --name='Lancache-DNS' --net='br0.30' --ip='192.168.30.200' -e TZ="Europe/Berlin" -e HOST_OS="Unraid" -e HOST_HOSTNAME="Bigboii" -e HOST_CONTAINERNAME="Lancache-DNS" -e 'UDP_PORT_53'='53' -e 'USE_GENERIC_CACHE'='true' -e 'LANCACHE_IP'='192.168.30.202' -e 'UPSTREAM_DNS'='192.168.120.201' -e 'CACHE_MEM_SIZE'='500m' -e 'CACHE_MAX_AGE'='14d' -l net.unraid.docker.managed=dockerman -l net.unraid.docker.icon='https://raw.githubusercontent.com/redvex2460/docker-templates/master/redvex2460/images/lancache.png' --memory=8G 'lancachenet/lancache-dns:latest' 

Output of container(s)

Not quite sure what to paste here.
Caennanu commented 11 months ago

To add some flavor to the issue: with some games, the download stays active for the last few MB and hogs the full bandwidth (1 Gb ISP) for an indefinite amount of time, on both the client and the server (10GbE home network, so no issues on other clients aside from throttled internet). It is almost as if it is doing a continuous file check.

Changing the upstream DNS from PiHole to an external resolver doesn't help. Enabling Google DNS via PiHole doesn't help either, and in all cases it is always the last bit of the download that refuses to complete.

I've upped the memory the container is allowed to use from 4 GB to 8 GB, and disabled nginxproxymanager and duckdns.

Caennanu commented 11 months ago

Should this happen to anyone else finding this: clearing the cache (basically deleting all subfolders of the cache share) will also help.

Caennanu commented 11 months ago

Since the issue keeps happening, I'm interested in a fix. Hopefully others have the same issue.

If it turns out to be Unraid-specific, I would like to know that too.

sfinke0 commented 11 months ago

Hi @Caennanu. We had this happen at a LAN party as well but could not quite figure it out. Games were stuck at 99%. The requested files were corrupt (we checked by downloading with and without lancache: the md5sums did not match), so clients were requesting the corrupt files over and over again, which resulted in almost 25 Gbit/s of traffic to the lancache.
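The check itself is easy to reproduce. A sketch of the comparison (the curl lines are illustrative placeholders, not real Steam URLs; two local files stand in here for the cached and direct downloads):

```shell
# Compare a file fetched through the cache against the same file fetched
# directly from the CDN. Mismatching md5sums mean the cache entry is corrupt.
compare_md5() {
  a=$(md5sum "$1" | awk '{print $1}')
  b=$(md5sum "$2" | awk '{print $1}')
  if [ "$a" = "$b" ]; then echo "match"; else echo "CORRUPT: sums differ"; fi
}

# In practice the two inputs would come from something like (placeholders):
#   curl -s --resolve "$HOST:80:$CACHE_IP" "http://$HOST$URL_PATH" -o cached.bin
#   curl -s "http://$HOST$URL_PATH" -o direct.bin
printf 'chunk-data' > cached.bin
printf 'chunk-data' > direct.bin
compare_md5 cached.bin direct.bin
```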

I always had the feeling it might have something to do with the NGINX version, or maybe with the filesystem (ZFS). Which filesystem are you running on your machine? How much storage space have you dedicated to the lancache? Are you using the full CACHE_DISK_SIZE space, so that NGINX removes old files again?

Other than that I could not reproduce this issue on my lancache at home so I am at a loss how to provide a proper description of the problem.

Caennanu commented 11 months ago

@sfinke0 Good to hear I'm not the only one it is happening to.

And it sounds very plausible that the files somehow get corrupted, causing an endless download spiral. That is also in line with the issue being fixed after removing the folder. But it is happening more and more often, which is troublesome.

I'm using btrfs for the Unraid array it runs on (or rather its cache, as the file share is set to prefer the cache, something about HDD speeds). In theory lancache can use the full cache, which is 1 TB of NVMe storage, but due to the high-water allocation principle it can also use the slower HDD space, which had about 9 TB free at the time of writing.

No, my shares are not full, or rarely are, and I have not used that argument. CACHE_MAX_AGE is 5d (days) and CACHE_MEM_SIZE is the default 500m.

Caennanu commented 11 months ago

Right, so... an update: I've disabled the use of Lancache, as it was corrupting every other update and became unmanageable for me.

sfinke0 commented 11 months ago

Hi @Caennanu,

could you tell me more about your system?

cheers

Caennanu commented 11 months ago

Hello @sfinke0, I'll try!

XML for monolithic template (anonymized)

``` Lancache lancachenet/monolithic:latest https://github.com/lancachenet/monolithic private private sh false http://lancache.net This docker container provides a caching proxy server for game download content. For any network with more than one PC gamer in connected this will drastically reduce internet bandwidth consumption. Downloaders: Other: https://raw.githubusercontent.com/redvex2460/docker-templates/master/redvex2460/lancache.xml https://raw.githubusercontent.com/redvex2460/docker-templates/master/redvex2460/images/lancache.png --memory=8G 1691438038 If you like my work please consider Donating. https://paypal.me/RedVex2460Gaming 80 443 /mnt/user/Lancache2/logs /mnt/user/Lancache2/cache ```

XML for DNS template

``` Lancache-DNS lancachenet/lancache-dns:latest https://github.com/lancachenet/lancache-dns private private sh false http://lancache.net This docker container provides DNS entries for caching services to be used in conjunction with a container. The DNS is generated automatically at startup of the container Network:DNS Network:Proxy https://raw.githubusercontent.com/redvex2460/docker-templates/master/redvex2460/lancache-dns.xml https://raw.githubusercontent.com/redvex2460/docker-templates/master/redvex2460/images/lancache.png --memory=8G 1691438104 If you like my work please consider Donating. https://paypal.me/RedVex2460Gaming 53 true private private 500m 5d ```

GAOTAO-1 commented 10 months ago

@sfinke0 Hi, I have the same problem.

I was downloading DOTA2 (this occasionally happens with other games as well). When the problem occurred I checked access.log and found that the client kept requesting the same piece of data repeatedly, in this format: `steam/depot/228990/chunk/9d035a59bd9f9af517f0d1c666e6d742cadf6fe9 bytes=0-1048575`. I looked up the corresponding cache files via their md5 hashes and deleted them, but the re-download still produced the wrong file. I also tried restarting the container; nothing worked.
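One hedged way to spot such repeat requests from the server side (shown on a synthetic log here; on a real install point the command at `/data/logs/access.log`, and adjust the awk field if your log format differs):

```shell
# Build a small stand-in log with one URI requested twice.
cat > sample_access.log <<'EOF'
1.2.3.4 - - [t] "GET /depot/228990/chunk/aaaa HTTP/1.1" 200 1048576
1.2.3.4 - - [t] "GET /depot/228990/chunk/aaaa HTTP/1.1" 200 1048576
1.2.3.4 - - [t] "GET /depot/228990/chunk/bbbb HTTP/1.1" 200 1048576
EOF

# Count requests per URI: a chunk requested far more often than its peers
# is a likely corrupt cache entry being re-fetched in a loop.
awk -F'"' '/GET/ {split($2, f, " "); print f[2]}' sample_access.log \
  | sort | uniq -c | sort -rn | head
```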

Looking forward to your reply, thank you

GAOTAO-1 commented 10 months ago

I ran the same configuration on another virtual machine, with the only change being the hard drive swapped for an SSD, and the same problem occurred when I downloaded the first game, PUBG.

de-conf commented 9 months ago

Try turning off the slicing configured by the container's nginx; that may solve the problem. The cause is that the Steam CDN has already sliced the content. When it is sliced again, the CDN's nginx causes the local slice to be too small and the file sizes to be inconsistent.

```shell
# File: /hooks/entrypoint-pre.d/10_setup.sh

# disable slicing
sed -i '/slice/ s/^/#/' /etc/nginx/sites-available/cache.conf.d/root/20_cache.conf
sed -i 's/\$slice_range//' /etc/nginx/sites-available/cache.conf.d/root/30_cache_key.conf
```
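For anyone unsure what those two sed commands change, here is their effect demonstrated on minimal stand-ins for the two files (the real paths and contents inside the container may differ between image versions):

```shell
# Stand-in for 20_cache.conf: contains an nginx slice directive.
cat > 20_cache.conf <<'EOF'
slice 1m;
proxy_cache_valid 200 206 90d;
EOF
# Stand-in for 30_cache_key.conf: the cache key includes $slice_range.
cat > 30_cache_key.conf <<'EOF'
proxy_cache_key $cacheidentifier$uri$slice_range;
EOF

# The first sed comments out every line mentioning "slice"...
sed -i '/slice/ s/^/#/' 20_cache.conf
# ...and the second removes $slice_range from the cache key.
sed -i 's/\$slice_range//' 30_cache_key.conf

cat 20_cache.conf 30_cache_key.conf
```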
KirkPan commented 8 months ago

Try turning off the slicing configured by the container's nginx; that may solve the problem. The cause is that the Steam CDN has already sliced the content. When it is sliced again, the CDN's nginx causes the local slice to be too small and the file sizes to be inconsistent.

```shell
# File: /hooks/entrypoint-pre.d/10_setup.sh

# disable slicing
sed -i '/slice/ s/^/#/' /etc/nginx/sites-available/cache.conf.d/root/20_cache.conf
sed -i 's/\$slice_range//' /etc/nginx/sites-available/cache.conf.d/root/30_cache_key.conf
```

Hey, man. I've modified the nginx slices inside the container as you described. How can I verify the fix from the lancache server side, other than checking from the client side that the progress bar no longer gets stuck at 99% (or some other percentage)?

And I'm curious to know what causes this stall?

Respect!!

de-conf commented 7 months ago

Reason (left: lancache makes a slicing request; right: the normal client situation): [packet capture image]

Result (left: data packets received by lancache; right: packets received by the client): [packet capture image]

In conclusion:

Some CDNs responded to lancache's slicing request, but the cached file was larger than the slice, so the file was downloaded incompletely, resulting in a loop of re-downloading damaged slice files.

Suggestion:

Turn off the local slicing configuration and follow the CDN's own slicing.
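For context, the behaviour described here comes from nginx's slice module. A minimal sketch of the directives involved (a hypothetical example, not lancache's exact configuration):

```nginx
# Hypothetical minimal slice setup, NOT the exact lancache config.
location / {
    slice             1m;                 # split each object into 1 MiB subrequests
    proxy_set_header  Range $slice_range; # ask upstream for just that byte range
    proxy_cache_key   $uri$slice_range;   # each slice is cached under its own key
    proxy_cache_valid 200 206 90d;        # 206 Partial Content responses are cached
    proxy_pass        http://$host;
}
```

If the upstream CDN answers the `Range: $slice_range` request with a body that does not match the requested length, the wrong slice is cached and re-served, which would match the download loop described above.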

KirkPan commented 6 months ago

Try turning off the slicing configured by the container's nginx; that may solve the problem. The cause is that the Steam CDN has already sliced the content. When it is sliced again, the CDN's nginx causes the local slice to be too small and the file sizes to be inconsistent.

```shell
# File: /hooks/entrypoint-pre.d/10_setup.sh

# disable slicing
sed -i '/slice/ s/^/#/' /etc/nginx/sites-available/cache.conf.d/root/20_cache.conf
sed -i 's/\$slice_range//' /etc/nginx/sites-available/cache.conf.d/root/30_cache_key.conf
```

After disabling the slice via this method, downloads still get stuck at 99% (or some other percentage).

How can I solve it, please?

dwhmofly commented 6 months ago

Try turning off the slicing configured by the container's nginx; that may solve the problem. The cause is that the Steam CDN has already sliced the content. When it is sliced again, the CDN's nginx causes the local slice to be too small and the file sizes to be inconsistent.

```shell
# File: /hooks/entrypoint-pre.d/10_setup.sh

# disable slicing
sed -i '/slice/ s/^/#/' /etc/nginx/sites-available/cache.conf.d/root/20_cache.conf
sed -i 's/\$slice_range//' /etc/nginx/sites-available/cache.conf.d/root/30_cache_key.conf
```

Thanks, it's useful.

dwhmofly commented 6 months ago

Try turning off the slicing configured by the container's nginx; that may solve the problem. The cause is that the Steam CDN has already sliced the content. When it is sliced again, the CDN's nginx causes the local slice to be too small and the file sizes to be inconsistent.

```shell
# File: /hooks/entrypoint-pre.d/10_setup.sh

# disable slicing
sed -i '/slice/ s/^/#/' /etc/nginx/sites-available/cache.conf.d/root/20_cache.conf
sed -i 's/\$slice_range//' /etc/nginx/sites-available/cache.conf.d/root/30_cache_key.conf
```

After disabling the slice via this method, downloads still get stuck at 99% (or some other percentage).

How can I solve it, please?

Maybe you didn't clear the already-corrupted cache entries and didn't reload nginx.

de-conf commented 6 months ago

Better approach:

```
# File: .env
# set environment variables
CACHE_SLICE_SIZE=0
```

KirkPan commented 5 months ago

Better approach:

```
# File: .env
# set environment variables
CACHE_SLICE_SIZE=0
```

What happens if CACHE_SLICE_SIZE is set to 0?

Caennanu commented 5 months ago

Better approach:

```
# File: .env
# set environment variables
CACHE_SLICE_SIZE=0
```

What happens if CACHE_SLICE_SIZE is set to 0?

From what I read on lancache.net, this turns off nginx slicing. It was at least something I could configure in my docker template, and it has fixed my issue (thus far).
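For completeness, mapping this onto the Unraid setup from the top of the thread would just mean adding the variable to the monolithic container. An untested sketch, trimmed to the relevant flags (it mirrors the docker create command quoted earlier and assumes, as suggested above, that the container reads CACHE_SLICE_SIZE at startup):

```shell
# Same docker create as in the issue, trimmed to the relevant parts,
# with the slice variable added (untested sketch):
docker create --name='Lancache' --net='br0.30' --ip='192.168.30.202' \
  -e 'CACHE_SLICE_SIZE'='0' \
  -v '/mnt/user/Lancache2/logs':'/data/logs':'rw' \
  -v '/mnt/user/Lancache2/cache':'/data/cache/':'rw' \
  --memory=8G 'lancachenet/monolithic:latest'
```

Remember to clear the already-corrupted cache entries afterwards, as noted above, or the old bad slices will still be served.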