bntjah / lancache

Improve download speeds and reduce strain on your Internet connection at LAN parties. Locally cache game installs and updates from the largest distributors: Steam, RIOT, Blizzard, Hirez, Origin, Sony, Microsoft, Tera, GOG, ArenaNetworks, WarGaming, and Uplay. Super easy to set up with auto installer script!

Super slow speed for uncached games #145

Open dark-swordsman opened 5 years ago

dark-swordsman commented 5 years ago

Hello again,

So we have our lancache working now via the lancache installer from nexus, but suddenly new games that aren't currently cached are taking forever to download. We tested without lancache, and the games can still download at nearly 80 MB/s on our 1 Gig internet connection. However, lancache speeds are anywhere from 2-8 MB/s.

When I first tested it, it worked great, but now that more than 1 PC has run on it, they are all slow. Cached games can still download at 100-115 MB/s with no problem; only the uncached ones are slow.
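
Since this thread mixes MB/s (megabytes per second) and Mbps/Gbps (megabits per second), here is a minimal Python sketch for putting the figures above on one scale. It assumes decimal units (1 byte = 8 bits, 1 Gbps = 1000 Mbps) and ignores protocol overhead; the function names are only illustrative.

```python
def mbytes_to_mbits(mb_per_s: float) -> float:
    """Convert megabytes per second to megabits per second."""
    return mb_per_s * 8.0


def link_utilisation(mb_per_s: float, link_mbps: float) -> float:
    """Fraction of a link (rated in Mbps) used by a transfer measured in MB/s."""
    return mbytes_to_mbits(mb_per_s) / link_mbps


if __name__ == "__main__":
    # Speeds quoted in the report above, compared against a 1 Gbps link.
    samples = [
        ("direct, no lancache", 80),
        ("uncached via lancache (low)", 2),
        ("uncached via lancache (high)", 8),
        ("cached via lancache", 115),
    ]
    for label, speed in samples:
        print(f"{label}: {speed} MB/s = {mbytes_to_mbits(speed):.0f} Mbps "
              f"({link_utilisation(speed, 1000):.0%} of 1 Gbps)")
```

Seen this way, the uncached downloads use only a few percent of the 1 Gbps line, while cached ones come close to saturating a gigabit LAN port.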

I remember reading somewhere that either the network interfaces or the nginx setup has to be modified, but I can't find where I saw that.

What should I test and try?

nexusofdoom commented 5 years ago

From what I have seen, you have to add more IPs for Steam to use. I do not have the link that talks about it. On my setup I am getting around 80-100 Mbps when downloading new games from Steam.

nexusofdoom commented 5 years ago

OK, I think this is what you are looking for: https://github.com/steamcache/steamcache/issues/44

nexusofdoom commented 5 years ago

https://github.com/steamcache/steamcache-dns/pull/69

dark-swordsman commented 5 years ago

That's interesting...

I should note, it's not just Steam that's experiencing slow speeds. In our testing, Origin and Battle.net are also very slow, down near 4-8 MB/s, and I don't really know how I feel about adding at least 30 more IPs. We can do it, it just seems unnecessary.

Could it also be something in the Nginx config for proxying data?

dark-swordsman commented 5 years ago

I was able to test a little bit, and I see what you mean about adding connections.

I downloaded a game with the DHCP handing out the following for DNS:

1: 10.1.0.110
2: 8.8.8.8
3: 8.8.4.4

The first download downloaded through the cache at about 40-60 MB/s.

Stopped, started a new game, same speed: 40-60 MB/s.

Had someone else download a game and I started yet another game, but now neither of us is going through the lancache. This tells me the lancache refused a connection because it doesn't have any available (i.e. maxed out at 2/3 connections).

So now I went over to battle.net and am installing HoS, and it's maxing at about 6-7 MB/s.

So, a few things I guess:

I see the benefit of adding IPs for the virtual interfaces, but would we also need to add extra lancache IPs that it can proxy through?

Is there any way to tell Nginx to allow more data through at once?

It appears that nginx/lancache doesn't drop old connections quickly enough once they are no longer being used. Is there a way to decrease this time?

nexusofdoom commented 5 years ago

Origin is working for me image

nexusofdoom commented 5 years ago

The DNS servers you hand out to clients cannot include entries 2 or 3, because those let clients bypass the lancache. You can only use entry 1, not 2 or 3:

1: 10.1.0.110
2: 8.8.8.8
3: 8.8.4.4

dark-swordsman commented 5 years ago

I understand the basics of DNS. If it can't find anything via 10.1.0.110 or it blocks on 10.1.0.110, it'll go through 8.8.8.8 or 8.8.4.4. Having google's DNS on the PCs works, it's just when the lancache slows down, it appears to refuse connections, which tells me it doesn't have enough connections.

I would like to prioritize reducing the timeout of old connections, such that if a connection hasn't been used in, say, 30 seconds, it gets killed. At this point, it seems to keep connections alive for at least 30 minutes. Is there a place I can change this?

As I said, I can add more IPs, and probably will, but there's no point in adding extra IPs if the connections still can't drop. We are an active lan center where we may have 30, 40, or 50+ PCs downloading the same thing or different games at once. I would rather not add 15 IPs for each of the major services.

As far as the maximum number of connections over http, which I believe you said was 2-4, there's no feasible way to override that, right?

nexusofdoom commented 5 years ago

Sorry you are having issues with it. The only service I have trouble with on lancache is Steam; no other services have issues. I have no issues with the 2-4 connections. I can have 10-15 games pulling down updates that have not been downloaded before with no issue.

nexusofdoom commented 5 years ago

As far as DNS goes, unbound forwards requests to 8.8.8.8 or whatever you point it to, so your clients only need the lancache DNS and no other. If you use other DNS servers on your clients, it is also harder to see what issue you have, since the clients may be pulling updates directly.

nexusofdoom commented 5 years ago

Use "httpry" and see what the connection is downloading to see what service is requesting the data.

nexusofdoom commented 5 years ago

My setup: lancache running inside a VM with 8 TB of storage and two 10 Gb fiber SFP+ connections to two 48-port network switches. No issues with lost or refused connections to the lancache VM.

dark-swordsman commented 5 years ago

Hmmm. I mean we do have one 10 Gbps SFP+ fiber, then 4 SFP+ fibers to 4 switches with 24 PCs on each switch, but the servers we have are old R620s with 8 300 GB 10k SAS drives in RAID 0. They can easily push over 1.2 GB/s, but I wonder if that may also be an issue.

nexusofdoom commented 5 years ago

example of my setup image

nexusofdoom commented 5 years ago

It's hard to say whether it's the hardware or the network interface card. Do you have a spare PC to test on as a lancache server?

nexusofdoom commented 5 years ago

I have to limit how much I pull down for games so it does not kill my VoIP service or other services. Lancache would pull down at 1 Gbps from my internet connection if it could, so I use my firewall to limit how much the lancache server can pull down from the internet.

nexusofdoom commented 5 years ago

Another thing to look at: https://wiki.tothnet.hu/index.php/Fix_high_SoftIRQ_on_4.13_kernel

dark-swordsman commented 5 years ago

Hey, so one of our guys worked on our network at some point and fixed an issue with connection bottoming out and capping at 2-3 MB/s. We're working great now and have had at least 5 concurrent users at a time. Was really nice seeing the lancache output 1.5 Gbps not too long ago, even pulling 1.2 Gbps from the internet and caching about 1 Gbps at a time.

In that case I think I'll mark this as solved, but I do have another question. Is it possible to add other drives?

I'm currently looking at RAID controllers and came across this one: https://www.broadcom.com/products/storage/raid-controllers/megaraid-sas-9361-8i

It seems to handle high speed well, but I believe it can only support up to 8 drives, which is fine, but the chassis I am looking at supports up to 16 drives. I think 8 TB may be more than enough in SSDs, but if it isn't would it be possible to add another controller and make another 8 TB array and just add it to a list of drives that can be used?

Edit: Also, does the lancache handle teamed NICs well? I'd like to see if we can use a dual SFP+ connection to get 20Gbps connectivity since with 8 SSD, we could theoretically push about 25-28 Gbps of data.

chong601 commented 5 years ago

I myself run a Lancache instance (although not using the installer script) on 4x Gigabit LACP connection on a Dell PowerEdge R710, works pretty well

dark-swordsman commented 5 years ago

Hey, so it appears that we are running into the speed issue again, capping out at 2-3 MB/s for uncached games. Two people are downloading CS:GO and two others are downloading RS: Siege, both of which have updates, I believe.

I know you guys mentioned that you can increase the number of IPs for Steam, but can you do it for other services?

Edit: There was a miscommunication and it appears to have been only Steam. I'll see about adding more IPs.

nexusofdoom commented 5 years ago

https://sigtar.com/tag/steamcache/ Update 1/11/2018

Switched back to steamcache/steamcache. steamcache/generic was much slower (re-validated downloads etc) which isn’t needed for my small network. I’m after performance! :)

https://github.com/steamcache/steamcache/blob/master/overlay/etc/nginx/sites-available/steamcache.conf

I'll have to look more and see if there are any tuning options for Steam.

dark-swordsman commented 5 years ago

So does lancache integrate steamcache? In that case, would I be able to just add the multiple IPs?

dark-swordsman commented 5 years ago

^^^ Just looking for an update on my question.

nexusofdoom commented 5 years ago

You will have to add the IPs by hand and update the configs to make that work.

dark-swordsman commented 5 years ago

Okay, do you have any insight on how to do that? You mentioned you swapped to steamcache/steamcache. Are you running steamcache alongside lancache and disabling Steam on lancache?

I'm just trying to find a straightforward solution. I would love to spend the time to figure this all out myself, but this is one of many hats I have, and we don't have a lot of people to do this, so some help would be appreciated.

dark-swordsman commented 5 years ago

Hey @nexusofdoom! Thanks again for the help so far.

I realized part of our issue with this lancache was the inactive time on the proxy_cache_paths in lancache-caches. By default it's 120 days. I feel like this would normally be okay, but I am going to test, for now, with an inactive of something like 64 hours (this should align with the operating hours of our lan center and limit a game to 3 days) until we can boost from 1.8 TB of space.

Besides that, the only other issue that I think will still persist is the multiple IPs for steam. I know you said, "You will have to add the IPs by hand and update the configs to make that work," but do you have any insight on how to do that?

Is it simply adding multiple IPs in our interface file and the nginx vhosts config like this, or do we have to do something more complicated? I know for lancachenet/monolithic, it should be as simple as adding IPs to the steamcache-dns docker container, but monolithic was giving me all sorts of issues.

I've gotten more comfortable with nginx configs and dns records, so if I need to dig deep that's fine, but a little guidance would be appreciated.

Edit: Oh! Also, do you know any resources for limiting the amount of bandwidth nginx can use overall? I know nginx has limit_rate, but that's per client and not overall. If the game is not cached and needs to be pulled from the internet, I would like to allow one client to download at 1 Gbps (since our cap is somewhere around 1.5 Gbps), but I don't want 10 clients to each be able to try to download at 1 Gbps (i.e: 1,000 Mbps / 10 clients).

The only solution I could think of is to create another nginx/dns server on another machine, and use that to limit the lancache's speed, but I don't know how many connections the lancache can use beyond 1.
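
The rate split described in the edit above can be sketched as plain arithmetic. This is only a model of the desired behavior, not an nginx feature: it assumes an overall cap shared equally between active clients, with a per-client ceiling, and the function name and parameters are hypothetical. Enforcing it would still need something outside nginx, such as the firewall limiting mentioned earlier in the thread.

```python
def per_client_rate(total_cap_mbps: float, clients: int, client_max_mbps: float) -> float:
    """Equal share of an overall cap, never exceeding the per-client ceiling."""
    if clients <= 0:
        raise ValueError("clients must be a positive integer")
    return min(client_max_mbps, total_cap_mbps / clients)


if __name__ == "__main__":
    # Overall budget of 1,000 Mbps, per-client ceiling of 1,000 Mbps.
    for n in (1, 2, 5, 10):
        rate = per_client_rate(1000, n, 1000)
        print(f"{n:2d} client(s): {rate:.0f} Mbps each, {rate * n:.0f} Mbps total")
```

With one client the full 1,000 Mbps is available; with ten clients each gets 100 Mbps, so the total stays at the cap instead of growing with the number of downloads.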

jaretclifton commented 4 years ago

I'm getting the same type of performance. Gigabit fiber connection, 36 IPs added to netplan, and limited to around 120 Mbps download from any of the main game installers. Thoughts?

nexusofdoom commented 4 years ago

It's more than just having the extra IPs; you also need your DNS server to give out multiple IPs for each hostname.

In my unbound config I have rrset-roundrobin: yes enabled to do this.

I get about 340 - 400Mbps on my 500Mbps connection.
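
To illustrate why handing out several A records in rotating order spreads load, here is a simplified Python model. It assumes strict rotation and clients that always connect to the first record in the answer; a real resolver may choose the starting offset differently, and the IP addresses are hypothetical.

```python
from collections import Counter
from typing import Iterator, List


def rotated_answers(records: List[str], queries: int) -> Iterator[List[str]]:
    """Yield the record list rotated by one more position for each query."""
    for i in range(queries):
        offset = i % len(records)
        yield records[offset:] + records[:offset]


def first_record_distribution(records: List[str], queries: int) -> Counter:
    """Count how often each record is listed first (the one a client tries first)."""
    return Counter(answer[0] for answer in rotated_answers(records, queries))


if __name__ == "__main__":
    # Hypothetical extra cache IPs for one hostname.
    cache_ips = ["10.1.0.111", "10.1.0.112", "10.1.0.113", "10.1.0.114"]
    for ip, hits in sorted(first_record_distribution(cache_ips, 40).items()):
        print(f"{ip}: listed first in {hits} of 40 answers")
```

Without rotation every answer would list the same IP first, so all downloads would land on a single address no matter how many were configured.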

jaretclifton commented 4 years ago

rrset-roundrobin is already set to yes.

jaretclifton commented 4 years ago

It looks like Steam seems to respond better and with 16 cores it gets around 280Mbps. Origin, Blizzard and Epic all hover around 130Mbps.

dark-swordsman commented 4 years ago

It's more than just having the extra IPs; you also need your DNS server to give out multiple IPs for each hostname.

What do you mean by this? From my understanding, once you set netplan to have multiple IPs, there should be a way to give the steam-portion of the DNS config IPs.

Do you mean, "You have to give the DNS server multiple IPs to hand out to clients?" If so, how? I've asked this question multiple times with no answer, just "add multiple IPs to the DNS server". How?

Again, I asked before on how to do this, but you never give any information. Your blog that you provided doesn't help much either. I need technical details, as what you say to do doesn't make sense to me.

This repo is kind of advertised as "plug and play", when in reality, it isn't. I understand that with how much game providers change their systems, it's not easy to keep it working and it requires a lot of tinkering. But you say that yours is working great without providing any decent instructions on how to solve the problems.

nexusofdoom commented 4 years ago

Sorry that you are having issues, what lancache are you using?

this is the one I work on https://github.com/nexusofdoom/lancache-installer

I do help out some on this one, but I have not updated it to include the extra IPs needed for Steam: https://github.com/bntjah/lancache/

Every week I get emails from people using https://github.com/nexusofdoom/lancache-installer with no issues, sending photos of their speeds and bandwidth savings.

I just did a reinstall of Ubuntu and ran the installer from https://github.com/nexusofdoom/lancache-installer. It works as designed, with no slowdowns on the first download of content.

Networking or hardware support is outside of the scope of the project. If Support is needed to look at your setup I may be able to look some time this weekend? Maybe team-viewer or other remote support.

nexusofdoom commented 4 years ago

It looks like Steam seems to respond better and with 16 cores it gets around 280Mbps. Origin, Blizzard and Epic all hover around 130Mbps.

Note: EA Origin has not fixed the issue with lancache; it pulls directly over HTTPS, bypassing lancache. There is a ticket open with them, and they are working on a fix, something like what Riot did to make it work again: https://answers.ea.com/t5/Origin-Client-Web-Technical/Origin-using-HTTPS-for-Downloads/td-p/7989098/page/4?fbclid=IwAR2YcJqmeT1KffPAqm4MLEcFZgkvqttn0ZiU2aSeErF0QqvW-8HFGqKQCYA

bntjah commented 4 years ago

Well, to be fair, Nexus, the GitHub page doesn't really have any installer, just some steps that have to be done manually. I haven't added the multiple-IP part for Steam yet, but it shouldn't be too hard to do when doing a manual install.

If I find the time I will add it; but thanks again for following up on the issues! But from my experience it is sufficient to use the multiple ip's to get "normal" speed from Steam instead of using a single ip...

On Tue, 17 Sep 2019 at 15:12, nexusofdoom notifications@github.com wrote:

It looks like Steam seems to respond better and with 16 cores it gets around 280Mbps. Origin, Blizzard and Epic all hover around 130Mbps.

Note "EA Origin" has not fixed the issue with lancache it pulls direct over https bypassing lancahce There is a ticket made with them, they are working on a fix something like what Riot did to make it work again" https://answers.ea.com/t5/Origin-Client-Web-Technical/Origin-using-HTTPS-for-Downloads/td-p/7989098/page/4?fbclid=IwAR2YcJqmeT1KffPAqm4MLEcFZgkvqttn0ZiU2aSeErF0QqvW-8HFGqKQCYA


jaretclifton commented 4 years ago

Fresh VM install of Ubuntu 18.04. 16 cores, 8GB RAM and just 650GB of storage to test. All on a 6 drive RAID 10 10k SAS R610. Installed with the lancache-installer and things worked out of the box... slowly. 1Gbps symmetric fiber internet yields ~130-140Mbps for uncached installs (across all providers, blizzard, steam, epic etc.) when flowing through lancache. Same installs bypassing lancache routinely hit 800Mbps. Once cached, installs from lancache top out around 500-600Mbps.

VM is running on Proxmox with write back cache enabled for disk and 8 multiqueue threads assigned for NIC offloading.

Thoughts?

Edit: Moved the VM to another Proxmox host and am seeing sustained 115MB/sec transfers. Sounds like I have some investigation to do on the disk structure of the previous hypervisor. Thanks!

nexusofdoom commented 4 years ago

Working for me on my slower 400Mbps connection for uncached installs image

nagilum99 commented 4 years ago

I understand the basics of DNS. If it can't find anything via 10.1.0.110 or it blocks on 10.1.0.110, it'll go through 8.8.8.8 or 8.8.4.4. Having google's DNS on the PCs works, it's just when the lancache slows down, it appears to refuse connections, which tells me it doesn't have enough connections.

Sorry, but regarding Windows that's nonsense: Windows randomly uses one of the DNS servers. We already had lots of trouble with that, because someone decided it was a good idea to add a foreign DNS server alongside that of an AD server. After removing the second, foreign one, it worked as expected.

So by adding foreign DNS servers to the clients, you wreck your setup.

Our next LAN is going to happen in ~2 weeks. We didn't use more than 1 IP per DNS last year, and got bandwidth peaks of 1.75 Gbit/s over 3 trunked (LACP) 1 Gbps lines.

The host runs on VMware (we're jumping between VMware and XenServer/XCP-ng) on an older HP DL580 (4x 2.93 GHz QC, 128 GB). "HW" specs for the VM:

100 GB RAM
6 cores
24x 73 GB SAS 15k as RAID 50 + 1 spare
24x 146 GB SAS 10k as RAID 50 + 1 spare
Both on Smart Array P411 w. 512 MB cache

We split the cache directories between the 2 different mounts. This year we'll see how a small SSD cache affects the proxy. Might fiddle a bit with bcache.

Our WAN link was mostly saturated at ~265 Mbit/s, and we pulled roughly 2.3 TB and pushed about 4.5 TB.

So last year it was pretty good, though Origin seemed to be problematic to cache. I saw the link and doubt EA will get their homework done until then.

Edit: Let me add: We saw average download speeds peaking at up to 800 Mbit/s on clients - which is pretty nice for a Gbps LAN environment and such proxy.
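
As a rough worked example of the totals quoted above, the sketch below estimates how much traffic the cache kept off the WAN link. It assumes "pulled" means data fetched from the internet and "pushed" means data served to LAN clients; that reading, and the function name, are assumptions rather than something stated in the comment.

```python
from typing import Tuple


def cache_savings(pulled_tb: float, pushed_tb: float) -> Tuple[float, float]:
    """Return (TB served from cache, fraction of served traffic that avoided the WAN)."""
    if pushed_tb <= 0:
        raise ValueError("pushed_tb must be positive")
    saved_tb = pushed_tb - pulled_tb
    return saved_tb, saved_tb / pushed_tb


if __name__ == "__main__":
    saved, ratio = cache_savings(pulled_tb=2.3, pushed_tb=4.5)
    print(f"served from cache: {saved:.1f} TB ({ratio:.0%} of traffic to clients)")
```

Under that reading, roughly 2.2 TB of the 4.5 TB delivered never had to cross the ~265 Mbit/s WAN link.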

jaretclifton commented 4 years ago

The lancache instance started exhibiting the same ultra-slow uncached performance as previously. I even went so far as to build a brand new VM image today and it's still getting maybe 7MB/sec download for uncached content.

jaretclifton commented 4 years ago

Pulling from cached content yields ~118MB/sec which is fantastic... downloading uncached content ranges between 300KB/sec to 7MB/sec. I'm absolutely baffled as to what is causing this.

nagilum99 commented 4 years ago

We noticed that Blizzard, for example, barely gave more than 3-5 MB/s. It seemed like the cause was not on our side, but rather Battle.net not delivering more per stream. We added more IPs for Steam/Blizzard, and it felt like that helped for Steam.

jaretclifton commented 4 years ago

I've added 6 more IPs for Blizzard as well as Steam and can't seem to get above 20MB/sec uncached on my 1Gbps fiber connection. This is maddening as I have a LAN event today lol.

nexusofdoom commented 4 years ago

This is what I am getting just now, downloading with Steam image

nexusofdoom commented 4 years ago

And this is what I get if I try and re-download the same game - note testing on my gaming laptop over WIFI. image

nagilum99 commented 4 years ago

Did you check without the proxy to verify that the proxy is the cause? We found out that we didn't get fast downloads at all, and the proxy was innocent. That said, "fast" is relative: we had > 300 Mbit/s for hours on our downlink. I couldn't identify anything in particular that was super slow; it seemed a bit random.

jaretclifton commented 4 years ago

I tested with and without. It appeared that ALL content delivery from Blizzard, Steam and Origin (yes I know of the HTTPS issue currently) gave between 15MB/sec to 30MB/sec with the proxy bypassed. Once proxy was engaged it hovered around 15MB/sec for all uncached... it shot up to 85MB/sec for cached for various clients. Dual 1Gbps fiber LACP links between the proxmox host and core switch as well as the access switch and core switch.

[image: lancache]

nexusofdoom commented 4 years ago

Did you update DNS in unbound with more than one IP for Steam, and did you round-robin the results?

rrset-roundrobin: yes
msg-cache-size: 128m

## LANcache config ##

## Steam °|-lc-host-vint:1
local-zone: "steampowered.com." transparent
local-zone: "steamcontent.com." redirect
local-data: "steamcontent.com. 30 IN A lc-host-steamA"
local-data: "steamcontent.com. 30 IN A lc-host-steamB"
local-data: "steamcontent.com. 30 IN A lc-host-steamC"
local-data: "steamcontent.com. 30 IN A lc-host-steamD"
local-data: "steamcontent.com. 30 IN A lc-host-steamE"
local-data: "steamcontent.com. 30 IN A lc-host-steamF"
local-data: "steamcontent.com. 30 IN A lc-host-steamG"
local-data: "steamcontent.com. 30 IN A lc-host-steamH"
local-data: "steamcontent.com. 30 IN A lc-host-steamI"
local-data: "steamcontent.com. 30 IN A lc-host-steamJ"
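
As a convenience, the local-data lines above could be generated rather than typed by hand. This is a minimal Python sketch, assuming the lc-host-steamA ... lc-host-steamJ placeholders stand for IPv4 addresses already configured on the cache host; the example addresses and the function name are hypothetical.

```python
from ipaddress import ip_address
from typing import List


def unbound_local_data(zone: str, ips: List[str], ttl: int = 30) -> List[str]:
    """Render a redirect local-zone with one A record per cache IP."""
    fqdn = zone if zone.endswith(".") else zone + "."
    lines = [f'local-zone: "{fqdn}" redirect']
    for ip in ips:
        # ip_address() raises ValueError on malformed input; A records need IPv4.
        if ip_address(ip).version != 4:
            raise ValueError(f"A records need IPv4 addresses, got {ip}")
        lines.append(f'local-data: "{fqdn} {ttl} IN A {ip}"')
    return lines


if __name__ == "__main__":
    # Hypothetical extra IPs bound to the lancache host for Steam.
    steam_ips = [f"10.1.0.{last}" for last in range(111, 121)]
    print("\n".join(unbound_local_data("steamcontent.com", steam_ips)))
```

The output has the same shape as the block above, with ten A records for steamcontent.com that unbound can then rotate when rrset-roundrobin is enabled.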

nagilum99 commented 4 years ago

I tested with and without. It appeared that ALL content delivery from Blizzard, Steam and Origin (yes I know of the HTTPS issue currently) gave between 15MB/sec to 30MB/sec with the proxy bypassed. Once proxy was engaged it hovered around 15MB/sec for all uncached... it shot up to 85MB/sec for cached for various clients. Dual 1Gbps fiber LACP links between the proxmox host and core switch as well as the access switch and core switch.

[image: lancache]

If you know how LACP works, you know that it won't affect single clients. Precaching was a bit slow, but during LAN we peaked at 3.75 Gbit/s on 4x 1 G LACP towards LAN, while partly getting 450 Mbit/s from WAN. Also not sure how the 100 % CPU is being measured. Is that 100 % of all cores? Single downloads can be limited by CPU, multiple simultaneously rather by storage.

jaretclifton commented 4 years ago

Did you update DNS in unbound with more than one IP for Steam, and did you round-robin the results?

rrset-roundrobin: yes
msg-cache-size: 128m

## LANcache config ##

## Steam °|-lc-host-vint:1
local-zone: "steampowered.com." transparent
local-zone: "steamcontent.com." redirect
local-data: "steamcontent.com. 30 IN A lc-host-steamA"
local-data: "steamcontent.com. 30 IN A lc-host-steamB"
local-data: "steamcontent.com. 30 IN A lc-host-steamC"
local-data: "steamcontent.com. 30 IN A lc-host-steamD"
local-data: "steamcontent.com. 30 IN A lc-host-steamE"
local-data: "steamcontent.com. 30 IN A lc-host-steamF"
local-data: "steamcontent.com. 30 IN A lc-host-steamG"
local-data: "steamcontent.com. 30 IN A lc-host-steamH"
local-data: "steamcontent.com. 30 IN A lc-host-steamI"
local-data: "steamcontent.com. 30 IN A lc-host-steamJ"

Indeed I did, to both of those questions.

jaretclifton commented 4 years ago

I tested with and without. It appeared that ALL content delivery from Blizzard, Steam and Origin (yes I know of the HTTPS issue currently) gave between 15MB/sec to 30MB/sec with the proxy bypassed. Once proxy was engaged it hovered around 15MB/sec for all uncached... it shot up to 85MB/sec for cached for various clients. Dual 1Gbps fiber LACP links between the proxmox host and core switch as well as the access switch and core switch.

[image: lancache]

If you know how LACP works, you know that it won't affect single clients. Precaching was a bit slow, but during LAN we peaked at 3.75 Gbit/s on 4x 1 G LACP towards LAN, while partly getting 450 Mbit/s from WAN. Also not sure how the 100 % CPU is being measured. Is that 100 % of all cores? Single downloads can be limited by CPU, multiple simultaneously rather by storage.

Yep I know how LACP works and expected no gains per individual client... src-dest-mac hashing for load balancing worked fantastically when multiple clients requested updates. As for the CPU usage, it appears to be a normalized percentage value of all cores available to the system as mine stayed well under 30% usage on a 16 core VM.
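
The point about LACP not speeding up a single client can be shown with a toy model of MAC-based hashing. This is a simplified sketch (XOR of the last octet of each MAC, modulo the number of links); real switches and bonding drivers use their own, often configurable, hash policies, and the MAC addresses here are made up.

```python
def mac_last_octet(mac: str) -> int:
    """Return the last octet of a colon-separated MAC address as an integer."""
    return int(mac.split(":")[-1], 16)


def pick_link(src_mac: str, dst_mac: str, links: int) -> int:
    """Choose a member link from a simplified src/dst MAC hash."""
    return (mac_last_octet(src_mac) ^ mac_last_octet(dst_mac)) % links


if __name__ == "__main__":
    cache = "02:00:00:00:00:10"  # hypothetical lancache NIC
    clients = [f"02:00:00:00:00:{n:02x}" for n in (0x21, 0x22, 0x23, 0x24)]
    for client in clients:
        # The same client always maps to the same link, however much it downloads.
        print(f"{client} -> link {pick_link(cache, client, 2)}")
```

Every frame between one client and the cache hashes to the same member link, so a single download is limited to one link's speed, while several clients spread across the links.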

nagilum99 commented 4 years ago

We only gave it 6 cores, but >100 GB of RAM and a bunch of disks. Turned out that more cores don't really make more speed. IO wait should be monitored to see if storage is the bottleneck. In that case threads are only waiting for storage to deliver stuff (or potentially nginx to get it from the internet)
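
For the IO wait suggestion above, tools like top or iostat already report it, but a minimal Python sketch can make the calculation explicit. It assumes a Linux host and the usual field order of the aggregate "cpu" line in /proc/stat (user, nice, system, idle, iowait, irq, softirq, steal); the function names are illustrative.

```python
import os
import time
from typing import Dict

FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]


def parse_cpu_line(line: str) -> Dict[str, int]:
    """Parse the aggregate 'cpu' line of /proc/stat into named counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return dict(zip(FIELDS, (int(v) for v in parts[1:1 + len(FIELDS)])))


def iowait_percent(before: Dict[str, int], after: Dict[str, int]) -> float:
    """Share of CPU time spent waiting on I/O between two samples, in percent."""
    deltas = {name: after[name] - before[name] for name in before}
    total = sum(deltas.values())
    return 100.0 * deltas["iowait"] / total if total else 0.0


def sample_iowait(interval: float = 1.0, path: str = "/proc/stat") -> float:
    """Read the counters twice, `interval` seconds apart, and compare them."""
    def read() -> Dict[str, int]:
        with open(path) as stat:
            return parse_cpu_line(stat.readline())
    first = read()
    time.sleep(interval)
    return iowait_percent(first, read())


if __name__ == "__main__":
    if os.path.exists("/proc/stat"):
        print(f"iowait over the last second: {sample_iowait():.1f}%")
```

A persistently high value while uncached downloads are slow would point at storage; a low value suggests looking at the network path or the upstream CDN instead.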