Coffee0297 closed this issue 4 years ago
cache@cache:~$ ethtool -S ens1f0
NIC statistics:
rx_packets: 1366507
tx_packets: 1068224
rx_bytes: 1298061333
tx_bytes: 1173691065
rx_broadcast: 305743
tx_broadcast: 49
rx_multicast: 133397
tx_multicast: 351
multicast: 133397
collisions: 0
rx_crc_errors: 0
rx_no_buffer_count: 0
rx_missed_errors: 0
tx_aborted_errors: 0
tx_carrier_errors: 0
tx_window_errors: 0
tx_abort_late_coll: 0
tx_deferred_ok: 0
tx_single_coll_ok: 0
tx_multi_coll_ok: 0
tx_timeout_count: 0
rx_long_length_errors: 0
rx_short_length_errors: 0
rx_align_errors: 0
tx_tcp_seg_good: 68325
tx_tcp_seg_failed: 0
rx_flow_control_xon: 0
rx_flow_control_xoff: 0
tx_flow_control_xon: 0
tx_flow_control_xoff: 0
rx_long_byte_count: 1298061333
tx_dma_out_of_sync: 0
tx_smbus: 121996
rx_smbus: 175
dropped_smbus: 0
os2bmc_rx_by_bmc: 0
os2bmc_tx_by_bmc: 0
os2bmc_tx_by_host: 0
os2bmc_rx_by_host: 0
tx_hwtstamp_timeouts: 0
tx_hwtstamp_skipped: 0
rx_hwtstamp_cleared: 0
rx_errors: 0
tx_errors: 0
tx_dropped: 0
rx_length_errors: 0
rx_over_errors: 0
rx_frame_errors: 0
rx_fifo_errors: 4
tx_fifo_errors: 0
tx_heartbeat_errors: 0
tx_queue_0_packets: 1243
tx_queue_0_bytes: 477744
tx_queue_0_restart: 0
tx_queue_1_packets: 8193
tx_queue_1_bytes: 9596827
tx_queue_1_restart: 0
tx_queue_2_packets: 900021
tx_queue_2_bytes: 1130597386
tx_queue_2_restart: 0
tx_queue_3_packets: 2962
tx_queue_3_bytes: 845376
tx_queue_3_restart: 0
tx_queue_4_packets: 1760
tx_queue_4_bytes: 354799
tx_queue_4_restart: 0
tx_queue_5_packets: 7776
tx_queue_5_bytes: 10379837
tx_queue_5_restart: 0
tx_queue_6_packets: 22105
tx_queue_6_bytes: 9052730
tx_queue_6_restart: 0
tx_queue_7_packets: 2168
tx_queue_7_bytes: 333553
tx_queue_7_restart: 0
rx_queue_0_packets: 105146
rx_queue_0_bytes: 38577292
rx_queue_0_drops: 2
rx_queue_0_csum_err: 0
rx_queue_0_alloc_failed: 0
rx_queue_1_packets: 944437
rx_queue_1_bytes: 1143155904
rx_queue_1_drops: 1
rx_queue_1_csum_err: 0
rx_queue_1_alloc_failed: 0
rx_queue_2_packets: 69554
rx_queue_2_bytes: 32732228
rx_queue_2_drops: 0
rx_queue_2_csum_err: 0
rx_queue_2_alloc_failed: 0
rx_queue_3_packets: 51929
rx_queue_3_bytes: 18370257
rx_queue_3_drops: 0
rx_queue_3_csum_err: 0
rx_queue_3_alloc_failed: 0
rx_queue_4_packets: 88170
rx_queue_4_bytes: 26347821
rx_queue_4_drops: 0
rx_queue_4_csum_err: 0
rx_queue_4_alloc_failed: 0
rx_queue_5_packets: 9816
rx_queue_5_bytes: 3686301
rx_queue_5_drops: 0
rx_queue_5_csum_err: 0
rx_queue_5_alloc_failed: 0
rx_queue_6_packets: 34352
rx_queue_6_bytes: 8841264
rx_queue_6_drops: 0
rx_queue_6_csum_err: 0
rx_queue_6_alloc_failed: 0
rx_queue_7_packets: 62680
rx_queue_7_bytes: 20836230
rx_queue_7_drops: 1
rx_queue_7_csum_err: 0
rx_queue_7_alloc_failed: 0
No errors on the NIC either (only 4 rx_fifo_errors and a few per-queue drops).
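A dump like the one above is easier to eyeball if the zero counters are filtered out. A sketch of a one-liner that keeps only non-zero error/drop counters (in practice you'd pipe `ethtool -S ens1f0` straight in; a small sample is embedded here so the snippet is self-contained):

```shell
# Keep only the non-zero error/drop-style counters from `ethtool -S` output.
printf '%s\n' \
  '     rx_crc_errors: 0' \
  '     rx_fifo_errors: 4' \
  '     rx_queue_0_drops: 2' |
awk -F': ' '/err|drop|miss|fifo/ && $2+0 > 0 { gsub(/^ +/, "", $1); print $1": "$2 }'
# -> rx_fifo_errors: 4
#    rx_queue_0_drops: 2
```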
Ooi, what is the write speed of the disk on your client machines?
Also, your final image seems to show ~650Mbit/s, and your logs only show one host. So it appears you are getting closer to 80MB/s?
It's downloading to the client's M.2 disk, so that's fast enough. The Steam client only reports 50MB/s; that's what's weird...
-Tonny
Here it is with two 1G links to the cache, and two people with NVMe drives (1000MB/s+)...
I rarely hit 1G overall, so ~50MB/s per system. Both are connected to the switch at 1G.
The cache is transferring from RAM and not disk for this game.
EDIT: both links are being used, so LACP is working.
-Tonny
It's quite possible that both clients are going over the same 1Gbit link. I don't know what your hashing policy is on your LACP bond. You have 2 client IPs, 1 server IP, and 1 port, so the likelihood that you both share the same 1G connection is high. In that case you would have a theoretical max of ~60MB/s each.
edit: if that bottom picture shows both links being used, then forget this thought.
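A toy model of why two clients can share one slave link: with layer3+4-style hashing the slave is a deterministic function of the flow tuple (IPs and ports). This simplified xor-fold hash is not the kernel's exact algorithm, but it shows how two different clients can still hash to the same 1G link:

```shell
# Toy layer3+4 bond hash: xor the last IP octets and the ports, mod the
# number of slave links. NOT the real kernel algorithm -- it just shows
# that slave selection is fixed per flow tuple, so collisions happen.
hash_slave() { # args: src_octet dst_octet sport dport n_slaves
  echo $(( ($1 ^ $2 ^ $3 ^ $4) % $5 ))
}
hash_slave 10 1 50123 443 2   # client .10 -> slave 1
hash_slave 12 1 50123 443 2   # client .12 -> slave 1 (same link!)
```

Both flows land on slave 1 here, so those two clients would split a single 1Gbit link between them.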
Try updating your cache Docker image to the latest (and the DNS image, if you are using it). You may be limited by the behavior of the client. We have recently integrated a new approach Steam is taking that may show better results.
The cache is using balance-tlb, so it is using both links... I'll try the new update and report back. And what do you mean by "if you are using lancache-dns"? I was under the impression that lancache-dns is needed for this to work?
It is certainly the easiest DNS scheme; however, there are myriad ways to arrange DNS, so we don't assume that you are definitely using our container.
Okay, good to know. The annoying thing is just that systemd-resolved is using port 53, so I have to disable it before I can start the Docker container, and re-enable it if I have some updates or packages I want to install... kind of inconvenient.
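For reference, on Ubuntu 18.04 the usual workaround is to switch off just systemd-resolved's stub listener on port 53 rather than stopping the whole service each time. A sketch of that change (paths are the Ubuntu defaults; review your own resolver setup before applying):

```shell
# Stop systemd-resolved's stub listener from binding port 53 so the
# lancache-dns container can take it; name resolution keeps working via
# the real resolv.conf that systemd-resolved maintains under /run.
sudo sed -i 's/^#\?DNSStubListener=.*/DNSStubListener=no/' /etc/systemd/resolved.conf
sudo ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf
sudo systemctl restart systemd-resolved
```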
The update did nothing for me at all. Still stuck at ~40MB/s, and mostly less with 2 machines hitting the cache. -Tonny
This is a pic of the traffic with 1 client downloading. That is to an NVMe disk, so it's not the disk that's holding it back... I'm getting an average of 30-40MB/s with only 1 client...
I hope you have some possible solutions for this..
^^ Complete download from cache of CS:GO (9.8G), from 8 disks in RAID 0 benchmarked at 1048MB/s.
^^ Complete download from cache of CS:GO (9.8G), from RAM.
Both are downloaded to an M.2 NVMe with 1000MB/s+.
It's weird. Nothing indicates it should throttle, but it does anyway. I need some ideas for how I can improve this.. All help is appreciated.
-Tonny
The bottom picture looks like you are getting up to 800Mbit out of a 1Gbit interface. I would say that's about the max I'd expect. You never know what a client will actually do in terms of handling file requests and unzipping as it downloads. It's not like pulling one large file, where you get the lowest possible overhead and can even tweak network parameters for better throughput.
This solution is ultimately designed to serve 20-3000+ LAN clients from 1 (or a few) internet connections. When we test our event cache, if 3 people can download at full speed we call that working. I think your cache is working fine.
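Converting the graphs' Mbit/s to the MB/s figures the Steam client reports makes the numbers line up (simple unit arithmetic, using 1000-based megabits):

```shell
# 1 Gbit/s line rate in MB/s, and what the ~800 Mbit/s graph peak means.
echo "$(( 1000 / 8 )) MB/s line rate"   # -> 125 MB/s line rate
echo "$(( 800 / 8 )) MB/s observed"     # -> 100 MB/s observed
# Split across two clients, that is ~50 MB/s each -- roughly what Steam shows.
```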
@MathewBurnett okay, thanks... I'm just afraid it's not going to be enough throughput. What would be good hardware for 300+ people? And can I take 2 or more servers, run the cache on all of them, and just do some load balancing on the DNS? Or is 1 very powerful server recommended?
I have run a few hundred people off less than you have. The trick is to remember that you don't "need" one; however, things are certainly improved by having one. Some larger LANs provide 100Mbit to the user, which helps with other issues. Saturating uplinks can cause problems with other traffic. That said, we have about 20 people on a 2Gbit uplink and don't have issues.
These days we work with plenty of RAM (too much, really); I forget how much, but it's in excess of 120G. We have about 6T of rust and a 2T SSD cache tier. I think the maths says that 1G of nginx memory would allow for 8T of disk (the rest of the box's memory then lets Linux promote files to memory as it does). That's all sitting on a 10Gbit NIC (but we could get away with 2Gbit).
When and where is your event?
Yeah, you are right. We have a 1G connection to all users, with a fair share on the WAN, so if we can push all downloads through the cache it would help a lot. The event is in Denmark, 27/02-01/04. If this works well and I can get LACP working as I want, we are going to upgrade to SSDs/M.2 and a 10G core switch.
-Tonny
I'm getting ~40MB per second from the cache. I have 6-7 15k RPM disks in RAID 0 for storage, 3 in RAID 5/6 for the system, 2x Xeon CPUs, and 32GB RAM, running Ubuntu 18.04.4 LTS.
I have tested my network speed from client to server at 850Mbit/s.
I have tested my RAID speed at ~1000MB/s.
My link is 1000Mbit/s from server to network.
I can't seem to find the issue; dmesg is clean.....
-Tonny