Closed: @redbeardmeric closed this issue 3 years ago.
Hi @redbeardmeric ,
I'm not sure what your NUMA core map looks like, but I suspect your coremask might not be optimal. If all of your cores are on socket 1, like your interface, your coremask would be 0xFFFFFFFFFC. You can check this by running usertools/cpu_layout.py, provided by DPDK.
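For reference, a coremask like the one above can be derived from the core IDs that cpu_layout.py reports. This is only an illustrative sketch; the core range below is an assumption for a 40-core box, not taken from any actual layout:

```python
# Sketch: build a DPDK coremask (the -c argument) from a list of core IDs.
# The core list below is hypothetical; use the real socket-1 core IDs
# reported by dpdk's usertools/cpu_layout.py.

def coremask(cores):
    """Return the hex coremask selecting the given core IDs."""
    mask = 0
    for core in cores:
        mask |= 1 << core      # set the bit for each selected core
    return hex(mask)

# Cores 2-39 selected (bits 0 and 1 left out for the OS):
print(coremask(range(2, 40)))  # -> 0xfffffffffc
```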
When you get the "Failed to allocate TCB" message, most likely your tcb-pool-sz is not big enough. I suggest you maximize it and lower ucb-pool-sz to better tune your config.
I also suggest checking out our script at https://github.com/Juniper/warp17/blob/dev/common/python/starter.py, which could help with these kinds of issues.
Hi @davvore33,
Thanks for the quick response.
cpu_layout.py shows me that I have 2 physical chips with 20 cores each, one on socket 0 and one on socket 1. The NIC I'm using is on socket 1, so I am using only the cores on socket 1, plus one from socket 0 for management/CLI. That's how I understood it to be the best way to run.
I'm trying to run starter.py, but I'm getting a KeyError at line 784: qmaps_res = config.get_qmaps_args()
@redbeardmeric The test config you're running,
add tests client tcp port 0 test-case-id 0 src 10.0.0.100 10.0.0.102 sport 10001 60000 dest 10.0.0.253 10.0.0.253 dport 6001 6200
will try to generate 24M connections, ~~but the default "connection block" memory pool only allocates 10M~~. As @davvore33 suggested, you need to increase the value of the tcb-pool-sz argument ~~accordingly~~. Furthermore, because this pool is distributed across all cores and might not match the session distribution, an alternative is to also pass --mpool-any-sock.
@dceara Thanks for the additional information. I have changed my core mask as @davvore33 suggested and increased tcb-pool-sz to 65536, and I get an mpool error.
I'm curious, how did you calculate that this will run 24M connections?
I appreciate the help.
> @dceara Thanks for the additional information. I have changed my core mask as @davvore33 suggested and increased tcb-pool-sz to 65536, and I get an mpool error.
You might need to increase the memory size available to warp17:
https://github.com/Juniper/warp17#dpdk-command-line-arguments
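As a rough sanity check (numbers taken from later in this thread: 32768 hugepages of 2MB each, and -m 51426), the configured hugepage memory has to cover whatever -m requests:

```python
# Rough sanity check: total 2MB-hugepage memory vs the -m value (in MB)
# passed to warp17 on the DPDK command line.
HUGEPAGE_MB = 2
nr_hugepages = 32768             # e.g. from /proc/sys/vm/nr_hugepages
requested_mb = 51426             # the -m argument

total_mb = nr_hugepages * HUGEPAGE_MB
print(total_mb)                  # -> 65536
print(requested_mb <= total_mb)  # -> True: the -m request fits
```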
> I'm curious, how did you calculate that this will run 24M connections?
I did 3 source IPs * 50000 source tcp ports * 1 dest IP * 200 dest tcp ports
but my math was wrong, it should've been 30M indeed, sorry.
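For what it's worth, the corrected arithmetic checks out, using the ranges from the test config earlier in the thread:

```python
# Total sessions = src IPs * src ports * dest IPs * dest ports,
# using the ranges from the "add tests client" command above.
def range_count(lo, hi):
    """Number of values in an inclusive range."""
    return hi - lo + 1

sessions = (range_count(100, 102)        # src 10.0.0.100-10.0.0.102 -> 3 IPs
            * range_count(10001, 60000)  # sport 10001-60000 -> 50000 ports
            * 1                          # dest 10.0.0.253 -> 1 IP
            * range_count(6001, 6200))   # dport 6001-6200 -> 200 ports
print(sessions)  # -> 30000000
```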
> I appreciate the help.
No worries, hope we figure it out.
@dceara Ok, good, I thought that's what you were looking at.
What I'm gathering is that, within the limits of my system hardware, if I want to increase the number of IPs I see coming from WARP17, I need to decrease the number of sessions per IP. Would this be a correct interpretation?
> @dceara Ok, good, I thought that's what you were looking at.
> What I'm gathering is that, within the limits of my system hardware, if I want to increase the number of IPs I see coming from WARP17, I need to decrease the number of sessions per IP. Would this be a correct interpretation?
@redbeardmeric That's an option too, the source port range is quite large (50K) in your example.
@dceara
So I'm using this with the attached file: ./warp17/build/warp17 -c FFFFFFFFFC -m 51426 -- --qmap-default max-c --mpool-any-sock --tcb-pool-sz 65536 --cmd-file ./warp17/examples/Trafficscript.txt
This runs, and sampling specific IPs from this range shows traffic originating from IPs in the range. In the UI I see the client sending packets, and the server receiving and sending packets, but the client does not get the TCP responses.
Is there a configuration that I'm missing, or will I not be able to get the responses due to the IP range?
> @dceara
> So I'm using this with the attached file: ./warp17/build/warp17 -c FFFFFFFFFC -m 51426 -- --qmap-default max-c --mpool-any-sock --tcb-pool-sz 65536 --cmd-file ./warp17/examples/Trafficscript.txt
> This runs, and sampling specific IPs from this range shows traffic originating from IPs in the range. In the UI I see the client sending packets, and the server receiving and sending packets, but the client does not get the TCP responses.
> Is there a configuration that I'm missing, or will I not be able to get the responses due to the IP range?
Looks to me like the server side configuration is wrong. This:
add tests server tcp port 1 test-case-id 0 src 10.1.134.164 10.1.134.164 sport 6001 6001
should probably be:
add tests server tcp port 1 test-case-id 0 src 10.1.134.164 10.1.134.164 sport 6001 6201
I didn't test this out but, IIRC, without the change above, the server will just RST all SYN packets sent with dest port in range 6002-6201.
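The mismatch described above can be expressed as a simple range-containment check. This is just a sketch of the reasoning, not WARP17 code:

```python
# Sketch: a SYN is only accepted if its destination port falls inside the
# server's listening sport range; otherwise the server RSTs it.
def covered(client_dports, server_sports):
    """True if every client dest port is inside the server's sport range."""
    return (server_sports[0] <= client_dports[0]
            and client_dports[1] <= server_sports[1])

print(covered((6001, 6200), (6001, 6001)))  # original config -> False
print(covered((6001, 6200), (6001, 6201)))  # corrected config -> True
```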
I'm sure that would have been an issue in the future, but I just implemented this fix and I'm still not getting any server response to the client.
Thanks!
From the Trafficscript.txt file,
the line below needs to be modified from
add tests client tcp port 0 test-case-id 0 src 10.0.0.3 10.1.134.163 sport 10001 10001 dest 10.1.134.164 10.1.134.164 dport 6001 6201
to
add tests client tcp port 0 test-case-id 0 src 10.0.0.3 10.0.0.4 sport 10001 10500 dest 10.1.134.164 10.1.134.164 dport 6001 6100
and, as @dceara mentioned, the server test should be modified from
add tests server tcp port 1 test-case-id 0 src 10.1.134.164 10.1.134.164 sport 6001 6001
to
add tests server tcp port 1 test-case-id 0 src 10.1.134.164 10.1.134.164 sport 6001 6100
The above configuration changes should work to achieve 100K sessions: 2 (client IPs) * 500 (source ports) * 100 (dest ports) * 1 (server IP).
2 * 500 * 100 * 1 = 100K.
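The same count can be derived straight from the command line itself. A quick ad-hoc sketch (the parsing regex is mine, not part of WARP17):

```python
import ipaddress
import re

# Sketch: compute the session count directly from a WARP17
# "add tests client" command line. The regex is ad hoc.
cmd = ("add tests client tcp port 0 test-case-id 0 "
       "src 10.0.0.3 10.0.0.4 sport 10001 10500 "
       "dest 10.1.134.164 10.1.134.164 dport 6001 6100")

def ip_count(a, b):
    """Number of addresses in the inclusive range a..b."""
    return int(ipaddress.ip_address(b)) - int(ipaddress.ip_address(a)) + 1

m = re.search(r"src (\S+) (\S+) sport (\d+) (\d+) "
              r"dest (\S+) (\S+) dport (\d+) (\d+)", cmd)
src_a, src_b, sp_lo, sp_hi, dst_a, dst_b, dp_lo, dp_hi = m.groups()

sessions = (ip_count(src_a, src_b)               # 2 client IPs
            * (int(sp_hi) - int(sp_lo) + 1)      # 500 source ports
            * ip_count(dst_a, dst_b)             # 1 server IP
            * (int(dp_hi) - int(dp_lo) + 1))     # 100 dest ports
print(sessions)  # -> 100000
```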
@chandrasheker
Yes, that gets 100k sessions, but I'm specifically trying to get 100k IPs, which
add tests client tcp port 0 test-case-id 0 src 10.0.0.3 10.1.134.163 sport 10001 10001 dest 10.1.134.164 10.1.134.164 dport 6001 6201
accomplishes. The only issue is that the server responses are not making it back to the 100k clients, even with the
add tests server tcp port 1 test-case-id 0 src 10.1.134.164 10.1.134.164 sport 6001 6100
modification.
The more I think about it, I think maybe the sessions might be closing too fast to receive the response.
100k IPs?? No, AFAIU WARP17 does not support configuring and running tests with that many IP addresses.
https://github.com/Juniper/warp17/blob/dev/common/api/warp17-common.proto#L74: this value restricts the IP addresses that can be configured to a maximum of 10.
@dceara and @davvore33 correct me if I'm wrong.
Only 10 at L3. The file I attached has 100k IPs at L4, which is working; it's just the issue with the responses.
"show arp entries" output please.
Just as a suggestion, can you modify l3_gw on the server and client ports as below in the Trafficscript.txt file and let me know the behavior, if possible?
add tests l3_gw port 0 gw 10.1.134.164
add tests l3_gw port 1 gw 10.0.0.1
@chandrasheker Apologies for dropping out. I have modified l3_gw on the server and client, and the responses are now getting through. Thanks!
I now have 100k IPs w/ 40 sessions each transmitting and receiving.
Thanks everyone for the help.
I'm trying to create a test that will simulate a traffic load from many IP addresses, the number thrown at me was 100k.
./warp17/build/warp17 -c AAAAAAAAAB -- --qmap-default max-q --tcb-pool-sz 32768 --cmd-file ./warp17/examples/LATech_MultipleIP.cfg
I'm running on a 2-socket, 40-core system; the NIC is on socket 1, hence the 0xAAAAAAAAAB.
I have 32768 * 2MB huge pages.
I've gotten 3 simultaneous source IPs using: add tests client tcp port 0 test-case-id 0 src 10.0.0.100 10.0.0.102 sport 10001 60000 dest 10.0.0.253 10.0.0.253 dport 6001 6200
If I try to increase beyond this point I get "Failed to allocate TCB."
Is 100k possible? If not, how high could I reasonably push it, and what am I doing wrong?
Thanks