buger / goreplay

GoReplay is an open-source tool for capturing and replaying live HTTP traffic into a test environment in order to continuously test your system with real data. It can be used to increase confidence in code deployments, configuration changes and infrastructure changes.
https://goreplay.org

~25% of Requests are not captured by goreplay when compared with production #1247

Open Udaykumar519 opened 3 months ago

Udaykumar519 commented 3 months ago

Hello all,

We have set up GoReplay to mirror requests from our production server to a test server.

On production, we have a threshold of ~38 requests/sec.

We found that ~25% of requests are not captured by GoReplay when compared with production.

We tried increasing the buffer size to 10MB and then 20MB, but the issue persists.

Command we used:

sudo ./gor --input-raw :8983 --input-raw-bpf-filter "dst port 8983 and inbound" --input-raw-allow-incomplete --input-raw-buffer-size 20971520 --output-http="http://test-server-ip:8983"

Can someone please let us know the reason for this behaviour?
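
A gap like this can be measured by writing the captured traffic to a file with --output-file and counting the request records in it. The sketch below assumes the gor file layout visible in the sample output later in this thread: each record starts with a header line of the form "type id timestamp latency", and type 1 marks a request. The helper names count_requests and loss_percent are illustrative and not part of GoReplay; the expected total has to come from your own production logs.

```shell
#!/bin/sh
# Count request records (type 1) in a GoReplay output file.
# Assumption: every record begins with a header line such as
#   1 caf823170a7f01041ac0d68f 1712732785283674737 0
# (the trailing latency field is treated as optional here).
count_requests() {
    grep -c -E '^1 [0-9a-f]+ [0-9]+( [0-9]+)?$' "$1"
}

# Percentage of requests missing from the capture.
# Arguments: captured count, expected count (for example from access logs).
loss_percent() {
    captured=$1
    expected=$2
    awk -v c="$captured" -v e="$expected" \
        'BEGIN { printf "%.1f\n", (e - c) * 100 / e }'
}

# Example: one minute at 38 requests/sec is 2280 expected requests.
#   loss_percent "$(count_requests requests_0.gor)" 2280
```

Comparing over a fixed time window on both sides keeps the two counts aligned.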

buger commented 3 months ago

Hi! Can you confirm whether those requests are heavy POST requests with big bodies?

Udaykumar519 commented 3 months ago

Hi @buger, please note that those are GET requests.

Sample URL: /search/ims?q=neck+pillow&options.start=14&source=dir.search&glusrid=217882109&visitorid=1264510857&geo_country_info.geo_country_name=India&geo_country_info.geo_country_code=IN&geo_country_info.geo_country_ip=223.190.87.71&implicit_info.for_country.data=IN&implicit_info.for_country.type=India&AK=eyJ0eXAiOiJKV1QiLCJhbGciOiJzaGEyNTYifQ.eyJpc3MiOiJVU0VSIiwiYXVkIjoiNyoxKjAqNSo5KiIsImV4cCI6MTcxMjcyNDAwNCwiaWF0IjoxNzEyNjM3NjA0LCJzdWIiOiIyMTc4ODIxMDkiLCJjZHQiOiIwOS0wNC0yMDI0In0.upQauAJeThqPQa3Fawn5CgKP_oFvBPwH3ftxiky3Zw4&options.filters.mcategoryid=32609&options.filters.categoryid=609&is_translate=false&ip=

Even if the URL is long, shouldn't the parameter below help? We are already using it: "--input-raw-allow-incomplete"

Please let us know if there is anything to note here.

Udaykumar519 commented 3 months ago

Hi @buger ,

We found something interesting about the requests missing from gor.

As a recap, most GET requests follow the flow below [GoReplay captures these successfully]:

end-user --> load balancer --> varnish --> goreplay

Here, the gor output does not contain a message body.

Example output by gor:

1 caf823170a7f01041ac0d68f 1712732785283674737 0
GET orig_request HTTP/1.1
accept-encoding: gzip,deflate
user-agent: node-fetch/1.0 (+https://github.com/bitinn/node-fetch)
connection: close
accept: */*
Host: imsearch.indiamart.com:8983

ISSUE: Some GET requests come through a gateway [gor does not capture most requests of this type]:

end-user ---> gateway ---> load balancer ---> varnish ---> goreplay

Here, the gor output also contains a message body, along with extra headers.

Example output by gor:

1 cd6023170a80044419b2fe2f 1712733128572194909 0
GET orig_request HTTP/1.1 HTTP/1.1
connection: Keep-Alive
x-forwarded-proto: https
x-forwarded-for: 49.15.249.104, 34.102.235.255
via: 1.1 google
x-cloud-trace-context: 9add014c51c8b0114c27c829fc61ebde/17754672102780656855
accept-encoding: gzip
content-length: 657
content-type: application/x-www-form-urlencoded
authorization: Basic Y2F0ZWdvcnk6Y2F0ZWdvcnlfMTIzNDU=
user-agent: Dalvik/2.1.0 (Linux; U; Android 13; CPH2325 Build/TP1A.220905.001)/IM-App/old/13.2.7_13MAR24/612
host: imsearch.indiamart.com:8983
eg-request-id: 70dUvnHBwS6SxDsnb5vcT1

biztype_data=&VALIDATION_GLID=192308361&APP_SCREEN_NAME=Search&options_start=0&options_end=9&AK=eyJ0eXAiOiJKV1QiLCJhbGciOiJzaGEyNTYifQ.eyJpc3MiOiJVU0VSIiwiYXVkIjoiNyo4KjYqMCozKiIsImV4cCI6MTcxMjgxOTE2MCwiaWF0IjoxNzEyNzMyNzYwLCJzdWIiOiIxOTIzMDgzNjEiLCJjZHQiOiIxMC0wNC0yMDI0In0.SRAa7UtNVodgTBzsis5MZP3Z-W5MHfCdBro3g6_SkMY&source=android.search&implicit_info_latlong=&token=imartenquiryprovider&APP_USER_ID=192308361&implicit_info_city_data=&APP_MODID=ANDROID&q=Fridge&modeId=android.search&APP_ACCURACY=316.5&prdsrc=1&APP_LATITUDE=20.379368&APP_LONGITUDE=77.63924&VALIDATION_USER_IP=49.15.249.104&app_version_no=13.2.7_13MAR24&VALIDATION_USERCONTACT=7083630534

Kindly go through this; it would be a great help if you could pinpoint something here.
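
The two samples differ mainly in that the gateway request is a GET that carries a form-encoded body (content-length: 657). One way to check whether that shape is what gor misses is to send such a request by hand and see whether it shows up in the capture. This is only a hypothesis to test; nothing in the thread confirms it as the cause. The sketch below builds a raw request with printf. The function name build_get_with_body is made up for illustration, and the host is the placeholder from the original command.

```shell
#!/bin/sh
# Print a raw HTTP/1.1 GET request that carries a form-encoded body.
# Arguments: path, Host header value, body.
# Assumption: the body is plain ASCII, so its character count equals the
# byte count needed for Content-Length.
build_get_with_body() {
    path=$1
    host=$2
    body=$3
    printf 'GET %s HTTP/1.1\r\n' "$path"
    printf 'Host: %s\r\n' "$host"
    printf 'Content-Type: application/x-www-form-urlencoded\r\n'
    printf 'Content-Length: %d\r\n' "${#body}"
    printf 'Connection: close\r\n\r\n'
    printf '%s' "$body"
}

# Hypothetical usage against a test host (placeholder address):
#   build_get_with_body /search/ims test-server-ip:8983 'q=Fridge&source=android.search' \
#       | nc test-server-ip 8983
```

Sending a known number of these while gor is running, then counting how many appear in its output, would show whether body-carrying GETs are dropped more often than plain ones.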

buger commented 3 months ago

One more thing you can do is increase the buffer at the OS level:

sudo sysctl -w net.core.rmem_max=26214400

Additionally, set the following flag on goreplay (or use a bigger value):

--input-raw-buffer-size 10485760

You can also record the traffic to a pcap file with tcpdump:

sudo tcpdump -i any -w capture.pcap port 50100

and later feed it to goreplay like this:

gor --input-raw ./path-to-file.pcap:50100 --input-raw-engine pcap_file --output-stdout

Thanks!
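
For reference, the byte values quoted in this thread are whole mebibytes: 10485760 is 10 MiB, 20971520 is 20 MiB, and 26214400 is 25 MiB. A small helper (mib_to_bytes is an illustrative name, not a gor or sysctl feature) makes it easier to try other sizes without miscounting digits:

```shell
#!/bin/sh
# Convert a size in MiB to bytes (1 MiB = 1024 * 1024 bytes).
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

mib_to_bytes 10   # 10485760, the --input-raw-buffer-size suggested above
mib_to_bytes 20   # 20971520, the value in the original command
mib_to_bytes 25   # 26214400, the net.core.rmem_max value suggested above
```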

Udaykumar519 commented 2 months ago

Thank you, will try them

tuxflo commented 2 months ago

I'm facing similar issues. I already tried setting the rmem_max value and increasing the buffer size, but this didn't help. For reference, here is my complete command:

sudo ./gor --input-raw :50100 --output-file test.gor --input-raw-protocol binary --input-raw-override-snaplen --recognize-tcp-sessions --input-raw-buffer-size 10485760 --input-raw-track-response --input-raw-allow-incomplete --verbose 10 --input-raw-buffer-timeout 10s --input-raw-expire 10s

The capture file is about 5MB:

ls -l test_0.gor
-rw-r----- 1 root root 5395582 Apr 10 22:06 test_0.gor

The input file is 8MB and the pcap also reflects that:

ls -l test1.pcap
-rw-r--r-- 1 tcpdump tcpdump 8848918 Apr 10 22:10 test1.pcap

When I try to replay the traffic from the pcap file I get the following output:

$ ./gor --input-raw ./test1.pcap:50100 --input-raw-engine pcap_file --output-stdout
BPF Filter: (tcp dst port 50100)
2024/04/10 22:15:37 [PPID 3290610 and PID 3291898] Version:1.3.0
2024/04/10 22:15:37 can not identify link type of an interface 'pcap_file'