buger / goreplay

GoReplay is an open-source tool for capturing and replaying live HTTP traffic into a test environment in order to continuously test your system with real data. It can be used to increase confidence in code deployments, configuration changes and infrastructure changes.
https://goreplay.org

How to obtain the request id correctly in middleware #925

Open onioner opened 3 years ago

onioner commented 3 years ago

Hi, I encountered some problems when using middleware. In version 1.2.0 there are many duplicate request IDs. In version v1.3.0_RC1 I cannot find the request ID at all: the middleware only receives the HTTP headers and HTTP body. How should I get the UUID of the request/response/replay?

Thx!

hmih commented 3 years ago

Same, I'm not getting the ID anymore on the pre-release.

urbanishimwe commented 3 years ago

Which command are you using?

hmih commented 3 years ago
./gor \
    -input-raw ':80' \
    -input-raw-track-response \
    -input-raw-protocol http \
    -middleware ./middleware.py \
    -output-http 'http://10.244.0.89:1337'

where the IP is that of a remote host. This is my middleware.py:

#! /usr/bin/env python3
# -*- coding: utf-8 -*-

import sys
import fileinput
import binascii

def log(msg):
    """
    Logging to STDERR as STDOUT and STDIN used for data transfer
    @type msg: str or byte string
    @param msg: Message to log to STDERR
    """
    try:
        msg = str(msg) + '\n'
    except:
        pass
    sys.stderr.write(msg)
    sys.stderr.flush()

# Used to find end of the Headers section
EMPTY_LINE = b'\r\n\r\n'

def find_end_of_headers(byte_data):
    """
    Finds where the header portion ends and the content portion begins.
    @type byte_data: str or byte string
    @param byte_data: Hex decoded req or resp string
    """
    return byte_data.index(EMPTY_LINE) + 4

def process_stdin():
    """
    Process STDIN and output to STDOUT
    """
    for raw_line in fileinput.input():

        line = raw_line.rstrip()

        # Decode the hex-encoded line
        decoded = bytes.fromhex(line)

        # Split into metadata and payload, the payload is headers + body
        (raw_metadata, payload) = decoded.split(b'\n', 1)

        # Split the payload into headers and body
        headers_pos = find_end_of_headers(payload)
        raw_headers = payload[:headers_pos]
        raw_content = payload[headers_pos:]

        log('===================================')
        log('RAW METADATA:')
        log(raw_metadata)
        log(raw_metadata.split(b' ')[0])
        #request_type_id = int(raw_metadata.split(b' ')[0])
        #log('Request type: {}'.format({
        #  1: 'Request',
        #  2: 'Original Response',
        #  3: 'Replayed Response'
        #}[request_type_id]))
        log('===================================')

        log('Original data:')
        log(line)

        log('Decoded request:')
        log(decoded)

        encoded = binascii.hexlify(raw_metadata + b'\n' + raw_headers + raw_content).decode('ascii')
        log('Encoded data:')
        log(encoded)

if __name__ == '__main__':
    process_stdin()

Which is basically the one from the examples but with different logging.

This is what my logs are saying:

===================================
RAW METADATA:
b'POST / HTTP/1.1\r'
===================================
Original data:
504f5354202f20485454502f312e310d0a486f73743a2032302e38362e3232382e350d0a557365722d4167656e743a206375726c2f372e36342e310d0a4163636570743a202a2f2a0d0a436f6e74656e742d547970653a206170706c69636174696f6e2f6a736f6e0d0a436f6e74656e742d4c656e6774683a2033350d0a0d0a7b22757365726e616d65223a2278797a222c2270617373776f7264223a2278797a227d
Decoded request:
b'POST / HTTP/1.1\r\nHost: private-ip\r\nUser-Agent: curl/7.64.1\r\nAccept: */*\r\nContent-Type: application/json\r\nContent-Length: 35\r\n\r\n{"username":"xyz","password":"xyz"}'
Encoded data:
504f5354202f20485454502f312e310d0a486f73743a2032302e38362e3232382e350d0a557365722d4167656e743a206375726c2f372e36342e310d0a4163636570743a202a2f2a0d0a436f6e74656e742d547970653a206170706c69636174696f6e2f6a736f6e0d0a436f6e74656e742d4c656e6774683a2033350d0a0d0a7b22757365726e616d65223a2278797a222c2270617373776f7264223a2278797a227d
===================================
RAW METADATA:
b'HTTP/1.1 200 OK\r'
===================================
Original data:
485454502f312e3120323030204f4b0d0a446174653a205475652c203031204a756e20323032312031323a34363a303920474d540d0a436f6e74656e742d4c656e6774683a2031320d0a436f6e74656e742d547970653a20746578742f706c61696e3b20636861727365743d7574662d380d0a0d0a68656c6c6f2c20776f726c64
Decoded request:
b'HTTP/1.1 200 OK\r\nDate: Tue, 01 Jun 2021 12:46:09 GMT\r\nContent-Length: 12\r\nContent-Type: text/plain; charset=utf-8\r\n\r\nhello, world'
Encoded data:
485454502f312e3120323030204f4b0d0a446174653a205475652c203031204a756e20323032312031323a34363a303920474d540d0a436f6e74656e742d4c656e6774683a2031320d0a436f6e74656e742d547970653a20746578742f706c61696e3b20636861727365743d7574662d380d0a0d0a68656c6c6f2c20776f726c64

For this request:

#!/bin/sh

set -e

curl --header "Content-Type: application/json" \
  --request POST \
  --data '{"username":"xyz","password":"xyz"}' \
  http://private-ip/
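
In the logs above, the first line of the decoded data is the HTTP start line (`POST / HTTP/1.1`, `HTTP/1.1 200 OK`), not a metadata header, so `raw_metadata.split(b' ')[0]` can never yield a type or an ID. A minimal sketch of a defensive parse follows. It assumes that the metadata header, when present, is a single space-separated line of the form `<type> <id> <timestamp>`, with the type values from the commented-out mapping in the middleware. The helper name `split_message` and the sample ID and timestamp are hypothetical, made up for illustration.

```python
import binascii

# Payload types, taken from the commented-out mapping in the middleware above.
PAYLOAD_TYPES = {
    b'1': 'Request',
    b'2': 'Original Response',
    b'3': 'Replayed Response',
}


def split_message(hex_line):
    """
    Split one hex-encoded middleware line into (metadata, http_payload).

    Assumption: a metadata header, if present, is the first line and looks
    like b'<type> <id> <timestamp> ...'. If the first line does not match
    (for example, it is already the HTTP start line), metadata is None and
    the whole decoded message is returned as the payload.
    """
    decoded = binascii.unhexlify(hex_line.strip())
    first_line, sep, rest = decoded.partition(b'\n')
    fields = first_line.strip().split(b' ')
    if sep and len(fields) >= 3 and fields[0] in PAYLOAD_TYPES:
        metadata = {
            'type': PAYLOAD_TYPES[fields[0]],
            'id': fields[1].decode('ascii'),
            'timestamp': fields[2].decode('ascii'),
        }
        return metadata, rest
    return None, decoded


if __name__ == '__main__':
    # Hypothetical message with a metadata header (ID and timestamp invented).
    with_meta = binascii.hexlify(
        b'1 abc123 1622551569000000000\nGET / HTTP/1.1\r\n\r\n'
    ).decode('ascii')
    print(split_message(with_meta))

    # Message shaped like the ones in the logs above: no metadata header.
    without_meta = binascii.hexlify(
        b'POST / HTTP/1.1\r\nHost: private-ip\r\n\r\n'
    ).decode('ascii')
    print(split_message(without_meta))
```

Returning `None` for the metadata instead of raising lets the middleware keep running and log which messages arrive without a header, which is the situation reported here.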