tsenart / vegeta

HTTP load testing tool and library. It's over 9000!
http://godoc.org/github.com/tsenart/vegeta/lib
MIT License

dumping to CSV includes gobbledygook, bloats file #295

Closed calebharris closed 6 years ago

calebharris commented 6 years ago

Version and Runtime

Version: v8.0.0
Commit: 66f3db7f7dcc749f10144cbe4289f32adae346d3
Runtime: go1.10.3 darwin/amd64
Date: 2018-06-12T13:54:46Z+0100

Expected Behaviour

vegeta dump should output a CSV or JSON file with the six columns specified in the README.

Actual Behaviour

dump produces a CSV with 10 columns, the 7th of which has a huge amount of data -- maybe an encoded form of the request and response?

Haven't looked at the JSON output yet, but have verified that it's larger than the original .bin file.

If those columns are supposed to be there, flags should exist to turn off including the full request/response in the dump file.

Steps to Reproduce

  1. Capture an attack
  2. Run the dumper on the .bin file
  3. Verify that the extra columns exist

Additional Context

I'm happy to send along my test files, if needed.

tsenart commented 6 years ago

> vegeta dump should output a CSV or JSON file with the six columns specified in the README.

The README is out of date. Since v7.0.0, the dump command also outputs response bodies: https://github.com/tsenart/vegeta/releases/tag/v7.0.0

The body is base64 encoded.

> If those columns are supposed to be there, flags should exist to turn off including the full request/response in the dump file.

This can be easily filtered with jq or csvkit.

calebharris commented 6 years ago

Ah, that explains it. I did end up using jq to filter it, and will take a look at csvkit. I was hoping to keep the filtered output around and use it to feed vegeta report, but it doesn’t look like JSON is a valid input to report. That would be super convenient. Thanks for responding!

tsenart commented 6 years ago

You can store the output of vegeta attack and feed it to report. It’s in a binary format called Gob. Would that work for you?

tsenart commented 6 years ago

Please re-open if you have further issues or questions.

calebharris commented 6 years ago

Yes, I've been making extensive use of the ability to save the attacks and feed them to report. Which is why I'm running low on disk space. My end goal is to have a thing to feed to report that doesn't include the response bodies (since I'm only interested in the metrics), in order to save significant space, while still collecting a large number of samples. I'm sure there are tools out there to transform Gob -- but if I manage to remove the response bodies, will I still be able to feed the output to report?

tsenart commented 6 years ago

Would compressing the Gob results with something like Gzip fit your space constraints? Please give it a try and let me know your thoughts.

echo "GET http://..." | vegeta attack -rate=1000 | gzip -9 | tee results.gob.gzip | gzcat | vegeta dump

calebharris commented 6 years ago

That certainly helps. Got a 4.1 GB file down to 600 MB. Of course, the JSON with bodies removed is only 2.1 MB. If I have time, I'll try submitting a pull request with options for eliding the body in the attack recording. Thanks for taking the time to respond!

tsenart commented 6 years ago

@calebharris: Please file a feature request issue to track that, and reference this issue there.