dvyukov / go-fuzz

Randomized testing for Go
Apache License 2.0

Provenance of corpus data? #346

Closed eswdd closed 1 year ago

eswdd commented 1 year ago

Hi,

My company doesn't allow us to use Go packages that contain encrypted files without explicit allowlisting once we know the contents of the files. https://github.com/klauspost/compress recently added CI/CD fuzz testing and, as part of that, copied across the zip test data from the go-fuzz-corpus. Two of the files in that data set appear to be encrypted (according to some heuristics applied on the boundary): 14.zip and 19e22ee834f02d145ae7072020443bc7ff06a965-2.

Whilst I appreciate the contents of the files are not relevant for fuzz tests, I am required to verify the contents before I can bring the package in. The first appears to be a password-protected zip, for which I cannot find the password anywhere; about the second I have no idea.

Would it be possible to provide or describe the provenance of these files so that I can verify the contents? Or perhaps even the password for the zip? Or is the second input generated by go-fuzz itself and just happens to look encrypted?

In case it's of interest, I already asked in klauspost/compress and was directed here (https://github.com/klauspost/compress/discussions/791).

Many thanks, Simon

josharian commented 1 year ago

Given the filename and context, with high confidence, 19e22ee834f02d145ae7072020443bc7ff06a965-2 was randomly generated by go-fuzz.

Maybe @dvyukov can speak to 14.zip, but given the number of intervening years, it is unlikely. If I had to guess, I'd say he probably zipped a hello_world.go, mashed the keypad to generate a random password, and moved on. As you say, neither the content nor the password matter to fuzzing, so I suspect he didn't give either of them a moment's thought.

eswdd commented 1 year ago

Turns out I didn't think to use a zip cracker. The password is the blindingly obvious one, which gives me the very exciting content:

simon$ unzip 14.zip 
Archive:  14.zip
[14.zip] aaa password: 
 extracting: aaa                     
 extracting: bbb                     
simon$ cat aaa
111222
simon$ cat bbb
asidhuasih a
 sdjn osdn 
sdvs ndv

Thank you for the help.