Randomize code formatting

lydell / eslump

Fuzz testing JavaScript parsers and suchlike programs.

MIT License


Randomize code formatting #1

Closed by not-an-aardvark 7 years ago

not-an-aardvark commented 7 years ago

I've been looking into using eslump for fuzz-testing ESLint. So far, it's been very useful and has already found a few bugs in rules. Thanks!

As far as I can tell, this module randomly generates an AST, but does not randomize how the AST is formatted. For example, here's some code generated with eslump.generateRandomJS(). The code is gibberish, but its formatting is very consistent.

for (const fffomhowjo in this) try {
  class mnrvq {}
  debugger;
} catch (ilfbunnhxnk) {
  while (class {}) r: ;
  throw arguments;
}
export class a {}
export {} from "ä";

This makes eslump useful for testing rules that check the AST, but it's not as useful for testing rules that deal with formatting. It would be nice if eslump could randomize the AST printing as well.

Potential formatting-related randomizations:

Insert random linebreaks/whitespace (when the syntax allows it)
Parenthesize random expressions
Remove semicolons sometimes

lydell commented 7 years ago

Good idea!

Have you tried the --comments option? It's not obvious at all, but I think it does partly what you're looking for: it turns all of that nicely formatted whitespace into random whitespace, with random comments in it.

not-an-aardvark commented 7 years ago

Yes, I've tried the comments option. To clarify, does it also insert/remove whitespace in places where it doesn't insert comments? (It's a bit hard to tell since the resulting code is so hard to read, as expected.) If so, that solves the whitespace issue.

lydell commented 7 years ago

Without the comments option, you get that nicely formatted output, as you showed in the first post. The comments option is a hack on top of that. It replaces all whitespace in that nicely formatted output with random sequences of insignificant JS. That includes spaces, tabs, newlines, and other whitespace, as well as comments. For each "formatted whitespace", one or more of those things are picked. So sometimes you get only whitespace, sometimes only comments, and sometimes both.
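To illustrate the idea, here is a minimal sketch of that kind of whitespace replacement. It is not eslump's actual code: the names randomizeWhitespace and FILLERS are hypothetical, and it assumes the input is already-formatted code with explicit semicolons and no whitespace inside string, template, or regex literals. To stay safe around restricted productions such as `throw`, it only picks from spaces, tabs, and single-line block comments, and it keeps a line break only where the formatted output already had one.

```javascript
// Hypothetical sketch, not eslump's implementation.
// Fillers are "insignificant JS" that never introduce a new line terminator.
const FILLERS = [" ", "\t", "/* x */"];

// Replace every whitespace run in `code` with 1-3 randomly picked fillers.
// `random` is injectable (defaults to Math.random) so runs can be reproduced.
function randomizeWhitespace(code, random = Math.random) {
  return code.replace(/\s+/g, (run) => {
    const count = 1 + Math.floor(random() * 3);
    let out = "";
    for (let i = 0; i < count; i++) {
      out += FILLERS[Math.floor(random() * FILLERS.length)];
    }
    // Keep a single line break where the original run had one, so no
    // newline ever appears where the formatted code did not have one.
    return run.includes("\n") ? out + "\n" : out;
  });
}

// Usage: feed it the nicely formatted output.
const formatted = "for (const a in this) try {\n  debugger;\n} catch (e) {\n  throw arguments;\n}";
console.log(randomizeWhitespace(formatted));
```

Because each run gets at least one filler, tokens that were separated stay separated. Sometimes the result is only whitespace, sometimes only comments, and sometimes both.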

One way to go here is to add more "hacks" for semicolons and parentheses. A more ambitious approach would be to move the "random codegen" to a separate project (this module was only ever intended to be a CLI). An even more ambitious approach would be to take the lessons learned in shift-fuzzer and write a new fuzzer for ESTree (including JSX and Flow support).

I think at least the parentheses thing might be doable as a "hack" without too much effort. I'll try to look into that soon.

Whether I'll ever get around to the more ambitious stuff, we'll see. I would really like to do it, but it is a large project.

lydell commented 7 years ago

Please try out 1.5.0 and see what you think!

not-an-aardvark commented 7 years ago

Looks great. Thanks!