Closed amarekano closed 3 years ago
This is great, thanks for writing down your thoughts this way!
I think at this point my main question is whether we think that the TemplateLiterals need to be valid JavaScript code or not. Here is my understanding of what we currently have:
eval
LoadString
) that will mostly just contain random string data (sometimes random characters, sometimes things like property or method names, etc.). Those are pretty much for every other use case of strings other than passing it to eval
I'm not sure that we need these new template literals (which do interpolation) to also contain valid JavaScript code, since we already have the CodeStrings for that, and it seems like ensuring that the result after interpolation is still valid JavaScript code is probably non-trivial? In that case, the implementation would likely be simpler, since it can be a single Operation CreateTemplateLiteral
that e.g. contains a let parts: [String]
, has parts.count - 1
inputs, and then lifts to something like
`part0${input0}part1${input1}part2`
(where input0
is whatever expression the lifter has for input(0)
, etc.)
What are your thoughts on this?
Thanks for the feedback!
I think at this point my main question is whether we think that the TemplateLiterals need to be valid JavaScript code or not.
My main motivation to implementing TemplateLiterals in FuzzIL is to allow importing testcases from the various JS engine repos (for example about a quarter of the testcases for JSC are implemented using templates. Here's an example) and I think it would be easier to implement this natively in FuzzIL rather than having a parser lift templates to JS and then compile JS to FuzzIL. Having said that I think TemplateLiterals should be valid JS but we need not validate that in fuzzilli.
I'm not sure that we need these new template literals (which do interpolation) to also contain valid JavaScript code, since we already have the CodeStrings for that, and it seems like ensuring that the result after interpolation is still valid JavaScript code is probably non-trivial? In that case, the implementation would likely be simpler, since it can be a single Operation CreateTemplateLiteral that e.g. contains a let parts: [String], has parts.count - 1 inputs, and then lifts to...
I think your approach with using CreateTemplateLiteral
as you've described previously would work for imported testcases. We ensure that the string parts of the template are immutable and that the identifiers/expressions of the template are mutable. The only change in the implementation that I would suggest is recording type information for the inputs so that we can perform relatively more intelligent mutations.
I don't believe there's much value in generating templates that aren't valid JS since at that point we may as well use LoadString
to generate random strings.
WDYT?
Cool, thanks for linking those examples, that makes sense! Yeah I agree, for supporting those, it's probably easiest to go with a single CreateTemplateLiteral
operation and have the FuzzIL compiler split the literals into string parts and expressions. Then, the example you linked would effectively consist of four CreateTemplateLiteral
instructions and some BinaryOperations for the concatenations etc.
I'm not sure if adding type information will work for that use case though, since the mutations (in particular, the InputMutator) anyway completely ignore type information. The only case I can think of is splicing, when it rewires some of the inputs. But then you'd either need the compiler to be able to figure out type information of those inputs, which it isn't capable of, or somehow do some post processing when importing the samples into Fuzzilli to determine the current types. Not sure if it's worth the effort?
I don't believe there's much value in generating templates that aren't valid JS since at that point we may as well use LoadString to generate random strings.
I guess it depends on what you are trying to fuzz. But yeah, for the use case you listed, it should be fine to use them as code without trying to enforce that in FuzzIL somehow.
But then you'd either need the compiler to be able to figure out type information of those inputs, which it isn't capable of, or somehow do some post processing when importing the samples into Fuzzilli to determine the current types. Not sure if it's worth the effort?
Yes, you're right. Both these options are non-trivial to implement and I feel for now it's probably not worth the effort.
I've started putting the PR together for this and I've run into an issue with inlining that I'm unable to work around. Here's a testcase that I've written:
let v0 = b.loadInt(1337)
let v1 = b.createTemplateLiteral(with: ["Hello", "World"], andIdentifiers: [v0])
let program = b.finalize()
let lifted_program = fuzzer.lifter.lift(program)
let expected_program = """
let v0 = 1337;
let v1 = `Hello${v0}World`
Due to the inlining policy, literals are inlined and that results in the lifted program being let v1 = `Hello${1337}World`
. I'm not sure on how I can temporarily disable this policy based on context. What I would like to do is disable/override inlining when lifting in the context of certain instructions. Any thoughts on how I should go about implementing this?
So why do you want to disallow inlining in that case? I think it's fine for the lifter to behave the way it does for other instructions, i.e. inline literals but pretty much nothing else. Then you'll still get things like the example you linked working just fine while avoiding code such as
const v2 = 1337;
const v3 = `Hello${v2}World`;
which is a bit less intuitive to read for a person.
It's probably going to be hard to selectively disable inlining because the lifter has to decide whether to emit a variable declaration when it processes the LoadInt
, not when it uses it in the CreateTemplateLiteral
.
For some reason I thought this would error out (i.e. inlining literals in the template) but testing shows that inlining works! So we should be good here.
implemented in #208
Hi everyone, I'm going to take a stab at adding support for template literals to the fuzzer. I wanted to gather thoughts/feedback on how operations and lifting should be designed.
Typically template literals have the following structure:
A series of js snippets interspersed with expressions/identifiers that are wrapped in
${}
. If we think of the expressions/identifiers as parameters to a function (where the function here would be the template literal), then template literals can be represented quite easily by duplicating theBeginAnyFunctioDefinition
andEndAnyFunctionDefinition
. An example of what that might look like is listed below:If we agree to using these operation definitions then code generation becomes fairly straight forward: ProgramBuilder.swift:
CodeGenerators.swift:
Lifting Templates would be something as follows:
Design Considerations:
Literal
should work here but I'm not sure if we should create a new expression just for templates and if there are any benefits to doing that. One consideration may be that inlining expressions may trigger different engine behaviour to simply adding variable identifiers.BeingCodeString
andEndCodeString
and it might be worth considering extending these existing operations rather than creating new ones.BeginTemplateLiteral
andEndTemplateLiteral
) are similar in design toBeginAnyFunctionDefinition
, I think we would not need to any additional analysers, mutators, reducers, etc and reuse the existing code infra.Let me know if there are any other considerations/design choices to consider and I'll cobble together a PR once we arrive at a consensus.