Closed dblotsky closed 6 years ago
@FedericoAureliano thanks for the clarifications. A few follow-up questions:
Also, I've finally dug them up, and the transformations from Ifaz's work were:
I'll send you the paper via email, once I get a copy from Ifaz.
@dblotsky
- How would you summarise, in words, the mechanism by which litfuzz fuzzes literals?
Basically when it arrives at a literal it modifies it to a "similar" literal. For strings this means that we iterate through the characters, for each character we either delete the character, keep it, or replace it with a random (short) string based on some probability. For a literal integer k this means replacing k with k+r where r is a random number between -k and k.
- I noticed that litfuzz also changes the structure of regexes. Is that expected?
I added in some regex operator modifiers that will, with some probability, change a regex operator to another with the same arity and argument types. So for example a * can become a + and vice versa. This was meant to produce more variety but perhaps doesn't belong here.
- What do you think of combining the extend/prune transformers into one that does both?
That's a good idea. I'll do that.
I'll send you the paper via email, once I get a copy from Ifaz.
Thanks! Rotating actually goes beyond just arguments so it should be different, but complementary to, permute. Adding those in will be great.
Hey @dblotsky,
What do you think of the updates? I think I addressed all the issues.
We now have three new transformers:
@FedericoAureliano thanks for the clarifications and the modifications. I've left some more comments. Sorry for the delay!
Hey @dblotsky, I addressed all the comments except for a permanent fix for types. I think they should live inside the nodes themselves but I don't see an immediate clean way to do this. Literals, Identifiers, and Expressions should have a type signature, but having it inside ASTNode, the only class that unites them, seems too high up.
What do you think the best solution is for types?
@FedericoAureliano thanks for the changes! Yeah, I think it would be ok to keep the type signature data in ASTNode
and its subclasses. But that can be done in the future. The current separate file should be ok for now.
There's one gotcha that I came across: we can't distinguish between the type () -> String/Int/Bool
(i.e. a constant or variable) and String/Int/Bool -> ()
, but I can't immediately think of anything that has type String/Int/Bool -> ()
.
Actually, I can: (assert X)
. Its type in this type scheme would be BOOL
, same as the type of a Boolean identifier.
That's a good point. We'll have to be careful with that when we build in types. We should also build in identifier types in the process.
How should we go about doing that, and how high of a priority is it? I think it deserves a separate branch.
@FedericoAureliano yep, I agree that it's something for a later time. We could also work around it by just changing a type definition to be more like the real thing: X -> Y
. For now, this looks good to me.
Last questions:
graft
be able to graft more than 1 branch of each type? If not in this PR, in a later one?Hey @dblotsky sorry for the slow response. I added Lfaz's transformers (translate, reverse, multiply).
I've been testing the transformers on the problems generated by stringfuzzg at different depths. stringfuzzg concats and regex have been especially useful for this.
I think graft is OK with one branch for each type because each modification produces a problem that is different enough from the original to test the solvers and we can repeat the process for even more variation.
That being said, we could let the user specify this with command line arguments. Graft and multiply would perhaps benefit the most from this out of all the transformers.
Hey @FedericoAureliano, those look pretty great to me! I left a few comments on the new code. Speaking of it: what do you think of splitting them off into a second PR? That way you can close the previous work as "done", and focus on finishing up only the new transformers.
Also, one small thing: it is actually a capital "i" in "Ifaz", not an "L". Would you be ok changing that in the commit message and the PR? You can squash the last two commits and force-push to this branch. GitHub should still keep the PR intact. Alternatively, if you're ok with splitting the new transformers off into a new branch and PR, you can just create new commits for them.
Also @FedericoAureliano, I noticed that graft
just slices off a node in this example:
stringfuzzg regex -t 2 -m 10 -x 10 | stringfuzzx graft
changing this:
(declare-fun var0 () String)
(assert (str.in.re var0 (re.++ (str.to.re "#hAhc<w'{:") (str.to.re "R5x$!PCZJ-"))))
(check-sat)
to this:
(declare-fun var0 () String)
(assert (str.in.re var0 (str.to.re "#hAhc<w'{:")))
(check-sat)
This seems like it's is because the leaf node gets swapped with its predecessor, and the predecessor becomes a disconnected component of the graph. What do you think is a good way of fixing this?
Also @FedericoAureliano, I've found and fixed some scanner bugs which you helped uncover. Turns out I was treating integers as identifiers, and not parsing Boolean literals at all! 😢 They're fixed in 106bcc1, so you can rebase on top of master, or just merge with it.
Sorry for so many messages!
Speaking of it: what do you think of splitting them off into a second PR? That way you can close the previous work as "done", and focus on finishing up only the new transformers.
Will do!
Also, one small thing: it is actually a capital "i" in "Ifaz", not an "L".
My mistake! Sorry Ifaz if you see this :P
I noticed that graft just slices off a node in this example
Yup, since the swapping of the expr and literal happens on the same "path" (the expr is an ancestor of the literal) you get this effect.
What do you think is a good way of fixing this?
I had intended this behaviour so that the transformer will still have an effect on single path examples, but am open to changing the behaviour.
They're fixed in 106bcc1, so you can rebase on top of master, or just merge with it.
Awesome! What I'll do is take Ifaz's transformers out of this PR, merge this PR with master, and then make a new branch and PR for Ifaz's transformers. Does that sound good?
Thank you @dblotsky for all the code reviews! Let me know if I can help with the paper (proof read or whatever you need).
Awesome, thanks!
Yup! Especially if we do chaining through shell pipes.
I was hoping to have extend add a random subtree at a leaf. Currently, it replaces a leaf with a duplicated subtree from elsewhere in the AST so that we don't have to deal with adding definitions to the AST.
Basically, stub prunes subtrees, extend appends subtrees, rotate transformes the shape by rotating triples, and litfuzz fuzzes literals/operators.
Maybe it's better to combine these so that we don't rely on chaining so much. Also, some of the reasoning is based on types (e.g. replace string expressions with string expressions) so building this into the walker could be useful.