google / syzkaller

syzkaller is an unsupervised coverage-guided kernel fuzzer
Apache License 2.0
5.32k stars 1.21k forks source link

sys/fuchsia: richer types in fidl descriptions #727

Open dvyukov opened 6 years ago

dvyukov commented 6 years ago

Currently fidl descriptions only capture physical types of fields. We need to capture semantic meaning of fields somehow. For example:

// fidl_io.txt
fuchsia_io_DirectoryOpenRequest {
...
    pathInLine  fidl_string
..
} [packed]

// fidl_net.txt
fuchsia_net_IPv4AddressInLine {
    addrInLine  array[int8, 4]
} [packed]

path needs to use syzkaller filename type, and ipv4 addr needs to be translated to syzkaller ipv4 type that can generate more meaningful addresses. One option would be to annotate fidl descriptions with semantic information and then use it to generate syzkaller descriptions.

dvyukov commented 6 years ago

@flowerhack @dokyungs FYI

dvyukov commented 5 years ago

@pascallouisperez @mvanotti FYI this may be a relatively low-hanging way of improving fidl testing coverage. But we did not figure out how to attach this meta info to descriptions.

pascallouisperez commented 5 years ago

Using attributes for that might be the most direct, e.g.

struct Ipv4Address {
    [syzkaller = "ipv4"]
    array<uint8>:4 addr;
};

Down the road, we plan to support syntactic extensions such that attributes can be placed in separated files. Hypothetically, this would become a syzkaller.fidl file which lives in a separate hierarchy, and can be conditionally included:

extension Ipv4Address {
    [syzkaller = "ipv4"] addr;
};
dvyukov commented 5 years ago

So everything we need already exists. Nice! I am somewhat worrier about out-of-line annotations, they tend to go our of sync. Also, can you think about any other potential consumers for richer type info? E.g. saying that a string is actually a file name looks pretty generic. If there other potential consumers (user-space fuzzers too?), then it can make sense to consider doing something non-syzkaller-specific, e.g.:

struct Ipv4Address {
    [subtype = "ipv4"]
    array<uint8>:4 addr;
};

But we may also want to start without over-generalizing it. Changing this later should be easy enough too.

pascallouisperez commented 5 years ago

I am somewhat worried about out-of-line annotations, they tend to go our of sync.

Do you mean the extension, or simply using attributes in the first place? Because this meta-data is quite specific, it's hard to motivate having this more first class in the language. We're looking at cap'n proto annotations as inspiration in terms of a 'meta language' to define a schema over meta data.

As for extensions, they'd be compiler checked just like attributes to avoid things getting out-of-sync. It would simply allow having this meta-data concern closer to those concerned (e.g. in this case, the syzkaller build).

Would love to hear your thoughts on this.

But we may also want to start without over-generalizing it. Changing this later should be easy enough too.

That would be my recommendation, start specific, then see if it generalizes.

dvyukov commented 5 years ago

I meant just the "in separated files" part. Say, more fields/types added, most likely they won't have annotations in separate files added.

pascallouisperez commented 5 years ago

I meant just the "in separated files" part. Say, more fields/types added, most likely they won't have annotations in separate files added.

Yes, my assumption is that the library authors aren't the ones providing the annotations, so it's fine to have that separate in this case.