We should limit the number of fields returned. If a key is actually data, e.g. {"<some_id>": true}, we probably should not return it. Ideally, we would detect which fields are relevant and which are not.
+1 on scoring the fields and returning the top K. The best signals would probably be:
- whether the field is in the schema or not
- if we can grab that info, how many docs have the field
- field name length
- whether the field name contains weird characters, etc.
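
To make the scoring idea concrete, here is a rough sketch of how the signals listed above could be combined into a score and a top-K cut. Everything in it is an assumption for illustration: the `FieldStats` type, the `score_field` and `top_k_fields` helpers, the weights, the length threshold, and the regex for "weird characters" are all hypothetical and not part of any existing API.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical weights and thresholds; none of these values come from the discussion.
SCHEMA_BONUS = 10.0
COVERAGE_WEIGHT = 5.0
LONG_NAME_PENALTY = 2.0
WEIRD_NAME_PENALTY = 3.0
MAX_NAME_LEN = 32

# Assumed definition of a "plain" field name: letters, digits, underscores, dots.
PLAIN_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_.]*$")


@dataclass
class FieldStats:
    name: str
    in_schema: bool
    doc_count: Optional[int] = None  # None when the doc count is not available


def score_field(field: FieldStats, total_docs: int) -> float:
    """Combine the four signals into a single score (higher is more relevant)."""
    score = 0.0
    if field.in_schema:
        score += SCHEMA_BONUS
    # Only use coverage when the doc count could actually be retrieved.
    if field.doc_count is not None and total_docs > 0:
        score += COVERAGE_WEIGHT * field.doc_count / total_docs
    if len(field.name) > MAX_NAME_LEN:
        score -= LONG_NAME_PENALTY
    if not PLAIN_NAME.match(field.name):
        score -= WEIRD_NAME_PENALTY
    return score


def top_k_fields(fields: List[FieldStats], total_docs: int, k: int) -> List[str]:
    """Return the names of the k highest-scoring fields."""
    ranked = sorted(fields, key=lambda f: score_field(f, total_docs), reverse=True)
    return [f.name for f in ranked[:k]]
```

With this sketch, a schema field present in every doc ranks well above a UUID-like key that appears in a single doc, which is the `{"<some_id>": true}` case we want to filter out. The actual weights would need tuning on real indexes.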
Quoting @PSeitz and @fulmicoton: