google-research / lm-extraction-benchmark

Apache License 2.0

Restrictions on usage of model #7

Closed akul-goyal closed 2 years ago

akul-goyal commented 2 years ago

Hi,

Just wanted to confirm that there are no restrictions on how we query the model, e.g., a rule that the model can only return its confidence for the top n predicted tokens.

heitikei commented 2 years ago

@akul-goyal @pluskid There are unnecessary restrictions on the query .... creating unnecessary tokens. It is the first step towards a universal grammar for query programming.

Disable the following (and tidy upstream coding):

https://github.com/google-research/lm-extraction-benchmark/blob/a36cbfab1f7719e71667aedc7fbf7f27b57ab823/baseline/simple_baseline.py#L69

I remain interested. Choosing to organise your dataset before you design the query constrains all possible outcomes to the defining set. The original query will therefore be unable to adapt to changing data arrays.

carlini commented 2 years ago

Correct; you can query however you like.

@heitikei I'm sorry, I don't understand what you're trying to say. Can you please raise a new issue if there's some problem?
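
For readers unsure what the original question is contrasting, here is a minimal sketch of the difference between unrestricted access to the model's next-token distribution and a hypothetical "top n only" restriction. The function names and the toy logits are assumptions made up for illustration; nothing here is taken from the benchmark's code or from any real model.

```python
import math

def log_softmax(logits):
    """Convert raw logits to log-probabilities in a numerically stable way."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def full_access_logprob(logits, token_id):
    """Unrestricted access: the log-probability of any token in the vocabulary."""
    return log_softmax(logits)[token_id]

def top_n_access_logprob(logits, token_id, n):
    """Hypothetical restricted access: a value only if the token is in the top n."""
    logprobs = log_softmax(logits)
    top_ids = sorted(range(len(logprobs)), key=lambda i: logprobs[i], reverse=True)[:n]
    return logprobs[token_id] if token_id in top_ids else None

# Hypothetical logits over a 5-token toy vocabulary (not from a real model).
toy_logits = [2.0, 0.5, -1.0, 3.0, 0.0]

# Token 2 is the least likely token: visible with full access,
# hidden when only the top 2 tokens are reported.
print(full_access_logprob(toy_logits, 2))
print(top_n_access_logprob(toy_logits, 2, n=2))  # prints None
```

Per the answer above, the benchmark imposes no restriction of the second kind, so the full distribution can be used.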