bigcode-project / bigcode-evaluation-harness

A framework for the evaluation of autoregressive code generation language models.
Apache License 2.0

Adding additional optional args for decoding flags and AutoModel kwargs to support models like ReplitLM #115

Open madhavatreplit opened 1 year ago

madhavatreplit commented 1 year ago

Why

We require the ability to configure the tokenizer.decode call, as well as the model kwargs passed to AutoModelForCausalLM.from_pretrained, to support models like ReplitLM.
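For context, a minimal sketch of the decode behaviour in question (the checkpoint name and sample string are illustrative, not taken from this PR):

```python
from transformers import AutoTokenizer

# Illustrative: the ReplitLM tokenizer encodes whitespace as meaningful tokens,
# so the default decode-time clean-up can alter spacing in generated code.
tokenizer = AutoTokenizer.from_pretrained(
    "replit/replit-code-v1-3b", trust_remote_code=True
)

ids = tokenizer("def add(a, b):\n    return a + b")["input_ids"]

# Default behaviour: clean-up may move or collapse spaces around punctuation.
print(tokenizer.decode(ids, clean_up_tokenization_spaces=True))

# Disabling clean-up keeps the decoded text faithful to the generated tokens.
print(tokenizer.decode(ids, clean_up_tokenization_spaces=False))
```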

What changed

We add two input arguments with safe default behaviour to the main.py script:

  1. clean_up_tokenization_spaces : bool

    • this boolean flag is passed to tokenizer.decode to control whether tokenization spaces are cleaned up. The clean-up affects spacing, and therefore syntax, in generated code with certain tokenizers such as the ReplitLM tokenizer.
    • defaults to True; passing the flag stores False
  2. automodel_kwargs : a "stringified" JSON, parsed with json.loads

    • a JSON string specifying which default config values should be overridden in this harness to reproduce results.
    • the parsed key-value pairs are passed into AutoModelForCausalLM.from_pretrained as kwargs, where they update the default init config; see the transformers documentation for why and how this works. A wiring sketch follows this list.
    • defaults to the empty stringified JSON: "{}".
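A minimal sketch of how the two arguments might be wired into main.py, assuming the names above (the checkpoint and override key are illustrative placeholders, not from this PR's diff):

```python
import argparse
import json

from transformers import AutoModelForCausalLM, AutoTokenizer

parser = argparse.ArgumentParser()
# Defaults to True; passing --clean_up_tokenization_spaces stores False.
parser.add_argument(
    "--clean_up_tokenization_spaces",
    action="store_false",
    help="Disable clean-up of tokenization spaces in tokenizer.decode",
)
# argparse applies `type` to string defaults, so "{}" parses to an empty dict.
parser.add_argument(
    "--automodel_kwargs",
    type=json.loads,
    default="{}",
    help="Stringified JSON of kwargs forwarded to AutoModelForCausalLM.from_pretrained",
)
args = parser.parse_args()

# The parsed dict overrides default config values at model init
# (from_pretrained forwards unrecognized kwargs to the model config).
model = AutoModelForCausalLM.from_pretrained(
    "replit/replit-code-v1-3b",  # illustrative checkpoint
    trust_remote_code=True,
    **args.automodel_kwargs,
)
tokenizer = AutoTokenizer.from_pretrained(
    "replit/replit-code-v1-3b", trust_remote_code=True
)

# At post-processing time, the flag is threaded through to decode:
# text = tokenizer.decode(
#     token_ids, clean_up_tokenization_spaces=args.clean_up_tokenization_spaces
# )
```

Illustrative invocation: `python main.py --clean_up_tokenization_spaces --automodel_kwargs '{"init_device": "cuda"}'` (the override key is a placeholder).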

Rollout

[x] This is fully backward and forward compatible