guidance-ai / guidance

A guidance language for controlling large language models.

[Bug] Passing through kwargs to `TransformersTokenizer` #922

Closed · riedgar-ms closed this 3 months ago

riedgar-ms commented 3 months ago

This is a fix for #909. It ensures that any argument common to both `AutoModelForCausalLM.from_pretrained()` and `AutoTokenizer.from_pretrained()` will be passed from the former to the latter when a `TransformersModel` is created. The original issue was only about `trust_remote_code`, but there are a few other arguments related to accessing Hugging Face which probably ought to be forwarded as well.
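
For illustration, the forwarding idea looks roughly like the following minimal sketch. The helper name and the exact allow-list are assumptions for this sketch, not the code in this PR; only `trust_remote_code` is confirmed by issue #909, and the others are typical Hugging Face access options:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical allow-list of kwargs accepted by both from_pretrained() calls.
# trust_remote_code comes from issue #909; the rest are assumptions here.
_SHARED_KWARGS = {"trust_remote_code", "revision", "token", "cache_dir", "proxies"}


def load_model_and_tokenizer(name: str, **kwargs):
    """Load the model, then forward only the shared kwargs to the tokenizer."""
    model = AutoModelForCausalLM.from_pretrained(name, **kwargs)
    tokenizer_kwargs = {k: v for k, v in kwargs.items() if k in _SHARED_KWARGS}
    tokenizer = AutoTokenizer.from_pretrained(name, **tokenizer_kwargs)
    return model, tokenizer
```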

If there are LLM-specific arguments which need to be shared, then users should instantiate the tokenizer separately and pass it in, as in the sketch below.
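
Something along these lines, assuming `models.Transformers` accepts a preloaded model and tokenizer (the model name is a placeholder):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from guidance import models

# Placeholder model name; trust_remote_code stands in for any
# argument that needs to reach both the model and the tokenizer.
tokenizer = AutoTokenizer.from_pretrained("org/model", trust_remote_code=True)
llm = AutoModelForCausalLM.from_pretrained("org/model", trust_remote_code=True)

lm = models.Transformers(llm, tokenizer=tokenizer)
```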

codecov-commenter commented 3 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 58.98%. Comparing base (567174a) to head (0bcab89).


Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #922      +/-   ##
==========================================
+ Coverage   57.08%   58.98%   +1.90%
==========================================
  Files          64       64
  Lines        4705     4711       +6
==========================================
+ Hits         2686     2779      +93
+ Misses       2019     1932      -87
```


Harsha-Nori commented 3 months ago

LGTM :) Thanks for investigating and finding the common kwargs, Richard!

hudson-ai commented 3 months ago

Looks good to me too :)