Closed guangy10 closed 2 days ago
Feature request

Be able to construct and load a model like:

```python
model = AutoModelForCausalLM.from_pretrained(
    hf_model_repo,
    attn_implementation="sdpa",
    generation_config=GenerationConfig(
        use_cache=True,
        cache_implementation=cache_implementation,
        max_length=max_cache_len,
        cache_config={
            "batch_size": batch_size,
            "max_cache_len": max_cache_len,
        },
    ),
)
```
See additional context in #32253
Motivation

This feature request is to support torch.export(), and ensure the model is exportable in a way that can be further lowered and run in ExecuTorch with performance out-of-the-box.

Your contribution

TBD
PR is published: https://github.com/huggingface/transformers/pull/32830