Closed eaidova closed 3 months ago
This is a temporary workaround until we can use https://github.com/huggingface/optimum/pull/1825.
The Falcon dummy input generator cannot handle the falcon-40b model (the case with new_decoder_architecture combined with multi_query).
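For context, here is a minimal sketch of why this combination is tricky. It is not the code from this PR or from optimum. It assumes the number of key/value heads follows the rule used in the transformers Falcon attention module (config.num_kv_heads when new_decoder_architecture is set or multi_query is off, otherwise a single shared head). FalconLikeConfig, expected_kv_heads, dummy_past_kv_shape and the (batch, kv_heads, past_len, head_dim) layout are hypothetical names and assumptions made for illustration, and the numbers in the usage line are illustrative too.

```python
from dataclasses import dataclass


@dataclass
class FalconLikeConfig:
    """Hypothetical stand-in holding only the config fields relevant here."""
    num_attention_heads: int
    num_kv_heads: int
    hidden_size: int
    new_decoder_architecture: bool = False
    multi_query: bool = True


def expected_kv_heads(cfg: FalconLikeConfig) -> int:
    # Assumed rule: multi_query collapses K/V to one shared head only when
    # the new decoder architecture is NOT used. With new_decoder_architecture
    # the grouped num_kv_heads value wins, even if multi_query is also True.
    if cfg.new_decoder_architecture or not cfg.multi_query:
        return cfg.num_kv_heads
    return 1


def dummy_past_kv_shape(cfg: FalconLikeConfig, batch_size: int, past_len: int):
    # Assumed layout for one dummy past key (or value) tensor.
    head_dim = cfg.hidden_size // cfg.num_attention_heads
    return (batch_size, expected_kv_heads(cfg), past_len, head_dim)


# Illustrative values for a config with both flags enabled.
cfg = FalconLikeConfig(128, 8, 8192, new_decoder_architecture=True, multi_query=True)
print(dummy_past_kv_shape(cfg, batch_size=2, past_len=16))
```

A generator that only checks multi_query would produce a single KV head for such a config, and the dummy past key values would then not match what the model expects.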
@echarlaix could you please take a look?