System Info

transformers version: 4.41.0.dev0
Using distributed or parallel set-up in script?: No
Who can help?
@gante @zucchini-nlp
Information
[X] The official example scripts
[ ] My own modified scripts
Tasks
[ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
[ ] My own task or dataset (give details below)
Reproduction
See #30826
`test_assisted_decoding_matches_greedy_search_0_random` is forcibly skipped for Jamba because `_supports_cache_class` had to be unset to resolve failing tests on main.
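For context, the forced skip looks roughly like this (a minimal sketch; the class name and skip reason are illustrative, not the exact contents of the Jamba test file):

```python
import unittest


class JambaModelTest(unittest.TestCase):
    @unittest.skip(
        "Jamba requires `_supports_cache_class` to be unset, which breaks assisted decoding"
    )
    def test_assisted_decoding_matches_greedy_search_0_random(self):
        # Overridden only to skip; the real test body lives in the shared
        # generation tester mixin.
        ...
```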
`test_assisted_decoding_matches_greedy_search_0_random` appears to pass for Mamba, but only because `all_generative_models` is not set in the model tester, so the test has no generative model classes to exercise and passes vacuously.
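To illustrate why that pass is vacuous, here is a hedged sketch of the tester pattern (the attribute the shared testers actually iterate is named `all_generative_model_classes` in `transformers`; the issue refers to it as `all_generative_models`):

```python
import unittest


class MambaGenerationTestSketch(unittest.TestCase):
    # Never populated for Mamba, so generation tests have nothing to iterate.
    all_generative_models = ()

    def test_assisted_decoding_matches_greedy_search_0_random(self):
        for model_class in self.all_generative_models:
            # With an empty tuple this loop body never executes, so the test
            # "passes" without exercising assisted decoding at all.
            self.fail("never reached")
```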
Expected behavior
Either `test_assisted_decoding_matches_greedy_search_0_random` can be run for both models with `_supports_cache_class` unset, or it should not be necessary to have `_supports_cache_class` unset for Jamba.
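As a quick sanity check (assuming transformers >= 4.40.0 so the Jamba and Mamba classes are importable), the flag in question can be inspected directly:

```python
from transformers import JambaForCausalLM, MambaForCausalLM

# Per this issue, `_supports_cache_class` is currently unset (False) for
# Jamba, which is what forces the skip described above.
print(JambaForCausalLM._supports_cache_class)
print(MambaForCausalLM._supports_cache_class)
```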