EleutherAI / sae

Sparse autoencoders

Default command errors because SDPA not supported for Pythia #1

Closed rhaps0dy closed 2 months ago

rhaps0dy commented 3 months ago

Running the README command python -m sae EleutherAI/pythia-160m togethercomputer/RedPajama-Data-1T-Sample gives the following error:

ValueError: GPTNeoXForCausalLM does not support an attention implementation through torch.nn.functional.scaled_dot_product_attention yet. Please request the support for this architecture: https://github.com/huggingface/transformers/issues/28005. If you believe this error is a bug, please open an issue in Transformers GitHub repository and load your model with the argument attn_implementation="eager" meanwhile. Example: model = AutoModel.from_pretrained("openai/whisper-tiny", attn_implementation="eager")

The reason is that the model is loaded with attn_implementation="sdpa" hard-coded. Would a PR to make this configurable be welcome?
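
For reference, a minimal sketch of the workaround the error message itself suggests, assuming you only need the model load to go through: pass attn_implementation="eager" when loading the Pythia model with transformers. (Where exactly the sae code would need to expose this is an assumption here, not something the repo currently supports.)

```python
# Sketch of the workaround suggested by the transformers error message:
# load Pythia (GPTNeoX) with the "eager" attention implementation instead
# of SDPA. The call site inside `sae` that would need this option is an
# assumption; this only shows the transformers-level fix.
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/pythia-160m",
    attn_implementation="eager",  # avoids the SDPA ValueError for GPTNeoXForCausalLM
)
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-160m")
```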

laerdon commented 3 months ago

Running into this issue as well.

vasqu commented 2 months ago

Working on SDPA support over here https://github.com/huggingface/transformers/pull/31031

Should I notify you guys if it has been merged?

norabelrose commented 2 months ago

> Working on SDPA support over here huggingface/transformers#31031
>
> Should I notify you guys if it has been merged?

Yeah, that would be nice, thanks.

vasqu commented 2 months ago

@norabelrose It has now been added in transformers v4.42.0.
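
For anyone landing here later, a small sketch of what should now work, assuming transformers >= 4.42.0 is installed: GPTNeoX / Pythia can be loaded with SDPA directly, so the README command should no longer hit the ValueError.

```python
# Sketch: with transformers >= 4.42.0, GPTNeoX (Pythia) supports SDPA,
# so loading with attn_implementation="sdpa" no longer raises the error.
import transformers
from packaging import version
from transformers import AutoModelForCausalLM

assert version.parse(transformers.__version__) >= version.parse("4.42.0"), \
    "SDPA support for GPTNeoX landed in transformers v4.42.0"

model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/pythia-160m",
    attn_implementation="sdpa",  # previously raised ValueError on older releases
)
```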