-
- [ ] [[Announcement] Generation: Get probabilities for generated output - 🤗Transformers - Hugging Face Forums](https://discuss.huggingface.co/t/announcement-generation-get-probabilities-for-generated…
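The linked announcement concerns recovering per-token probabilities from `generate`. As a minimal pure-Python sketch of the underlying computation (toy logits here; with transformers itself you would pass `output_scores=True, return_dict_in_generate=True` to `generate` and use `model.compute_transition_scores`, so the function name below is purely illustrative):

```python
import math

def transition_log_probs(step_logits, chosen_ids):
    """For each generation step, turn the raw logit row into
    log-probabilities and pick out the log-prob of the token that
    was actually generated. This mirrors what transformers'
    compute_transition_scores returns for greedy decoding."""
    out = []
    for logits, tok in zip(step_logits, chosen_ids):
        # log-softmax via the max-shift trick for numerical stability
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        out.append(logits[tok] - log_z)
    return out
```

With a uniform two-way distribution at one step, the chosen token's log-probability comes out as log(0.5).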
-
Thank you for sharing the awesome work on GitHub.
I want to train LSTR on my custom dataset for offline inference. So I modified the code:
https://github.com/amazon-science/long-short-term-transfo…
-
Hi,
Thank you for your awesome work! I wonder whether there is an API for the text encoder that takes in texts and outputs text features, like https://huggingface.co/docs/transformers/model_doc/owlvit…
-
in load_pretrained_model
model = CambrianLlamaForCausalLM.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 3531, in from_pretrained
) =…
-
Hi there,
Your work looks really awesome!
Could you detail the usage of your implementation?
I would like to use this code to "clean" my input data before training a transformer model.
Tha…
-
Hi, I can't use transformers because Windows won't let me use the '|' character in file/folder names. I'm not sure whether it's a Windows-only issue or it affects other operating systems too, but do you kn…
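Windows does reserve `|` (along with `< > : " / \ ? *`) in file and folder names, while Linux/macOS only forbid `/` and NUL, which is why the problem may not reproduce elsewhere. A hedged workaround sketch (a hypothetical helper, not part of the transformers API) is to sanitize names before writing them to disk:

```python
import re

# Characters Windows forbids in file/folder names, including the '|'
# that triggered the reported error.
_WINDOWS_FORBIDDEN = '<>:"/\\|?*'

def sanitize_filename(name, replacement="_"):
    """Replace characters that are invalid in Windows filenames.
    Illustrative only; real code may also need to handle reserved
    device names like CON or NUL."""
    return re.sub(f"[{re.escape(_WINDOWS_FORBIDDEN)}]", replacement, name)
```

For example, `sanitize_filename("tok|ids.json")` yields `"tok_ids.json"`.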
-
## ❓ Questions and Help
Hi @KaihuaTang, thanks for your awesome work! I am wondering where I can find the details or an illustration of the Transformer model, as I couldn't find it anywhere in this…
-
Hi! I've been trying to port nanoGPT to Rust with dfdx. The `transformer` module is awesome, but it seems an important trick is missing: the attention mask in `TransformerDecoderBlock`. I …
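The trick in question is the causal mask: before the softmax, each query position is forbidden from attending to later positions. A minimal Python sketch of that masking step (toy score matrix; the real block applies this per head on the batched score tensor):

```python
import math

def causal_attention_weights(scores):
    """Apply a causal (lower-triangular) mask to a T x T score matrix,
    then softmax each row. Positions j > i receive -inf before the
    softmax and therefore zero attention weight."""
    T = len(scores)
    weights = []
    for i in range(T):
        row = [scores[i][j] if j <= i else float("-inf") for j in range(T)]
        m = max(row[: i + 1])  # max over the unmasked prefix
        exps = [math.exp(x - m) if x != float("-inf") else 0.0 for x in row]
        s = sum(exps)
        weights.append([e / s for e in exps])
    return weights
```

With uniform scores, row 0 puts all weight on position 0 and row 1 splits evenly over positions 0 and 1, confirming that no weight leaks to future tokens.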
-
Hi - awesome work! I am trying to understand? I couldn't find a paper - only a reference to https://github.com/kingoflolz/mesh-transformer-jax. Is this right? Am I understanding that it is basically …
-
Thanks for your awesome work! There is a small problem: when I fine-tune long_llama with gradient_checkpointing, it raises an error:
![image](https://github.com/CStanKonrad/long_llama/assets/55051961…