-
### Feature request
It would be nice to combine the benefits of flex attention and 4d masking.
Perhaps the llama model could be a first case, allowing arbitrary 4d masks to be handled via an effic…
-
Hi, thanks for your wonderful work on this framework!
I want to develop my own model based on my own idea. Previously, I usually used an IDE like PyCharm (where I could debug line by line) to help me fin…
-
### Describe the bug
XFormer fails when it is passed an attention mask whose last dimension (i.e. the key's sequence length) is not a multiple of 8 under bfloat16. This seems to be because xformer ne…
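
The alignment condition in this report can be checked before calling the attention kernel. Below is a minimal sketch in plain Python, assuming the requirement is exactly what the report states (the mask's last dimension must be a multiple of 8). The helper names `is_aligned` and `padded_len` are hypothetical and are not part of XFormer.

```python
def is_aligned(key_len: int, multiple: int = 8) -> bool:
    """Return True if the key sequence length is a multiple of `multiple`."""
    return key_len % multiple == 0


def padded_len(key_len: int, multiple: int = 8) -> int:
    """Round key_len up to the next multiple of `multiple` (ceiling division)."""
    return -(-key_len // multiple) * multiple


# Hypothetical key lengths, chosen only for illustration.
for n in (13, 16):
    print(n, is_aligned(n), padded_len(n))
```

A caller could use `padded_len` to decide how wide a mask buffer to allocate, but whether padding is an acceptable workaround here is an assumption, not something the report confirms.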
-
### Please check if a similar issue has already been reported.
- [X] I checked this type of issue has never been reported.
### Please check you're using proper versions.
- [X] I checked all o…
-
### Library name
Azure.Storage.Blob
### Please describe the feature.
To export the ACLs of blobs from ADLS efficiently, we would like to use the List Blobs API to retrieve them in batches.
https://learn.microsoft.com/…
-
### Project Name
Talk to Your Docs with AI!
### Description
# ☕️ Chat with AI (and optionally your document)
This Streamlit application allows users to chat with AI and optionally upload d…
-
### Project Name
WebsiteGPT: AI-Powered Website Docs
### Description
AI-powered web application that revolutionizes how businesses interact with their website content and documentation. By leverag…
-
**What would you like to be added**:
Right now we can download model weights from the model hub directly, but each time we start/restart a pod, it downloads the model weights again. Without …
-
In our MPI implementation, node A has to call `comm.send` and node B has to call `comm.recv` for a message to be successfully communicated from node A to node B. In contrast, gRPC only requires calling…
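
The paired send/receive pattern described for MPI can be illustrated with a toy model. This is only a sketch built on Python's standard-library `queue` and `threading` modules. It is not MPI, gRPC, or this project's API, and the `Channel` class and `run_pair` function are hypothetical names introduced for illustration.

```python
import queue
import threading


class Channel:
    """Toy one-way link from node A to node B (hypothetical, not MPI)."""

    def __init__(self):
        self._q = queue.Queue()

    def send(self, msg):
        # Node A's half of the exchange: hand the message to the link.
        self._q.put(msg)

    def recv(self, timeout=1.0):
        # Node B's half: the message only arrives if B explicitly asks for it.
        return self._q.get(timeout=timeout)


def run_pair(payload):
    """Run node A (sender) and node B (receiver) as two threads."""
    link = Channel()
    received = []

    def node_a():
        link.send(payload)

    def node_b():
        received.append(link.recv())

    threads = [threading.Thread(target=node_a), threading.Thread(target=node_b)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received


print(run_pair({"step": 1}))
```

In this model the message reaches node B only because both halves of the pair are called. If `node_b` never called `recv`, the payload would stay in the queue undelivered.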
-
That does not require duplicating the order of the Taylor series.