-
Hey there,
Thank you for making this implementation of the selective attention paper. Aside from the simple integration in x-transformers, I think this is the first public replication of that paper…
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussion…
-
**Describe the feature**
Could be:
- iCloud-style sync that archives your data somewhere, encrypted
- AI pruning that keeps only high-level abstractions of your data (think ChatGPT memories)
…
-
**Is your feature request related to a problem? Please describe.**
We should provide an interface for structural pruning methods, such as N:M
pruning based on weight magnitude, or methods like Wanda,…
-
# Feature Stage
- [ ] Feature released
- [ ] Tracking feedback
- [ ] Installation API (HELM) - TBD
- [ ] Enabled By Default - TBD
# Getting help
If you are having an issue with the feature…
-
Hey,
thanks for sharing your exciting work!
I have a question about a minor detail in the memory pruning logic.
As far as I understand, the weights for finding relevant features are masked t…
-
Nodes can currently run with a partial history (by loading from a checkpoint), but it's not a fully fledged feature.
* Add an API to query which blocks a node has available (a rough sketch follows this list)
* Make sure RPCs fail grace…
-
Currently, each conversation turn includes the full chat history, increasing token usage. I propose omitting intermediate tool-related messages, replacing them with placeholders like "DELETED FOR CONV…
-
**Step 2**: Prune the models (SECOND as example, on 8 GPUs):
**How long does it take you to make one pass over the dataset during the pruning iteration process?**
-
## Background
**[Neural Sparse](https://opensearch.org/docs/latest/search-plugins/neural-sparse-search/)** is a semantic search method built on the native Lucene inverted index. The documents…