-
[Balancing Pipeline Parallelism with Vocabulary Parallelism](https://arxiv.org/abs/2411.05288) introduces a way to handle vocabulary scaling together with PP. While context parallelism splits the sequ…
-
We know that `Transformer_Engine` has support for FP8 training with `data parallel + tensor parallel + sequence parallel`, https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/examples/a…
-
One todo would be to investigate adding parallelism to secret detection because right now it's really slow on real-life binlogs, even moderately sized. Takes over two minutes on a 5 MB binlog I use of…
-
I've noticed that when passing `-j8` to weeder, it often still uses only one core. See this graph from threadscope:
![image](https://github.com/user-attachments/assets/9997e263-af4c-43e0-891d-6538f12…
-
`image` only provides the `rayon` compile-time feature for controlling parallelism. There are no runtime controls exposed, which means there isn't an obvious way to control things like:
1. Whether …
-
See https://github.com/orgs/scipp/discussions/3595#discussioncomment-11222220
Investigate possible performance benefits of using multiprocessing to speed up the `scipp.curve_fit` routine.
Questi…
-
I want to use tensor parallelism with ouroboros, but I cannot find the config option that enables it. Can you give me an example?
-
I want to use tensor parallelism with CS-drafting, but I cannot find the config option that enables it. Can you give me an example?
-
## 🚀 Feature Request
Supporting TP and SP seems quite easy to do with the `replication` parameter:
```
replication = tp * sp
```
I have tried various ways to enable PP without success (unexp…
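
For illustration, here is a minimal sketch of how the `replication = tp * sp` relation could map ranks to data shards. It assumes that `replication` means "the number of consecutive ranks that must receive the same batch" and that ranks are laid out so each block of `tp * sp` consecutive ranks forms one model replica. The function name `data_shard_for_rank` and this rank layout are hypothetical and not taken from the library:

```python
def data_shard_for_rank(rank: int, world_size: int, tp: int, sp: int) -> tuple[int, int]:
    """Return (shard_index, num_shards) for a global rank.

    Assumption (hypothetical layout): each block of tp * sp consecutive
    ranks forms one model replica, so all ranks in a block read the same shard.
    """
    replication = tp * sp
    if world_size % replication != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by tp * sp = {replication}"
        )
    num_shards = world_size // replication  # effective data-parallel size
    shard_index = rank // replication       # same value for every rank in a replica
    return shard_index, num_shards


if __name__ == "__main__":
    # 8 ranks with tp=2 and sp=2: two replicas of four ranks each.
    for rank in range(8):
        print(rank, data_shard_for_rank(rank, world_size=8, tp=2, sp=2))
```

Under this layout, adding PP would need one more factor in the grouping, and how that interacts with the real `replication` parameter is not something this sketch covers.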
-
I want to use tensor parallelism with your work, but I do not find the config to start the tensor parallel, can you give me an example?