-
- [x] Formalize the model
- [ ] Lowering separates `vertical regions` into `MultiStages`
- [ ] Lowering separates `statements` into `Stages`
- [ ] Rework Stage Splitting Pass (now pure optimisation)
…
-
### 🚀 The feature, motivation and pitch
It is common to have a scenario where folks want to deploy multiple vLLM instances on a single machine because the machine has several GPUs (commonly 8). …
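One common way to do this today is to pin each server process to a subset of the GPUs and give it its own port; a minimal sketch, assuming the `vllm serve` CLI is available (model name and port numbers are illustrative):

```shell
# Pin each vLLM server to one GPU and a distinct port (illustrative values).
CUDA_VISIBLE_DEVICES=0 vllm serve meta-llama/Llama-3.1-8B-Instruct --port 8000 &
CUDA_VISIBLE_DEVICES=1 vllm serve meta-llama/Llama-3.1-8B-Instruct --port 8001 &
# ...repeat for GPUs 2-7, then load-balance requests across the ports.
```

This works, but each instance is managed by hand, which is part of the motivation for first-class multi-instance support.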
-
### Problem description
I tested with a fixed seed. To confirm the seed really was fixed, I first ran the multi-GPU script repeatedly and verified that the generated image was identical on every run.
Under that condition, the images generated with different GPU counts are:
| run | image |
|-----|-------|
| flux_result_dp1_cfg1_ulysses1_…
-
### Your current environment
The output of `python collect_env.py`
```text
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N…
-
#### What happened
When using LiteLLM with Redis caching enabled and making parallel calls, incorrect trace_ids are being sent to Langfuse, despite langfuse_context.get_current_trace_id() returning…
-
### System Info
```shell
using Huggingface AMI from AWS marketplace with Ubuntu 22.04
optimum-neuron 0.0.25
transformers 4.45.2
peft 0.13.0
trl 0.11.4
accelerate 0.29.2
torch 2.1.2
```
…
-
**Describe the bug**
I am trying to convert the default `mamba.nemo` file (I converted the .pt [from huggingface](https://huggingface.co/nvidia/mamba2-8b-3t-4k/tree/main) to .nemo) to have `tensor_parall…
-
### 🚀 The feature, motivation and pitch
For transformer architecture (for example https://github.com/pytorch-labs/gpt-fast/blob/main/model.py#L195-L211) it tends to be most performant to merge the qk…
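The fused projection is algebraically identical to the three separate ones: stacking the q, k, v weight matrices turns three small GEMMs into one larger GEMM, and the output is split afterwards. A minimal numpy sketch (dimensions and names are illustrative):

```python
import numpy as np

# Illustrative dimensions.
d_model, n_tokens = 16, 4
rng = np.random.default_rng(0)

# Three separate projection matrices, as in an unfused attention block.
W_q = rng.standard_normal((d_model, d_model))
W_k = rng.standard_normal((d_model, d_model))
W_v = rng.standard_normal((d_model, d_model))

x = rng.standard_normal((n_tokens, d_model))

# Unfused: three matmuls.
q, k, v = x @ W_q.T, x @ W_k.T, x @ W_v.T

# Fused: stack the weights once, run a single matmul, then split the output.
W_qkv = np.concatenate([W_q, W_k, W_v], axis=0)  # shape (3*d_model, d_model)
qkv = x @ W_qkv.T                                # one larger GEMM
q2, k2, v2 = np.split(qkv, 3, axis=-1)

assert np.allclose(q, q2) and np.allclose(k, k2) and np.allclose(v, v2)
```

The single large GEMM tends to saturate the hardware better than three small ones, which is why merged qkv (as in the linked gpt-fast model) is usually faster.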
-
**Why is it that when using a quantized model for inference, the improvement in TTFT is not obvious, while overall inference throughput improves a lot? At the same time, the inference efficiency…
-
First of all, thank you for your amazing work on the nnScaler project. It has been incredibly inspiring, and I’ve been learning from and using the contents of this repository in my own work.
I have a fe…