-
We are currently using TCP/IP sockets for cross-process communication. That has some problems, e.g. a user's firewall can accidentally block Gradle process communication. Additionally, sharing Gradl…
-
### Checked for duplicates
Yes - I've already checked
It might be similar to https://github.com/NASA-AMMOS/slim/issues/66 but I'm reading that as managing `many workers -> many projects` instead…
-
### Overview
This Test Automation Plan provides a structured approach to automating the Portal using Cypress, an end-to-end testing framework. This plan focuses on creating an efficient, reliable, an…
-
### Describe the bug
I tried to work in multiple tabs simultaneously. Although the chat contexts are stored with separate chat IDs, I found the following issues:
1. The LLM context is not explicitly …
-
Sorry, I have some questions to ask:
1. If I set num_local_experts = 2, does that mean every GPU has two experts, and that both experts' parameters reside on that one GPU?
2. If I set num_local_experts = -2, …
-
Hello! Thank you very much for your excellent work, which enables the distributed running of large models on heterogeneous devices! I was wondering if this project supports Android devices. I am curre…
-
https://dhelix4ai.github.io/dhelix/?
Recent advances in Generative AI, especially in chatbots and text generation, have fueled the rise of LLM training. However, communication overhead from intra-l…
-
We currently need zeroed global memory buffers for cross-CTA communication. Our current executor calls `at::zeros` to initialize these before each launch of our nvfuser kernel, adding a handful of micr…
-
@rokonec suggested this idea during our planning meetings. I'll leave it up to you to fill in this section with what we might be able to expect out of this, and what the path forward is here.
-
### 🐛 Describe the bug
I tried to use the NCCL backend for cross-machine asynchronous P2P communication and found a problem with it. Here is how the communication is executed for each process (and you …