-
**Description**
I'm encountering an error when installing ludwig[distributed] in a Jupyter Notebook environment running on a Dataproc cluster. The installation seems to proceed normally until it atte…
-
### System Info
```shell
Image: vault.habana.ai/gaudi-docker/1.17.1/ubuntu22.04/habanalabs/pytorch-installer-2.3.1:latest
hardware: Habana Labs Gaudi HL205 Mezzanine Card with HL-2000 AI Training …
```
-
#### Describe the bug
It seems to be impossible to set values in the mimir-distributed chart that control the memory and CPU requests of the various memcached caches that Mimir uses (chunks, results, etc…
-
### Your current environment
GPU : H100 80G *2
Model : Llama 3.1 70B
Model Params:
~~~
env:
- name: MODEL_NAME
value: /mnt/models/models--meta-llama--llama-3-1-70b-i…
~~~
-
Hi team,
Sharing some observations about potential bugs below:
- https://github.com/aws-neuron/neuronx-distributed/blob/4f954715f39b2cc9e628ded79274957401bea086/examples/inference/llama2/neuron_…
-
### Problem:
If two people want to crack the same data, they have to do it individually. That's a lot of duplicated effort.
### Solution:
Kademlia DHT for ciphertext-plaintext pairs, with a privacy argument …
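
To make the lookup idea concrete, here is a minimal sketch of how a Kademlia-style DHT could decide which nodes hold a given ciphertext-plaintext pair. It only illustrates Kademlia's XOR distance metric. Deriving the key by hashing the ciphertext with SHA-1, and the names `key_for`, `xor_distance` and `closest_nodes`, are assumptions for illustration and not part of this proposal.

```python
import hashlib

ID_BITS = 160  # classic Kademlia uses 160-bit IDs (the size of a SHA-1 digest)


def key_for(ciphertext: bytes) -> int:
    """Map a ciphertext to a DHT key.

    Assumption: the key is the SHA-1 hash of the ciphertext.
    """
    return int.from_bytes(hashlib.sha1(ciphertext).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia's distance between two IDs is their bitwise XOR."""
    return a ^ b


def closest_nodes(key: int, node_ids: list[int], k: int = 3) -> list[int]:
    """Return the k node IDs closest to the key under XOR distance."""
    return sorted(node_ids, key=lambda node_id: xor_distance(node_id, key))[:k]
```

A client that has cracked a ciphertext would store the pair on the nodes returned by `closest_nodes(key_for(ciphertext), known_nodes)`, and anyone holding the same ciphertext would query the same nodes.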
-
A way to keep the cache synced across all running instances of the portal.
This could probably be implemented with Memcache.
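
One way to picture this is a cache-aside pattern in which every portal instance reads and writes through a single shared store, so no instance keeps a private copy that can drift. The sketch below uses an in-memory `SharedStore` class as a stand-in for a Memcache server. `SharedStore`, `PortalInstance` and the `loader` callback are hypothetical names, not existing portal code.

```python
class SharedStore:
    """In-memory stand-in for a shared cache server such as Memcache."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        # Returns None on a miss, like most cache clients.
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class PortalInstance:
    """A portal instance that keeps no private cache, only the shared one."""

    def __init__(self, store):
        self.store = store

    def read(self, key, loader):
        # Cache-aside: try the shared store first, fall back to the loader.
        # Note: a stored value of None is treated as a miss.
        value = self.store.get(key)
        if value is None:
            value = loader(key)
            self.store.set(key, value)
        return value

    def write(self, key, value):
        # Writing through the shared store makes the change visible to all instances.
        self.store.set(key, value)
```

With two instances created over the same `SharedStore`, a `write` on one is immediately visible to a `read` on the other, which is the property the request asks for.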
-
I would appreciate any insights or thoughts about this before we or someone else contributes an implementation.
### Feature Request
Support query plan cache key support for distributed cache impleme…
-
Currently, when a new microbenchmark is added on master, it will not be part of the weekly run until the microbenchmark is on both a release version and master. The reason for this is the weekly job…
-
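After many attempts, the installation succeeded. Environment: win10, python3.111, torch2.4.1, cuda12.4
***Use CMD***
PowerShell fails; the reason is unclear.
Clone the repository locally, then open cmd and change into the repository directory.
Run
git checkout apex_no_distributed
Run
pip install -v --no-cache-dir ./
The installation finally succeeded.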