-
**Unable to fine-tune Llama2 70B with FSDP**
I am trying to fine-tune the Llama2 70B model on a dataset. With TP=4 and PP=8 it works fine, but with FSDP on 6 nodes it fails with the error below:
```…
-
### Is there an existing issue for this bug?
- [X] I have searched the existing issues
### Required Troubleshooting Steps
- [X] I have followed these troubleshooting steps
- [X] I have tried both v…
-
https://www.pdai.tech/md/about/me/blog-question.html
-
### Describe the bug
Hello,
I'm using the Fabric8 Kubernetes Client (version 6.12.0) to automatically generate Java classes from a CRD. The generation process successfully creates the individual r…
-
I am trying to communicate from an F# script with an actor that runs in an Akka.NET cluster. The script references a few NuGet packages, i.e.
```
#r "nuget: Akka"
#r "nuget: Akka.Remote"
#r "nug…
-
UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
device: torch.device = torch.device("cpu"),
Models: ['llavamed']
-
### Feature Description
When I wanted to write a new e2e test, I noticed that some of the e2e test code is duplicated in each e2e test.
This is not good practice and might become a pain once we have many tests.…
-
Motivation:
1) Completely overhaul the API/implementation of "FieldCache" type things...
a) eliminate global static map keyed on IndexReader (thus
eliminating synch block between completely …
-
### Work Environment
| Question | Answer
|---------------------------|--------------------
| OS version (server) | Ubuntu 20.04
| TheHive version / git hash | 4.0.0
| Packa…
-
In /data/projects/glygen/generated/datasets/unreviewed/protein_homolog_clusters.csv there appear to be issues with adding these canonical accessions. One example is Q9VEJ1-1;
**…