-
As in the title.
-
Same as the title.
-
Hi,
It seems that with Spring Boot 3.3.0 you have removed the single-page variant of the reference documentation, as it now redirects to . Please restore it; it is necessary for
1. fast cross refere…
-
https://ciir-publications.cs.umass.edu/pub/web/getpdf.php?id=1302
-
Training for 50 epochs on CIFAR-10 with
```
OMP_NUM_THREADS=1 python -m torch.distributed.launch --nproc_per_node=1 train.py --num_workers 4 --batch_size 128 --epochs 50
```
and then boosting with…
-
Hi, I started a group of processes to perform allreduce operations. Each process started another thread to call `ncclCommAbort` at a certain point in time.
It is expected that all processes will eventua…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
- `llamafactory` version: 0.8.4.dev0
- Platform: Linux-5.4.0-26-generic-aarch64-with-glibc2.31
- Python…
-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Current Behavior
Deployment succeeded, but fine-tuning fails with the following error:
Traceback (most recent call last):
File "/glm2/code/ChatGLM2-6B-main/pt…
-
I have to calculate the ETA for finishing training often enough that I think it should be a feature.
How about we log the ETA alongside `elapsed time per iteration`?
This is just current `elapsed_ti…
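
A minimal sketch of one way such an ETA could be derived, assuming it is the current elapsed time per iteration multiplied by the number of remaining iterations. The function name `estimate_eta` and its parameters are hypothetical and are not taken from the project's code:

```python
import datetime


def estimate_eta(elapsed_time_per_iteration: float,
                 iteration: int,
                 total_iterations: int) -> datetime.timedelta:
    """Estimate the remaining training time.

    Assumption: the most recent per-iteration time (in seconds) is
    representative of all remaining iterations.
    """
    # Clamp at zero so a finished run never reports a negative ETA.
    remaining_iterations = max(total_iterations - iteration, 0)
    remaining_seconds = elapsed_time_per_iteration * remaining_iterations
    return datetime.timedelta(seconds=round(remaining_seconds))


# Example: 0.5 s per iteration, 100 of 1100 iterations done.
eta = estimate_eta(0.5, iteration=100, total_iterations=1100)
print(f"ETA: {eta}")
```

Because a single iteration time can be noisy, averaging over the logging interval would probably give a steadier estimate.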
-
I'm creating this example to document a simple test that creates and accesses a simple dataset in Parallel HDF5, under Unify.
The original example is taken from the HDF5 website: https://support.h…