-
### System Info
compute_environment: LOCAL_MACHINE
deepspeed_config: {}
distributed_type: MULTI_GPU
fsdp_config: {}
machine_rank: 0
main_process_ip: null
main_process_port: null
main_trainin…
-
The E2E test is down. The reason is straightforward: the server reports a 503 error. I did some checking and noticed that this is already being tracked in the torch community.
As the patch is only available on master and ther…
-
Hi. I was wondering how to get the confusion matrix during training and save the results (without using test.py). Here is my code to train DetectoRS on a custom dataset:
````
# Applying Dete…
-
### Version
1.31.0
### Describe the bug.
When adding `rand_augment`, the program crashes.
### Minimum reproducible example
```python
import os
import os.path as osp
import lmdb
import dill
import n…
-
Hello,
I've set up a Conda environment, and here are the version details of some important libraries (they differ from the versions specified in the ReadMe file):
- Python 3.8.12
- PyTorch 1.13.1 (…
-
After changing the tensorboard log directory, an error is raised when the fine-tuning step reaches save_step: FileNotFoundError: [Errno 2] No such file or directory: '/app/work_dirs/chatglm2_6b_qlora_lawyer_e3_copy/20240514_035914/vis_data/eval_outputs_iter_499.txt'.…
-
**Describe the bug**
I was trying to use tsfresh and began by installing it with the `pip install tsfresh` command. While the command was running, I got an error installing matrixprofile. I really would love…
-
# VDS SWMR - Ok for Round Robin
It looks like the new HDF5 1.10 virtual dataset feature works like this:
1. Define the mapping source datasets and the view datasets in the dataset creation propert…
-
Hello all,
I've been trying to combine dask-ml's tools in the most vanilla way that I can think of and they don't seem to fit together (no pun intended).
Specifically, I want to both train model…
-
**I was able to run `pip install --editable .` and all the steps before it successfully**
I get an error on this step:
(mmf) C:\Users\user\Desktop\naveen\workspace\mmf>pytest .\tests
======…