-
Currently the hostfile updater runs every 60 seconds (or some other fixed period), which is a poor design because touching the hostfile clears the local DNS caches. Hence, we would like to make it update on…
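
Since the concern above is that touching the hostfile clears DNS caches, one possible mitigation is to skip the write whenever the generated content is unchanged. The following is only a minimal sketch of that idea, not the project's actual updater: the function name `write_if_changed` and the way the new content is produced are assumptions for illustration.

```python
def write_if_changed(path: str, new_content: str) -> bool:
    """Rewrite `path` only when its content differs.

    Returns True if the file was written, False if it was left untouched.
    (Hypothetical helper; not part of any existing updater.)
    """
    try:
        with open(path, encoding="utf-8") as f:
            if f.read() == new_content:
                # Identical content: do not touch the file, so its
                # modification time stays the same.
                return False
    except FileNotFoundError:
        pass  # No file yet: fall through and create it.

    with open(path, "w", encoding="utf-8") as f:
        f.write(new_content)
    return True
```

A periodic job could still call this every 60 seconds; the file would then only be modified when an entry was actually added, removed, or changed.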
-
**Describe the bug**
This issue occurs on a SLURM cluster where worker nodes equipped with multiple GPUs are shared among users. GPUs are given slot number assignments (for example, on a node wit…
-
I have used the following code for an SSH connection to an HP ProCurve switch:
```
import paramiko

def readAndConnect(user, password, tftpRoot, tftp, hostFile, continueBannerSwitchListExist="no"):
    #open switch…
```
-
Hi everyone. After fine-tuning VisualGLM on the openi dataset (about 6,500 samples), I ran into the following when testing the model's inference ability:
![c413986eb8372045aa10b401b63884c](https://github.com/THUDM/VisualGLM-6B/assets/56297762/5ea46e26-b074-43b8-ac01-c2a3935a0bac)
Is…
-
### Describe the issue
Issue:
We collected a large-scale instruction dataset and want to use multi-node training. With the following script, training is too slow and no timing information is logged.
…
-
We want users to be able to view their data by going to codeclimbers.local.
Add an entry to their hostfile, if it doesn't already exist, when the user runs codeclimbers start, that adds the entry…
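
The "add only if it doesn't already exist" behavior described above could be sketched roughly as follows. This is an illustrative sketch, not the project's implementation: the function name `ensure_host_entry` is hypothetical, and mapping the name to `127.0.0.1` is an assumption, since the issue text does not state the address.

```python
def ensure_host_entry(hosts_path: str, ip: str, hostname: str) -> bool:
    """Append "ip<TAB>hostname" to the hosts file unless `hostname`
    is already mapped there. Returns True if the file was modified.
    (Hypothetical helper for illustration only.)
    """
    try:
        with open(hosts_path, encoding="utf-8") as f:
            text = f.read()
    except FileNotFoundError:
        text = ""

    for line in text.splitlines():
        # Drop trailing comments, then split into address + names.
        fields = line.split("#", 1)[0].split()
        if hostname in fields[1:]:
            return False  # Already present: leave the file untouched.

    # Make sure the new entry starts on its own line.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with open(hosts_path, "a", encoding="utf-8") as f:
        f.write(f"{prefix}{ip}\t{hostname}\n")
    return True
```

A call such as `ensure_host_entry("/etc/hosts", "127.0.0.1", "codeclimbers.local")` would typically need elevated privileges, and the hosts file lives at a different path on Windows, so the path would have to be chosen per platform.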
-
**Describe the bug**
When I use the ```--autotuning run``` argument to train on a single node with 2 RTX 6000 Ada GPUs, it returns ```localhost: Permission denied, please try again.```.
**To Repro…
-
*pduveau:*
Router model: Netgear WNDR3700v1
I upgraded to 19.0.7 yesterday. Since then I have had the following issue:
none of the static leases can be resolved (25 hosts).
In system.log I found these logs eac…
-
##### SUMMARY
Be able to run the playbook with the --check option to see changes before they are applied. We would like to determine the changes to be made by the playbook before they are applied…
-
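With full-parameter fine-tuning, the loss drops to 0.03 and the results are relatively good; with LoRA fine-tuning, however, the loss fluctuates between 1.2 and 1.5 and the results are poor.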
`deepspeed --hostfile=$hostfile fine-tune.py --report_to tensorboard --data_path "data/ysx_25588.json" --model_name_or_path ".…