-
The setup is two Linux machines, each with two A100 40G GPUs: A100(40G) * 2
The training commands are as follows. The command on the master node is: CUDA_VISIBLE_DEVICES=0,1 NNODES=2 NODE_RANK=0 NPROC_PER_NODE=2 MASTER_ADDR=127.0.0.1 swift sft --model_type qwen1half-7b-chat --model_id_or_path /…
-
## Motivation
Currently, the Burn deep learning framework in Rust lacks support for 0-dimensional tensors (scalars). Adding support for 0-dimensional tensors would enhance the framework's capabilit…
-
### Please check that this issue hasn't been reported before.
- [X] I searched previous [Bug Reports](https://github.com/OpenAccess-AI-Collective/axolotl/labels/bug) and didn't find any similar reports.
…
-
## Environment
* **IntelliJ Rust plugin version:** 0.4.171.4656-221
* **Rust toolchain version:** 1.61.0 (fe5b13d68 2022-05-18) x86_64-pc-windows-msvc
* **IDE name and version:** CLion 2022.1.2 (…
-
### Current Behavior
It looks like the certificate has expired for https://loopygo.app/
### Expected Behavior
Should open the website
-
Is there any solution?
-
bash training/finetune_RedPajama-INCITE-Chat-3B-v1.sh
My configuration changes are as below:
--lr 1e-5 --seq-length 2048 --batch-size 8 --micro-batch-size 1 --gradient-accumulate-step 1 \
--num-layers…
-
Below are some places to look at to see what kinds of topics should be covered. Comment below for what you think should be covered in the series of videos. Topics can range from simple to complex.
…
-
Hi, I am new to deep learning and thus may not understand your paper fully; I hope that is all right with you. I tried to implement the batch_hard using Inception_resnet_v1 and trained from scratch using ma…
cptay updated
6 years ago
-
This is a good job. However, we usually use BPRLoss rather than HingeLoss in pairwise learning to rank, since the margin of HingeLoss is hard to tune. So I wonder whether you have tried BPRLoss?
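
For reference, here is a minimal sketch of the two pairwise losses being compared, in plain Python. The function names `bpr_loss` and `hinge_loss`, the default `margin=1.0`, and the averaging over pairs are assumptions made for illustration; they are not taken from the repository under discussion.

```python
import math

def bpr_loss(pos_scores, neg_scores):
    """Mean of -log(sigmoid(pos - neg)) over pairs. No margin hyperparameter."""
    total = 0.0
    for p, n in zip(pos_scores, neg_scores):
        diff = p - n
        # -log(sigmoid(diff)) equals softplus(-diff); this form avoids overflow.
        total += max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
    return total / len(pos_scores)

def hinge_loss(pos_scores, neg_scores, margin=1.0):
    """Mean of max(0, margin - (pos - neg)) over pairs. margin must be tuned."""
    total = 0.0
    for p, n in zip(pos_scores, neg_scores):
        total += max(0.0, margin - (p - n))
    return total / len(pos_scores)

if __name__ == "__main__":
    pos = [2.0, 0.5, 1.0]   # hypothetical scores of positive items
    neg = [0.0, 0.7, 1.0]   # hypothetical scores of sampled negative items
    print("BPR:  ", bpr_loss(pos, neg))
    print("Hinge:", hinge_loss(pos, neg, margin=1.0))
```

The hinge loss is exactly zero once a pair is separated by more than the margin, so its behavior depends on that choice, while the BPR loss keeps a smooth, nonzero gradient for every pair.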