-
The key configuration is as follows:
# model architecture
Arch:
name: ResNet101_vd
class_num: 2
# loss function config for training/eval process
Loss:
Train:
- CELoss:
weight: 1.0
epsilon: 0.…
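For context, the `epsilon` field under `CELoss` in PaddleClas-style configs typically controls label smoothing. A minimal NumPy sketch of what label-smoothed cross-entropy computes (this is an illustrative re-implementation, not the PaddleClas code):

```python
import numpy as np

def smoothed_cross_entropy(logits, target, epsilon=0.1):
    """Cross-entropy against a label-smoothed target distribution."""
    # Numerically stable softmax
    z = logits - logits.max()
    probs = np.exp(z) / np.exp(z).sum()
    n = logits.shape[0]
    # Smoothed target: spread epsilon uniformly, put the rest on the true class
    q = np.full(n, epsilon / n)
    q[target] += 1.0 - epsilon
    return float(-(q * np.log(probs)).sum())
```

With `epsilon = 0` this reduces to standard cross-entropy; a positive `epsilon` penalizes over-confident predictions, which can matter for a 2-class head like the one configured above.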
-
When I fine-tune using LoRA, the model converges poorly. The hyperparameters are set as follows:
--lora_enable True \
--deepspeed scripts/zero3.json \
--model_name_or_path …
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
Training command:
llamafactory-cli train \
--stage dpo \
--do_train \
--finetuning_type full \
…
-
> Most frontends should be implemented in very few lines of code, with many only being a single line of code. Shortening a frontend might involve adding a new function to our Experimental API, or exte…
-
Hi, I trained SBERT with triplet loss using the default distance_metric (Euclidean distance) and got decent embeddings and cosine scores between relevant and irrelevant sentences. But when I tried TripletD…
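One thing worth checking when switching distance metrics: the triplet margin that works for Euclidean distance is usually not appropriate for cosine distance, since cosine distance is bounded in [0, 2] while Euclidean distance is not. A small sketch (illustrative, not the sentence-transformers implementation):

```python
import numpy as np

def triplet_loss(anchor, pos, neg, dist, margin):
    """Hinge-style triplet loss: push the negative margin farther than the positive."""
    return max(dist(anchor, pos) - dist(anchor, neg) + margin, 0.0)

# Two candidate distance metrics
euclidean = lambda a, b: float(np.linalg.norm(a - b))
cosine = lambda a, b: 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```

A margin like 1.0 is already half of cosine distance's full range, so triplets that were "easy" under Euclidean distance can suddenly all incur loss, which may explain very different training behavior between the two settings.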
-
Hi, the project is cool, but for a lower loss I suggest using a "cosine total steps" option similar to the total steps; for example, if you are doing 30k steps you should add this code:
`pipeline.train_config.…
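The idea behind setting the cosine schedule's horizon to the actual number of training steps can be sketched like this (a generic cosine-decay formula, not this project's API):

```python
import math

def cosine_lr(step, total_steps, base_lr, min_lr=0.0):
    """Cosine decay from base_lr at step 0 down to min_lr at total_steps."""
    progress = min(step / total_steps, 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

If the schedule's horizon is longer than the run (say, the default rather than 30k), training stops while the learning rate is still high, so the final loss never benefits from the low-LR tail of the decay.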
-
Hello,
I'm hoping to use KeOps to save some memory in my PyTorch model (thanks for all the work developing this tool!)
The line `pykeops.test_torch_bindings()` works fine.
I fi…
-
It seems that you used different criteria during training and testing, as the code below shows:
IN TEST:
scores = np.dot(vecs.T, qvecs)
IN TRAIN:
dif = x1 - x2
D = torch.pow…
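Assuming the embeddings are L2-normalized (common in retrieval pipelines, though the truncated code above does not show it), the two criteria are actually consistent: squared Euclidean distance is a monotone function of the dot product, since ||a - b||^2 = 2 - 2 a·b for unit vectors. A quick check:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=8)
b = rng.normal(size=8)
a /= np.linalg.norm(a)  # unit-normalize both vectors
b /= np.linalg.norm(b)

sq_dist = np.sum((a - b) ** 2)
dot = np.dot(a, b)
# For unit vectors: ||a - b||^2 = 2 - 2 * (a . b)
assert np.isclose(sq_dist, 2.0 - 2.0 * dot)
```

So ranking by dot product at test time and minimizing Euclidean distance at train time would order candidates identically; without normalization, though, the two criteria can genuinely disagree.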
-
Hi! Thanks for your wonderful work! I have a couple of questions:
1. Could you please share the details of the test dataset you used in the paper to evaluate the reference-based restoration?
2. …
-
# Basic Information
- Authors: Xintong Han, Zuxuan Wu, Phoenix Huang, Xiao Zhang
- Date: Aug, 2017
- Published By: arXiv
## Link
https://arxiv.org/pdf/1708.01311.pdf
# Overview
![image](htt…