-
Hi!
I was running a few experiments and noticed that GP is extremely high in the first few hundred steps.
GP > 60000 at first, then it gradually drops to around GP = 20.
Is this normal behaviour? In my previo…
-
Hi, during training I have a deformed and skinned garment from the snug model, and I fix collisions as a post-processing step on the predicted garment. I then use loss functions to improve learning.
So, I want …
-
Hi!
When comparing the estimations of random intercepts in a simple simulation between `mgcv` and `deepregression`, I obtain different results. The results with `deepregression` tend to be more sh…
-
@primepake Hello, thanks for your nice work. I have recently run into some difficulties training on my own dataset (**following your data preparation suggestions**) using the code you shared.
Whi…
-
```
[LightGBM] [Fatal] Socket send error, code: 104
distributed.worker - WARNING - Compute Failed
```
Full logs:
```
2021-03-15T22:41:00.2549100Z ============================= test session st…
-
### Scenario
Relevance identification
### Sample example
{"instruction": "query:苹果手机\n title:iphone 6s,99新 \n请判断上述query和title是否相关?", "input": "", "output": "是"}
### sft.sh parameters
nproc_per_node=1
base_path="xxx"
train_data="xxx"
v…
-
Has anyone tried downscaling the K and/or Q matrices for repeated layers in franken-merges? This should act like changing the temperature of the softmax and effectively smooth the distribution:
**H…
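
The temperature claim can be checked numerically. Below is a minimal sketch in plain Python for a single query against a few keys, assuming standard scaled dot-product attention. The function names (`attention_weights`, `attention_weights_with_temperature`) and the toy vectors are made up for illustration and are not taken from any merge tool. Scaling Q by a factor `s` multiplies every attention logit by `s`, which matches a softmax temperature of `1/s`.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax of a list of scores at a given temperature."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_weights(q, keys, q_scale=1.0):
    """Attention weights for one query, with the query scaled by q_scale (hypothetical knob)."""
    d = len(q)
    scaled_q = [q_scale * x for x in q]
    scores = [dot(scaled_q, k) / math.sqrt(d) for k in keys]
    return softmax(scores)


def attention_weights_with_temperature(q, keys, temperature):
    """Same attention, but smoothing is applied as a softmax temperature instead."""
    d = len(q)
    scores = [dot(q, k) / math.sqrt(d) for k in keys]
    return softmax(scores, temperature=temperature)


# Toy data, chosen arbitrarily for illustration.
q = [1.0, 2.0, -1.0, 0.5]
keys = [
    [0.5, 1.0, 0.0, -1.0],
    [1.5, -0.5, 2.0, 0.0],
    [-1.0, 0.3, 0.7, 1.2],
]

baseline = attention_weights(q, keys)
scaled = attention_weights(q, keys, q_scale=0.5)
tempered = attention_weights_with_temperature(q, keys, temperature=2.0)

print("baseline:", baseline)
print("Q * 0.5 :", scaled)
print("T = 2.0 :", tempered)
```

Halving Q and doubling the temperature give the same weights, and both are flatter than the baseline. Scaling K instead of Q has the same effect on the logits. This sketch ignores multiple heads, masking, positional encodings and any normalization applied to Q or K, all of which could change the picture in a real model.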
-
- Code: https://github.com/garynlfd/kfc
- Paper: https://arxiv.org/abs/2309.10641
I found it while looking for code for #49.
-
### Reminder
- [X] I have read the README and searched the existing issues.
### Reproduction
`CUDA_VISIBLE_DEVICES=0,1 python src/train_bash.py` runs successfully.
deepspeed --num_gpus 2…
-
## Summary
Forward columns of your dataset directly to a custom objective function.
## Motivation
This is useful for semi-parametric models like Poisson process regression where y | x, t ~ …