Issues
DRSY / EMO
[ICLR 2024] EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling (https://arxiv.org/abs/2310.04691)
114 stars · 14 forks
How can this be used for ordinary classification tasks? (#13) · justajustin · closed · 3 months ago · 0 comments
Does this lead to mode collapse? (#12) · 980202006 · closed · 7 months ago · 4 comments
Is there a good way to initialize cost matrix when pretraining from-scratch? (#11) · DaehanKim · closed · 7 months ago · 2 comments
Fix typo in README (#10) · tomaarsen · opened · 8 months ago · 0 comments
Is it possible for you to cite "Learning with a Wasserstein Loss" in the camera-ready version? (#9) · YouJiacheng · closed · 9 months ago · 1 comment
emo_loss becomes NaN (#8) · zzf-damon · opened · 9 months ago · 5 comments
Questions about the EMO loss (#7) · Vincent-Poirot · opened · 11 months ago · 1 comment
gt_q = (q_grad * one_hot).detach() (#6) · chenxu001 · closed · 11 months ago · 3 comments
The code provides three ways of combining DEMD and MLE; which one is used in the paper? (#5) · oyjxer · closed · 11 months ago · 1 comment
In distributed multi-machine training, the loss becomes negative after 300 steps (#4) · jiaruipeng1994 · opened · 11 months ago · 3 comments
Add return_dict option to be compatible with llama (#3) · xufangzhi · closed · 1 year ago · 0 comments
Update README.md (#2) · eltociear · closed · 1 year ago · 0 comments
Is it normal for the loss to rise during training? (#1) · lichen914 · closed · 1 year ago · 7 comments