Vance0124 Token-level-Direct-Preference-Optimization issues - Githubissues

Vance0124 / Token-level-Direct-Preference-Optimization

Reference implementation for Token-level Direct Preference Optimization (TDPO)

Apache License 2.0

110 stars, 12 forks

issues

Inquiry About KL Divergence in ICML 2024 Paper
#8 junkangwu closed 2 months ago, 2 comments

Some questions about the paper
#7 robin087 closed 1 week ago, 1 comment

Can you train DPO directly? Using open-source base models.
#6 tcxia opened 2 months ago, 0 comments

How do you go about evaluating ALIGNMENT(accuracy) and DIVERSITY(entropy)? Is there a code available?
#5 wangxu0820 opened 4 months ago, 0 comments

How to eval the models
#4 Yeeesir closed 4 months ago, 1 comment

How about the loss curve? especially when converge
#3 LuckerYi closed 4 months ago, 1 comment

Some questions about the code
#2 yuchen814 closed 4 months ago, 3 comments

Update README.md
#1 fiberleif closed 6 months ago, 0 comments