carbonscott/exp-maxie
issues
#2 See if we can overfit a small subset.
opened by carbonscott 2 hours ago, 1 comment
#1 MFU calculation with `model.parameters()` might not be correct when using sharding.
opened by carbonscott 4 weeks ago, 0 comments