arcee-ai / DAM — 30 stars, 4 forks
Issues
#41  RuntimeError During Merging Process Possibly Due to Shared Memory Tensors — opened by SolshineCode 3 hours ago, 1 comment
#40  Option to record ALL loss metrics, independent of usage — ElliotStein, closed 1 week ago, 0 comments
#39  implement weighted overlap loss function — thomasgauthier, closed 2 weeks ago, 0 comments
#38  Llama3 datasets — shamanez, closed 1 month ago, 1 comment
#37  Tidy up for final runs — ElliotStein, closed 1 month ago, 0 comments
#36  Updated the code and Readme with Thomas' hyperparams. — shamanez, closed 1 month ago, 0 comments
#35  fixed the MSE loss average. — shamanez, closed 1 month ago, 0 comments
#34  Defualt settings — shamanez, closed 1 month ago, 0 comments
#33  Trainable/Freezable Layer Norm, Embedding Coefficients and Seamless Logits Computation on-the-fly — shamanez, closed 1 month ago, 0 comments
#32  added the ability to randomly initialize co-effs — shamanez, closed 1 month ago, 0 comments
#31  minor cleanups — shamanez, closed 1 month ago, 0 comments
#30  tested the trainer after the modifcations — shamanez, closed 1 month ago, 0 comments
#29  Add base model kl loss — shamanez, closed 1 month ago, 0 comments
#28  add a base_model dataset if needed. — shamanez, closed 1 month ago, 0 comments
#27  ReadMe-update-and-minor-fix — ElliotStein, closed 1 month ago, 0 comments
#26  Changed the files names for the calrity — shamanez, closed 1 month ago, 0 comments
#25  Re-factored the forlder strcuture. — shamanez, closed 1 month ago, 0 comments
#24  refactored the code — shamanez, closed 1 month ago, 2 comments
#23  4 Updates: Loss function flexibility, WandB optional, clear GPU memory and save datasets easily — ElliotStein, closed 1 month ago, 0 comments
#22  Bring embeddings and Layer Norms to the play! — shamanez, closed 1 month ago, 0 comments
#21  new README — shamanez, closed 1 month ago, 0 comments
#20  change presets, fix random seed, clear GPU memory after training — ElliotStein, closed 1 month ago, 0 comments
#19  Seamless Switch Between On-the-Fly and Pre-Computed Logits — shamanez, closed 1 month ago, 0 comments
#18  added the ability to compute logits on the fly as well. — shamanez, closed 1 month ago, 0 comments
#17  minor update — shamanez, closed 1 month ago, 0 comments
#16  Exclude Padded Tokens from Loss Computation — shamanez, closed 1 month ago, 0 comments
#15  AR 153 save datasets to hf for easy re use — ElliotStein, closed 1 month ago, 0 comments
#14  added the ability to play around between non linearirty — shamanez, closed 1 month ago, 0 comments
#13  added the ability to use the base model's linear layers in merging. — shamanez, closed 1 month ago, 1 comment
#12  AR-151 Abstract hparam choices up to cmd line args. Add WandB logging — ElliotStein, closed 1 month ago, 1 comment
#11  [AR-152] Can we automate a TIES like process? — shamanez, closed 1 month ago, 0 comments
#10  Added the ability to switch between L1 and L2 regloss. Because L1 co… — shamanez, closed 1 month ago, 0 comments
#9  added the merged model saving to the Dam Trainer — shamanez, closed 1 month ago, 0 comments
#8  KL Divergence Loss vs. MSE — shamanez, closed 1 month ago, 0 comments
#7  added support for llama3 — shamanez, closed 1 month ago, 1 comment
#6  Pplist shamane — shamanez, closed 1 month ago, 0 comments
#5  Parameter list — shamanez, closed 1 month ago, 0 comments
#4  pplist freezing and testing — shamanez, closed 1 month ago, 0 comments
#3  Refactored the code structure. — shamanez, closed 1 month ago, 0 comments
#2  cleaned the trainer so that we can direclty use a datasets with the top K logits. — shamanez, closed 1 month ago, 0 comments
#1  Distributed Training with PreMerged model and Deepspeed — shamanez, closed 1 month ago, 0 comments