arcee-ai / DAM — 30 stars, 4 forks
Issues
#41  RuntimeError During Merging Process Possibly Due to Shared Memory Tensors — opened by SolshineCode 3 hours ago, 1 comment
#40  Option to record ALL loss metrics, independent of usage — ElliotStein, closed 1 week ago, 0 comments
#39  implement weighted overlap loss function — thomasgauthier, closed 2 weeks ago, 0 comments
#38  Llama3 datasets — shamanez, closed 1 month ago, 1 comment
#37  Tidy up for final runs — ElliotStein, closed 1 month ago, 0 comments
#36  Updated the code and Readme with Thomas' hyperparams. — shamanez, closed 1 month ago, 0 comments
#35  fixed the MSE loss average. — shamanez, closed 1 month ago, 0 comments
#34  Defualt settings — shamanez, closed 1 month ago, 0 comments
#33  Trainable/Freezable Layer Norm, Embedding Coefficients and Seamless Logits Computation on-the-fly — shamanez, closed 1 month ago, 0 comments
#32  added the ability to randomly initialize co-effs — shamanez, closed 1 month ago, 0 comments
#31  minor cleanups — shamanez, closed 1 month ago, 0 comments
#30  tested the trainer after the modifcations — shamanez, closed 1 month ago, 0 comments
#29  Add base model kl loss — shamanez, closed 1 month ago, 0 comments
#28  add a base_model dataset if needed. — shamanez, closed 1 month ago, 0 comments
#27  ReadMe-update-and-minor-fix — ElliotStein, closed 1 month ago, 0 comments
#26  Changed the files names for the calrity — shamanez, closed 1 month ago, 0 comments
#25  Re-factored the forlder strcuture. — shamanez, closed 1 month ago, 0 comments
#24  refactored the code — shamanez, closed 1 month ago, 2 comments
#23  4 Updates: Loss function flexibility, WandB optional, clear GPU memory and save datasets easily — ElliotStein, closed 1 month ago, 0 comments
#22  Bring embeddings and Layer Norms to the play! — shamanez, closed 1 month ago, 0 comments
#21  new README — shamanez, closed 1 month ago, 0 comments
#20  change presets, fix random seed, clear GPU memory after training — ElliotStein, closed 1 month ago, 0 comments
#19  Seamless Switch Between On-the-Fly and Pre-Computed Logits — shamanez, closed 1 month ago, 0 comments
#18  added the ability to compute logits on the fly as well. — shamanez, closed 1 month ago, 0 comments
#17  minor update — shamanez, closed 1 month ago, 0 comments
#16  Exclude Padded Tokens from Loss Computation — shamanez, closed 1 month ago, 0 comments
#15  AR 153 save datasets to hf for easy re use — ElliotStein, closed 1 month ago, 0 comments
#14  added the ability to play around between non linearirty — shamanez, closed 1 month ago, 0 comments
#13  added the ability to use the base model's linear layers in merging. — shamanez, closed 1 month ago, 1 comment
#12  AR-151 Abstract hparam choices up to cmd line args. Add WandB logging — ElliotStein, closed 1 month ago, 1 comment
#11  [AR-152] Can we automate a TIES like process? — shamanez, closed 1 month ago, 0 comments
#10  Added the ability to switch between L1 and L2 regloss. Because L1 co… — shamanez, closed 1 month ago, 0 comments
#9  added the merged model saving to the Dam Trainer — shamanez, closed 1 month ago, 0 comments
#8  KL Divergence Loss vs. MSE — shamanez, closed 1 month ago, 0 comments
#7  added support for llama3 — shamanez, closed 1 month ago, 1 comment
#6  Pplist shamane — shamanez, closed 1 month ago, 0 comments
#5  Parameter list — shamanez, closed 1 month ago, 0 comments
#4  pplist freezing and testing — shamanez, closed 1 month ago, 0 comments
#3  Refactored the code structure. — shamanez, closed 1 month ago, 0 comments
#2  cleaned the trainer so that we can direclty use a datasets with the top K logits. — shamanez, closed 1 month ago, 0 comments
#1  Distributed Training with PreMerged model and Deepspeed — shamanez, closed 1 month ago, 0 comments