import pandas as pd

A = pd.read_csv('outliers.1.csv', index_col=None)
As = []
for tag, G in A.groupby('scenario'):
    reg = G.reference < G.new  # rows where the metric got higher (worse)
    if tag == 'train_tokens_per_second':
        reg = ~reg  # throughput is worse when lower, so invert the mask
    As.append(G.loc[reg])
A = pd.concat(As)
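For anyone who wants to sanity-check the filter without the CSV, here is a minimal sketch on toy data. The values and the `mem_peak` scenario name are invented for illustration and are not taken from `outliers.1.csv`; only the `scenario`, `reference` and `new` column names come from the snippet above. The helper `select_regressions` is hypothetical. Note one deliberate difference: it uses strict comparisons in both directions, so throughput rows where `new == reference` are not counted as regressions, whereas inverting the mask as above would include them.

```python
import pandas as pd

# Toy data, invented for illustration only (not from outliers.1.csv).
toy = pd.DataFrame({
    "scenario": ["mem_peak", "mem_peak",
                 "train_tokens_per_second", "train_tokens_per_second"],
    "reference": [10.0, 10.0, 100.0, 100.0],
    "new": [12.0, 9.0, 80.0, 120.0],
})

def select_regressions(df, higher_is_better=("train_tokens_per_second",)):
    """Keep rows where `new` is strictly worse than `reference`."""
    # Scenarios where a larger value is an improvement (e.g. throughput).
    higher = df["scenario"].isin(higher_is_better)
    worse = (higher & (df["new"] < df["reference"])) | (
        ~higher & (df["new"] > df["reference"])
    )
    return df.loc[worse]

regressions = select_regressions(toy)
print(regressions)
```

Only the memory row that went up and the throughput row that went down should survive the filter.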
@achew010 are we positive the slowdown only affects GPTQ-LoRA and nothing else (e.g., full, regular PEFT)? I remember you used to print out a table; can we check that as well?
Description
Regression Test for Loss, Memory, Throughput
Comparisons on loss, memory and throughput for Full-FT, PEFT
torch_dtype=float16 (Reference) to torch_dtype=bfloat16 (New).
See Outliers
Subset of Outliers processed into this table