Digital-Defiance / nlp-metaformer

An ablation study on the transformer network for Natural Language Processing

experiment: evaluate training performance of small model #52

Closed by RuiFilipeCampos 5 months ago

RuiFilipeCampos commented 5 months ago

(I'm also still testing the pipelines)

http://localhost/#/experiments/1/runs/69231cb1d27f4cd8ba732fc360239bb5

Configuration          Value
attention              metric
batch_size             1
beta_1                 0.9
beta_2                 0.98
bias                   False
coordinates            100
epsilon                1e-09
l1_regularization      0.0
l2_regularization      0.0
lr_schedule_scaling    1.0
number_of_blocks       1
number_of_epochs       1
number_of_heads        10
number_of_parameters   5,190,850
number_of_slices       50
tokens                 50,263
warmup_steps           4000
words                  624
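
The Adam settings in the configuration above (beta_1 0.9, beta_2 0.98, epsilon 1e-09) together with warmup_steps 4000 match the values usually paired with the inverse-square-root warmup schedule from the original Transformer paper. The issue does not show the scheduler, so the following is only a sketch under the assumption that the run uses that schedule, with coordinates as the model dimension and lr_schedule_scaling as a plain multiplier. The function name noam_lr and its parameter names are hypothetical and not taken from the repository.

```python
def noam_lr(
    step: int,
    d_model: int = 100,        # assumed to correspond to "coordinates"
    warmup_steps: int = 4000,  # "warmup_steps" from the configuration
    scaling: float = 1.0,      # assumed to correspond to "lr_schedule_scaling"
) -> float:
    """Inverse-square-root schedule with linear warmup (hypothetical helper)."""
    step = max(step, 1)  # avoid 0 ** -0.5 on the very first call
    # Linear ramp while step < warmup_steps, then decay proportional to step ** -0.5.
    return scaling * d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


if __name__ == "__main__":
    for s in (1, 1000, 4000, 34406):
        print(f"step {s:>6}: lr = {noam_lr(s):.6e}")
```

Under these assumptions the learning rate rises linearly for the first 4000 steps, peaks at roughly 1.6e-3, and decays afterwards, so the run was well past warmup by step 34406.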

[Plot newplot(35): loss/train(step)]

RuiFilipeCampos commented 5 months ago

[Plot newplot(25): loss/train(step)]

RuiFilipeCampos commented 5 months ago

2024-02-11T17:52:09.7160680Z 2024-02-11 17:52:09,715 [INFO] ---------- Step 34406 ---------
2024-02-11T17:52:09.7161925Z 2024-02-11 17:52:09,715 [INFO] Cleaning up memory...
2024-02-11T17:52:09.8586044Z 2024-02-11 17:52:09,858 [INFO] Called garbage collector.
2024-02-11T17:52:09.8645220Z 2024-02-11 17:52:09,864 [INFO] Emptied gpu cache.
2024-02-11T17:52:09.8646439Z 2024-02-11 17:52:09,864 [INFO] Fetching slice 22 from worker...
2024-02-11T17:52:11.0965984Z Killed

Unsure why it got killed, but I'm guessing a memory leak.
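
A bare "Killed" with no Python traceback is what a shell typically prints when the process receives SIGKILL, which on Linux is often the out-of-memory killer, so the memory-leak guess is plausible. One way to check it is to sample the process's peak resident memory once per step and look at the trend. The sketch below uses only the standard library; the helper names peak_rss_mib and growth_per_step are hypothetical, and it assumes a Unix host (the resource module is not available on Windows).

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Peak resident set size of this process in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def growth_per_step(samples: list[float]) -> float:
    """Least-squares slope of the samples, i.e. average MiB gained per step."""
    n = len(samples)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


if __name__ == "__main__":
    history: list[float] = []
    hoard = []  # stand-in for a leak: references that are never released
    for step in range(20):
        hoard.append(bytearray(5 * 1024 * 1024))  # simulate about 5 MiB leaked per step
        history.append(peak_rss_mib())
    print(f"peak RSS now: {history[-1]:.1f} MiB")
    print(f"growth: {growth_per_step(history):.2f} MiB/step")
```

In a training loop, the same two calls could be made right after the "Cleaning up memory..." step. A slope that stays clearly positive across slices, even after garbage collection, would support the leak hypothesis, while a flat trend would point elsewhere, for example to a single oversized slice.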

RuiFilipeCampos commented 5 months ago

[Plot: newplot(26)]