jpata / particleflow

Machine-learned, GPU-accelerated particle flow reconstruction
Apache License 2.0

Dev aug2024 #339

Closed erwulff closed 3 weeks ago

erwulff commented 1 month ago

Implements a pre-layernorm self-attention layer that can be enabled for the attention-based model by setting use_pre_layernorm=True in the config file.

According to this paper, pre-layernorm transformers are less sensitive to hyperparameter choices and therefore require less hyperparameter optimization (HPO).
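For context, the pre-layernorm variant applies LayerNorm before each attention/MLP sublayer instead of after the residual addition. The sketch below uses standard PyTorch and illustrative names; it is not the repository's actual implementation, only a minimal example of the pre-LN ordering:

```python
# Minimal, illustrative sketch of a pre-layernorm self-attention block.
# Module and variable names are hypothetical, not from this repository.
import torch
import torch.nn as nn

class PreLNSelfAttentionBlock(nn.Module):
    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim),
            nn.GELU(),
            nn.Linear(4 * dim, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pre-LN: normalize *before* the sublayer, then add the residual.
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out
        x = x + self.mlp(self.norm2(x))
        return x
```

In the post-layernorm ordering, LayerNorm would instead be applied after each residual addition; the pre-LN placement is what the paper associates with reduced hyperparameter sensitivity.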

erwulff commented 4 weeks ago

The test breaks because the dataset has already been generated at v2.1.0, while the training config still points to v2.0.0 until the updated datasets with the additional statistics are copied over.