bugface closed 1 year ago
adapt alibi to BERT in NER
https://github.com/bigscience-workshop/Megatron-DeepSpeed/blob/4c13c617bdfb3bd419afd620bae87c74ae5aa79d/megatron/model/transformer.py#L116
https://github.com/ofirpress/attention_with_linear_biases/issues/5
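
The title does not say how ALiBi was adapted for BERT. As a rough illustration only, and not necessarily what was done here, one option for a bidirectional encoder is a symmetric bias that penalizes attention by absolute token distance, with no causal mask. The function names below are hypothetical, and the sketch assumes a power-of-two number of heads so the slopes follow the simple geometric sequence from the ALiBi paper.

```python
def alibi_slopes(num_heads):
    """Per-head slopes 2^(-8/n), 2^(-16/n), ... (assumes n is a power of two)."""
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles power-of-two head counts")
    start = 2.0 ** (-8.0 / num_heads)
    return [start ** (h + 1) for h in range(num_heads)]


def symmetric_alibi_bias(num_heads, seq_len):
    """Return bias[h][i][j] = -slope_h * |i - j| as nested lists.

    Assumption: distance is treated symmetrically, so tokens to the left
    and to the right of the query are penalized the same way.
    """
    slopes = alibi_slopes(num_heads)
    return [
        [[-m * abs(i - j) for j in range(seq_len)] for i in range(seq_len)]
        for m in slopes
    ]


if __name__ == "__main__":
    bias = symmetric_alibi_bias(num_heads=8, seq_len=4)
    # Head 0 has the steepest slope; print its first row of biases.
    print(bias[0][0])
```

The bias would be added to the attention scores before the softmax, in place of learned position embeddings. A non-symmetric variant (different slopes for left and right context) is another possibility that this sketch does not cover.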