huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Add BART-LS #20392

Open · KMFODA opened 1 year ago

KMFODA commented 1 year ago

Model description

BART-LS (Long BART), presented in this paper, establishes a new SOTA on a number of long-form NLP tasks and datasets. It achieves this with pooling-augmented block-wise attention and a novel pre-training strategy.
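
For anyone scoping the work, here is a rough sketch of the attention pattern as I read it: each token attends within its own block, plus to one pooled summary vector per block, so every query gets a coarse view of the whole sequence at block-local cost. This is an illustrative reading only, not the paper's implementation; in particular, the mean pooling below is a simplifying assumption (the paper explores richer pooling operators), and all names are my own.

```python
# Illustrative sketch, not the authors' implementation: each token attends
# to its own block plus one mean-pooled summary per block (the "pooling
# augmentation"). Mean pooling is an assumption kept simple for clarity.
import torch
import torch.nn.functional as F

def pooled_blockwise_attention(q, k, v, block_size):
    # q, k, v: (batch, seq_len, dim); seq_len must be divisible by block_size
    b, n, d = q.shape
    nb = n // block_size
    qb = q.view(b, nb, block_size, d)
    kb = k.view(b, nb, block_size, d)
    vb = v.view(b, nb, block_size, d)

    # One pooled key/value per block, visible to every query block.
    k_pool = kb.mean(dim=2)  # (b, nb, d)
    v_pool = vb.mean(dim=2)  # (b, nb, d)
    k_glob = k_pool.unsqueeze(1).expand(b, nb, nb, d)
    v_glob = v_pool.unsqueeze(1).expand(b, nb, nb, d)

    # Local keys/values concatenated with the global pooled ones.
    k_all = torch.cat([kb, k_glob], dim=2)  # (b, nb, block_size + nb, d)
    v_all = torch.cat([vb, v_glob], dim=2)

    scores = torch.einsum("bnqd,bnkd->bnqk", qb, k_all) / d ** 0.5
    attn = F.softmax(scores, dim=-1)
    out = torch.einsum("bnqk,bnkd->bnqd", attn, v_all)
    return out.reshape(b, n, d)

# Example: 2 sequences of 1024 tokens, 64-dim heads, blocks of 128 tokens.
x = torch.randn(2, 1024, 64)
out = pooled_blockwise_attention(x, x, x, block_size=128)
print(out.shape)  # torch.Size([2, 1024, 64])
```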

Given my interest in long-text summarisation, I'm very keen to get this into the wonderful transformers library and start benchmarking it against other models. I'm therefore happy to take this on and ping members of the team if I hit any blockers. If this fits with the library's plans, let me know and I'll start working on a PR.

Open source status

Provide useful links for the implementation

Original Model Repo (which includes the model weights): https://github.com/facebookresearch/bart_ls
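
A first step for whoever ports this could be dumping the checkpoint's parameter names and shapes to plan the weight-name mapping onto transformers modules. A minimal sketch, assuming the released checkpoint follows fairseq's usual layout (a dict whose "model" entry holds the state dict); "model.pt" is a hypothetical local filename:

```python
# Hedged sketch: list parameter names/shapes to plan the mapping for a
# transformers port. Assumes a fairseq-style checkpoint layout.
import torch

ckpt = torch.load("model.pt", map_location="cpu")  # hypothetical path
state_dict = ckpt.get("model", ckpt)  # fall back to a raw state dict

for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))
```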

thakursc1 commented 1 year ago

Any update on this, @KMFODA? I can help; please let me know if you want to collaborate on this.

KMFODA commented 1 year ago

Hey @thakursc1. I'm still waiting on someone from the HF team to confirm that this can be integrated into their codebase if we work on it, as this only becomes beneficial for my use case once I can use it from the transformers master branch.

Happy to collaborate on this as soon as we hear back.

jmzeng commented 1 year ago

Hey @KMFODA, wondering if there are any updates on this? Thanks!

KMFODA commented 1 year ago

Hey @jmzeng. I haven't heard back from the HF team yet, and unfortunately my workload has changed, so I don't think I'll be able to work on this. Maybe someone else with the bandwidth can pick it up and ping the HF team once they have a draft PR ready for review?

amyeroberts commented 1 year ago

@KMFODA BART-LS looks like it would be a great addition to the library :)

If you or another community member would still like to add the model, please feel free to open a PR, and let us know in the meantime if there are any difficulties integrating it.