Open chenmoneygithub opened 1 year ago
I think something happened with the title @chenmoneygithub
@chenmoneygithub I wonder if the Perceiver arch is a good fit (https://arxiv.org/abs/2202.07765).
It is specifically designed to accommodate long sequences.
@chenmoneygithub this dataset seems cool: https://huggingface.co/datasets/openai/summarize_from_feedback
One interesting part is how we handle long context, since our models have a limit on the input length due to the positional embedding.
Ideally we should ship a task model for text summarization that does not limit the input size; otherwise it's more like a toy.
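
To make the input-length problem concrete, here is a minimal sketch of one possible workaround: splitting a long token sequence into overlapping windows that each fit under the model's maximum length. This is an illustration only, not something proposed in this thread or an existing library API; `split_into_windows`, `max_length`, and `overlap` are hypothetical names, and how to merge the per-window summaries is left open.

```python
def split_into_windows(token_ids, max_length, overlap=0):
    """Split a token sequence into windows of at most `max_length` tokens.

    Hypothetical helper: consecutive windows share `overlap` tokens so that
    text near a window boundary appears with context on both sides.
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    if not 0 <= overlap < max_length:
        raise ValueError("overlap must satisfy 0 <= overlap < max_length")

    stride = max_length - overlap  # how far the window start advances each step
    windows = []
    start = 0
    while True:
        windows.append(token_ids[start:start + max_length])
        # Stop once the current window reaches the end of the sequence.
        if start + max_length >= len(token_ids):
            break
        start += stride
    return windows


# Example: 10 tokens, a model limit of 4, and 1 token of overlap.
windows = split_into_windows(list(range(10)), max_length=4, overlap=1)
```

Each window could then be summarized separately and the partial summaries combined, though that loses cross-window context, which is part of why a long-context architecture may be a better fit.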