jebarpg opened this issue 1 year ago
Hi, thanks for your interest in our work! From my understanding of the LongNet paper, the main idea of FoT, which is training on negative examples while utilizing a longer context, and the dilated attention from LongNet seem pretty orthogonal, which would make combining these two methods an interesting research direction to explore!
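For concreteness, here is a minimal, single-head, single-configuration sketch of what dilated attention computes (segment the sequence, keep every r-th position in each segment, attend densely over the survivors). The function name, tensor shapes, and the single (segment_len, dilation) pair are illustrative assumptions rather than LongNet's full multi-configuration, distributed implementation:

```python
# Hedged sketch of LongNet-style dilated attention for one
# (segment_len, dilation) configuration; no causal mask, for brevity.
import torch

def dilated_attention(q, k, v, segment_len=2048, dilation=4):
    """q, k, v: (batch, seq_len, dim); seq_len must be divisible by segment_len."""
    b, n, d = q.shape
    s = segment_len
    # 1) Split the sequence into non-overlapping segments.
    def segment(x):
        return x.view(b, n // s, s, d)
    qs, ks, vs = segment(q), segment(k), segment(v)
    # 2) Within each segment, keep every `dilation`-th position (sparsification).
    idx = torch.arange(0, s, dilation, device=q.device)
    qd, kd, vd = qs[:, :, idx], ks[:, :, idx], vs[:, :, idx]
    # 3) Dense attention over the sparsified segment
    #    (cost shrinks by roughly a factor of dilation**2 per segment).
    attn = torch.einsum('bgid,bgjd->bgij', qd, kd) / d ** 0.5
    out_sparse = torch.einsum('bgij,bgjd->bgid', attn.softmax(dim=-1), vd)
    # 4) Scatter outputs back to their original positions; the positions left
    #    empty here would be covered by other (segment_len, dilation) pairs.
    out = torch.zeros_like(qs)
    out[:, :, idx] = out_sparse
    return out.reshape(b, n, d)
```

Since this only changes which positions attend to each other inside the local window, FoT's contrastive training of the memory layers could in principle sit on top of it unchanged, which is the sense in which the two methods look orthogonal.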
The LongNet paper (https://arxiv.org/abs/2307.02486) scales attention to a context length of 1 billion tokens; used in addition to this work, it seems like it could deliver on the pursuit of infinite context length. FoT also feels similar to L2P (Learning to Prompt), which maintains a pool of prompts to help a model get over forgetting while it learns continually. Maybe the database of KVs accessed via kNN could also blend well with an L2P-style prompt pool, and the LongNet dilation algorithm could likewise benefit from contrastive learning.
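For concreteness, a minimal sketch of the kNN-accessed KV database idea; the class name, the flat in-memory tensor store, and exact inner-product search are illustrative assumptions, and a real setup would typically use an approximate index such as FAISS:

```python
# Hedged sketch of a kNN-accessed key/value memory: past (key, value) pairs
# live outside the local context window and the current queries retrieve
# their top-k nearest keys to attend over alongside the local KV cache.
import torch

class KVMemory:
    def __init__(self, dim):
        self.keys = torch.empty(0, dim)
        self.values = torch.empty(0, dim)

    def add(self, keys, values):
        # keys, values: (num_tokens, dim) from previously processed chunks.
        self.keys = torch.cat([self.keys, keys], dim=0)
        self.values = torch.cat([self.values, values], dim=0)

    def lookup(self, queries, top_k=32):
        # queries: (num_queries, dim). Exact kNN by inner product; this is
        # also where an L2P-style pool could be queried for matching prompts.
        scores = queries @ self.keys.T                   # (num_queries, mem_size)
        idx = scores.topk(min(top_k, self.keys.shape[0]), dim=-1).indices
        return self.keys[idx], self.values[idx]          # (num_queries, top_k, dim)
```

The retrieved pairs would be concatenated with the local KV cache before attention; FoT's contrastive training is what is meant to keep such lookups precise when the memory also contains negative keys from unrelated documents.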
Thoughts?