Hello, thanks for open-sourcing your code, this is a very inspiring work. I had a question: do you perform or experiment with bootstrapping continuously? Currently the pipeline is: pre-train on the original dataset, fine-tune on COCO retrieval and captioning, then bootstrap to generate a new pre-training dataset. Would you then pre-train on this new dataset and repeat the bootstrapping process again?
Hi, thanks for your question.
Multi-round bootstrapping can be expensive, so we have not tried it yet. We do expect that continuous bootstrapping would give further performance improvements.
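For clarity, here is a minimal sketch of what such a multi-round loop might look like, assuming hypothetical helper functions `pretrain`, `finetune`, and `bootstrap` (these are placeholders for illustration, not this repo's actual API):

```python
# A minimal sketch of multi-round bootstrapping.
# pretrain / finetune / bootstrap are hypothetical placeholders,
# NOT functions from this repository.

def pretrain(dataset):
    ...  # pre-train a vision-language model on `dataset`

def finetune(model, task):
    ...  # fine-tune `model` on a downstream task

def bootstrap(dataset, captioner, filter_model):
    ...  # caption web images and filter noisy pairs -> new dataset

def multi_round_bootstrap(initial_dataset, num_rounds=2):
    dataset = initial_dataset
    model = None
    for _ in range(num_rounds):
        # Step 1: pre-train on the current (possibly bootstrapped) dataset.
        model = pretrain(dataset)
        # Step 2: fine-tune on downstream tasks (COCO captioning / retrieval).
        captioner = finetune(model, task="captioning")
        filter_model = finetune(model, task="retrieval")
        # Step 3: bootstrap a new pre-training dataset for the next round.
        dataset = bootstrap(dataset, captioner, filter_model)
    return model
```

Each additional round repeats the full pre-train/fine-tune/bootstrap cycle, which is why the cost grows roughly linearly with the number of rounds.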
How expensive is this operation?