Open · Richard-LZ-Zhang opened this issue 1 year ago
Hi, my name is Richard Zhang, a researcher in the Engineering Department at the University of Cambridge. Your method of curating the CiteSum dataset for pretraining and then achieving SoTA with few-shot learning is truly amazing! I have two questions. First, you did 128-shot learning with CITES to evaluate on the SciTLDR dataset: how did you fit 128 examples into BART's context window (1024 tokens, I believe)? Second, when you pre-trained on CiteSum, did you train for just one epoch? Any comment is appreciated!
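For what it's worth, one common reading of "128-shot" in fine-tuning papers is training on 128 labeled examples rather than packing them into a single prompt; under that reading each example is a separate training instance, so nothing ever has to fit all 128 into BART's 1024-token window at once. Below is a minimal sketch under that assumption (the dataset id, checkpoint, and hyperparameters are illustrative guesses, not the authors' actual setup):

```python
# A minimal sketch, assuming "128-shot" means fine-tuning on 128 examples.
# The dataset id, checkpoint, and hyperparameters below are illustrative
# assumptions, not taken from the authors' code.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

raw = load_dataset("allenai/scitldr", "Abstract")            # assumed hub id
few_shot = raw["train"].shuffle(seed=42).select(range(128))  # the 128 "shots"

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large")

def preprocess(batch):
    # Each source/target pair is its own training instance, truncated to
    # the model's 1024-token limit; the 128 examples never share a window.
    model_inputs = tokenizer([" ".join(sents) for sents in batch["source"]],
                             max_length=1024, truncation=True)
    labels = tokenizer([tgts[0] for tgts in batch["target"]],
                       max_length=64, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

train_set = few_shot.map(preprocess, batched=True,
                         remove_columns=few_shot.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="bart-few-shot",
                                  per_device_train_batch_size=8,
                                  num_train_epochs=10,   # hypothetical
                                  learning_rate=3e-5),   # hypothetical
    train_dataset=train_set,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```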
Hi Richard, thanks for your interest.

Thank you for your response.
1. By few-shot learning, do you mean fine-tuning on a few examples?
2. I could only find `per_device_train_batch_size=8` and `max_steps=2e5`, so I cannot work out how many epochs you trained for. Could you provide more information?
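For reference, the epoch count would follow from these arguments roughly as below; a minimal sketch, assuming a single device and a placeholder dataset size, since neither is stated above:

```python
# Minimal sketch: back-of-the-envelope epochs from the reported Trainer args.
# num_devices and dataset_size are assumptions, not values from the repo.
per_device_train_batch_size = 8
max_steps = int(2e5)
num_devices = 1           # assumption: single GPU; scale up if multi-GPU
dataset_size = 100_000    # hypothetical placeholder; use the real CiteSum size

examples_seen = max_steps * per_device_train_batch_size * num_devices
epochs = examples_seen / dataset_size
print(f"~{epochs:.1f} epochs")   # ~16.0 with these placeholder numbers
```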