Closed by irasin 6 months ago
Also, I was wondering: if I change the dataset used for Bayesian optimization on the same model, say XSum data on llama2-13B versus some other custom data on llama13B, will the skip layers differ between the two cases?
Thank you for your appreciation of our work. Regarding the dataset used for BO: if the tasks to be tested are very different, we recommend searching at the task level, so the skip layers found for different datasets may differ. We have no plans to experiment on the 7B model yet, but it likely has less redundancy to exploit.
Great work! I have some questions about the Bayesian optimization and the performance on smaller model sizes:
Hoping to get an answer. Thanks a lot.