mbzuai-nlp / LaMini-LM

LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions

Generating dataset from open source models & evaluation "understanding" #3

Open · yousifmansour opened this issue 1 year ago

yousifmansour commented 1 year ago

I first want to say great work showing the power of this approach! It's very cool that I can run the small Flan-T5 models on my laptop without a GPU and get good results (especially when connected to Wikipedia or a search engine so the model has access to facts and content).

I have the following questions/observations:

1) I was wondering whether the same dataset generation approach could be applied to large open-source models such as EleutherAI's 20B GPT-NeoX (and/or other models as they come out), and what kind of results could be attained. I've put a rough sketch of what I mean after this list.

2) I also think some form of evaluation that doesn't rely on the model having memorized a lot of information (the smaller models have fewer parameters and less room to store facts) might be more suitable. Getting small models to "understand" new and complex/nuanced instructions should be the goal; see the second sketch after this list.
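
On question 1, here is a minimal sketch of what I have in mind: sampling new instructions from an open-source model such as EleutherAI's GPT-NeoX-20B with Hugging Face `transformers`. The checkpoint name, prompt format, and sampling settings are illustrative assumptions on my part, not the project's actual generation pipeline.

```python
# Minimal sketch (an assumption, not the LaMini-LM pipeline): prompt an
# open-source model to continue a list of example instructions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-neox-20b"  # ~40 GB in fp16; needs multiple GPUs or heavy offloading
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",   # requires `accelerate`; shards the model across available devices
    torch_dtype="auto",
)

# Few-shot style seed prompt; the model is asked to write instruction #3.
seed_prompt = (
    "Below are examples of instructions.\n"
    "1. Explain why the sky appears blue.\n"
    "2. Write a short poem about autumn.\n"
    "3."
)

inputs = tokenizer(seed_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,     # sampling rather than greedy decoding, for more diverse instructions
    temperature=0.9,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Responses for each generated instruction could then be collected from the same model (or a different one) and used to fine-tune the small student models.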
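
On question 2, here is a minimal sketch of the kind of evaluation I mean: scoring a small instruction-tuned model on held-out instructions that require no world knowledge, so the score reflects instruction following rather than memorized facts. The checkpoint name, the example pairs, and the choice of ROUGE are assumptions for illustration, not the project's evaluation protocol.

```python
# Minimal sketch (my assumption, not the LaMini-LM evaluation setup):
# score a small model on knowledge-free, held-out instructions with ROUGE.
from transformers import pipeline
import evaluate

# Hypothetical held-out instruction/reference pairs that test "understanding" only.
eval_set = [
    {"instruction": "Rewrite this sentence in the passive voice: The cat chased the mouse.",
     "reference": "The mouse was chased by the cat."},
    {"instruction": "Sort these words alphabetically: banana apple cherry",
     "reference": "apple banana cherry"},
]

# Assumed checkpoint name for one of the small LaMini models.
generator = pipeline("text2text-generation", model="MBZUAI/LaMini-Flan-T5-248M")
rouge = evaluate.load("rouge")

predictions = [
    generator(ex["instruction"], max_new_tokens=64)[0]["generated_text"]
    for ex in eval_set
]
references = [ex["reference"] for ex in eval_set]
print(rouge.compute(predictions=predictions, references=references))
```

Something like this (with a much larger, curated held-out set, or human judgments) would make it clearer whether the small models are following instructions rather than recalling stored facts.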

Thank you