Data has been generated for four datasets (stories, dailydialog, dailymail_cnn, mrpc) across four models (beluga7b, mistral7b, llama2_chat7b, and llama2_chat13b) and three temperatures. Stories was generated with two different prompts!
Costs were not high, so we can definitely generate more data if needed at some point (or across more temperatures).
Some plotting (of length distributions) was also done!
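To make the size of the generation sweep concrete, here is a minimal sketch that enumerates the dataset/model/prompt/temperature combinations described above. The dataset and model names come from this description; the temperature values, the `PROMPTS_PER_DATASET` mapping, and the `generation_grid` helper are hypothetical placeholders for illustration and do not reflect the repo's actual code or settings.

```python
from itertools import product

DATASETS = ["stories", "dailydialog", "dailymail_cnn", "mrpc"]
MODELS = ["beluga7b", "mistral7b", "llama2_chat7b", "llama2_chat13b"]

# Hypothetical mapping: stories was generated with two prompts,
# every other dataset is assumed to use a single prompt.
PROMPTS_PER_DATASET = {"stories": 2}


def generation_grid(temperatures):
    """Yield one (dataset, model, prompt_index, temperature) tuple per run."""
    for dataset, model, temp in product(DATASETS, MODELS, temperatures):
        for prompt_idx in range(PROMPTS_PER_DATASET.get(dataset, 1)):
            yield dataset, model, prompt_idx, temp


# Placeholder temperature values, for illustration only.
runs = list(generation_grid([0.5, 1.0, 1.5]))
print(len(runs))  # 60 runs: (3 datasets + 2 stories prompts) * 4 models * 3 temperatures
```

Adding more temperatures only lengthens the list passed to `generation_grid`, so the number of runs grows linearly with the number of temperatures.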
Steps from here
I will run PCA and remake the distance plots for the new data.
Need to do some spring-cleaning on the repo (get rid of old code, refactor, etc.).