g-simmons / persona-research-internship


Task: Run longer prompt skywriting experiment on workstation #99

Open g-simmons opened 8 months ago

g-simmons commented 8 months ago

I added the prompt_skywriting subfolder to this repo, along with some DVC data that I used in a more recent experiment.

The dataset is on Hugging Face at Abirate/english_quotes. It's a collection of quotes from celebrities.

I prefer this dataset since it has fewer curse words and less sexual innuendo.

Also, the text segments seem longer and more grammatical overall, and the dataset is bigger (~2k examples vs. ~700 examples).

prompt_skywriting_quotes.ipynb is my most recent version of this. I would start there and modify it to store the information needed for Thanh's visualization.

thanhyto commented 8 months ago

Here is an example of the JSON object as it stands at the moment (an object with two keys: Index, and 2d for the x and y coordinates):

{ "Index": 56, "2d": [1.1564289331, 3.9473571777] }

The x and y coordinates can also be split into two separate keys: { "Index": 56, "x": 1.1564289331, "y": 3.9473571777 }

All other keys are optional and can be included if we have data for them. For example: { "Index": 56, "x": 1.1564289331, "y": 3.9473571777, "prompt": "Prompt content", "author": "author_name" }
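A minimal sketch of how the notebook might emit records in this shape, using only the Python standard library. The helper name `make_record`, the `split_xy` flag, and the choice to drop optional keys whose value is `None` are assumptions for illustration; none of them come from the notebook itself.

```python
import json


def make_record(index, xy, split_xy=False, **optional):
    """Build one record for the visualization (hypothetical helper).

    index    -- row index of the example
    xy       -- pair of 2D coordinates (x, y)
    split_xy -- if True, store "x" and "y" keys instead of a "2d" list
    optional -- extra keys such as prompt or author; None values are skipped
    """
    x, y = float(xy[0]), float(xy[1])
    record = {"Index": int(index)}
    if split_xy:
        record["x"] = x
        record["y"] = y
    else:
        record["2d"] = [x, y]
    # Only keep optional fields that actually have data.
    record.update({k: v for k, v in optional.items() if v is not None})
    return record


records = [
    make_record(56, (1.1564289331, 3.9473571777)),
    make_record(
        56,
        (1.1564289331, 3.9473571777),
        split_xy=True,
        prompt="Prompt content",
        author="author_name",
    ),
]

# json.dumps quotes every key, so the output is valid JSON.
payload = json.dumps(records, indent=2)
print(payload)
```

Casting the coordinates with `float()` matters if they come from NumPy arrays, since `json.dumps` cannot serialize NumPy scalar types directly.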