Open g-simmons opened 8 months ago
Here is an example of the JSON object as it stands at the moment: an object with two required keys, Index and 2d, where 2d holds the x and y coordinates.
{ "Index": 56, "2d": [1.1564289331, 3.9473571777] }
The x and y coordinates can also be split into two separate keys:
{ "Index": 56, "x": 1.1564289331, "y": 3.9473571777 }
All other keys are optional and can be included when we have data for them. For example:
{ "Index": 56, "x": 1.1564289331, "y": 3.9473571777, "prompt": "Prompt content", "author": "author_name" }
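Since either coordinate layout is allowed, a consumer could normalize both into one flat shape before plotting. This is only a minimal sketch: the helper name normalize_record is hypothetical, and it assumes Index plus the coordinates are the only required keys and that any other keys pass through unchanged.

```python
import json


def normalize_record(record):
    """Return a flat copy of `record` with separate x and y keys.

    Hypothetical helper: accepts either the combined "2d" layout or the
    separate "x"/"y" layout and leaves optional keys untouched.
    """
    flat = dict(record)  # copy so the caller's dict is not modified
    if "2d" in flat:
        coords = flat.pop("2d")
        if len(coords) != 2:
            raise ValueError("expected exactly two coordinates in '2d'")
        flat["x"], flat["y"] = coords
    # Assumption: Index, x and y are the only required keys.
    missing = [key for key in ("Index", "x", "y") if key not in flat]
    if missing:
        raise KeyError(f"record is missing required keys: {missing}")
    return flat


combined = json.loads('{"Index": 56, "2d": [1.1564289331, 3.9473571777]}')
separate = json.loads(
    '{"Index": 56, "x": 1.1564289331, "y": 3.9473571777, "prompt": "Prompt content"}'
)

print(normalize_record(combined))
print(normalize_record(separate))
```

Both layouts end up with the same Index, x, and y keys, so the visualization code would only need to handle one shape.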
I added the prompt_skywriting subfolder to this repo, along with some DVC data that I used in a more recent experiment.
The dataset is on Hugging Face at Abirate/english_quotes. It's a collection of quotes from celebrities.
I prefer this dataset since it has fewer curse words and less sexual innuendo.
The text segments also seem longer and more grammatical overall, and the dataset is bigger (~2k examples vs. ~700 examples).
prompt_skywriting_quotes.ipynb is my most recent version of this. I would start there and modify it to store the information that's needed for Thanh's visualization.
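As a rough illustration of what "storing the information" might look like, the sketch below turns 2D points and their texts into records in the separate x/y layout described earlier in this thread. Everything here is an assumption rather than what the notebook actually does: build_records is a hypothetical helper, the inputs are assumed to be parallel lists, and Index is assumed to be the row position.

```python
import json


def build_records(points, prompts, authors=None):
    """Build visualization records from parallel lists (hypothetical helper).

    points  -- sequence of (x, y) pairs
    prompts -- sequence of text strings, one per point
    authors -- optional sequence of author names, one per point
    """
    if len(points) != len(prompts):
        raise ValueError("points and prompts must have the same length")
    if authors is not None and len(authors) != len(prompts):
        raise ValueError("authors and prompts must have the same length")

    records = []
    for index, ((x, y), prompt) in enumerate(zip(points, prompts)):
        # Assumption: Index is simply the row position in the input lists.
        record = {"Index": index, "x": float(x), "y": float(y), "prompt": prompt}
        if authors is not None:
            record["author"] = authors[index]
        records.append(record)
    return records


example = build_records(
    points=[(1.1564289331, 3.9473571777)],
    prompts=["Prompt content"],
    authors=["author_name"],
)
payload = json.dumps(example, indent=2)  # JSON text that could be written to a file
print(payload)
```

Casting the coordinates with float() keeps the records serializable even if the points come from a NumPy array, since json.dumps cannot encode NumPy scalar types directly.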