Transform the conversations into a human-gpt conversation format

astroinfo-hacks / 2023-galaxy-zoo-llm

Train a large language model on galaxy zoo public conversations

MIT License

0 stars 0 forks source link

Goal is to create_dataset.py, where we take the dataframe and the images from scrap_notes.py and create a Dataset consisting of:

[ { "id": "000000033471", "image": "000000033471.jpg", "conversations": [ { "from": "human", "value": "What are the colors of the bus in the image?\n" }, { "from": "gpt", "value": "The bus in the image is white and red." }, { "from": "human", "value": "What feature can be seen on the back of the bus?" }, { "from": "gpt", "value": "The back of the bus features an advertisement." }, { "from": "human", "value": "Is the bus driving down the street or pulled off to the side?" }, { "from": "gpt", "value": "The bus is driving down the street, which is crowded with people and other vehicles." } ] } ]

astroinfo-hacks / 2023-galaxy-zoo-llm

Transform the conversations into a human-gpt conversation format #4