This PR provides essentially two changes:
1. I slightly modified the scripts/convertDataSet.py script from @Vadym4 to do two things:
   - Optionally download the images from the URLs provided in the dataset (it only takes a few minutes)
   - Fix an issue where the code would fail to download .png images
2. I added a new script called scripts/generate_qa.py, which sends requests to the OpenAI API to generate answers and handles rate limits and some errors along the way.
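For reviewers who want a feel for the download change in point 1, here is a minimal sketch of an optional download step that keeps each URL's own extension. It is not the code in scripts/convertDataSet.py: the helper names (`local_image_path`, `download_image`), the `.jpg` fallback, and the guess that the .png failure came from an assumed extension are all hypothetical.

```python
import os
import urllib.request
from urllib.parse import urlparse

# Hypothetical names throughout; not taken from scripts/convertDataSet.py.
KNOWN_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def local_image_path(url, out_dir, image_id):
    """Build a local filename that keeps the URL's own extension (.png, .jpg, ...)."""
    # urlparse(...).path drops any query string before the extension is read.
    ext = os.path.splitext(urlparse(url).path)[1].lower()
    if ext not in KNOWN_EXTENSIONS:
        ext = ".jpg"  # assumed fallback when the URL has no usable extension
    return os.path.join(out_dir, f"{image_id}{ext}")


def download_image(url, out_dir, image_id, timeout=30):
    """Download one image unless it is already on disk; return the local path."""
    path = local_image_path(url, out_dir, image_id)
    if os.path.exists(path):
        return path  # makes re-runs cheap: nothing is fetched twice
    os.makedirs(out_dir, exist_ok=True)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = response.read()
    with open(path, "wb") as f:
        f.write(data)
    return path
```

Skipping files that already exist means an interrupted run can simply be restarted.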
For point 2, I merged together the prompts from the get_description and get_conversation scripts that @errai34, @Vadym4, and @sona-patel developed, and it seems to work very nicely.
I've pressed go to run this script over our 100k instances of conversations and should have the results by the morning.
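With this many requests, the rate-limit handling is what keeps the run alive overnight. As an illustration only, retrying with exponential backoff could be sketched as below. `RateLimitError` is a stand-in for whatever exception the API client actually raises, and `call_with_backoff` with its default values is a hypothetical helper, not the code in scripts/generate_qa.py.

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for whatever rate-limit exception the API client raises."""


def call_with_backoff(request_fn, max_retries=6, base_delay=1.0, max_delay=60.0,
                      retry_on=(RateLimitError,), sleep=time.sleep):
    """Call request_fn(), retrying with exponential backoff plus a little jitter."""
    for attempt in range(max_retries + 1):
        try:
            return request_fn()
        except retry_on:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            # Double the wait on each failure, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

Here `request_fn` would be a zero-argument callable wrapping a single API request, so the retry logic stays independent of the client library.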
Here is the GPT 3.5 prompt I'm using:
prompt = """
You are an AI assistant specialising in astronomical topics.
You are provided with the following conversation between galaxy zoo users
that comment on an astronomical image. Unfortunately, you do not have access to the actual image.
Conversation:
------
%s
------
End of conversation.
Answer the question asked below between you and a person asking about this photo.
The answers should be in a tone that a visual AI assistant is seeing the
image and answering the question.
Below are the requirements for generating the answer:
1. Avoid quoting or referring to specific facts, terms, abbreviations, dates, numbers, or
names, as these may reveal the conversation is based on the text information, rather than
the image itself. Focus on the visual aspects of the image that can be inferred without
the text information. \
2. Do not use phrases like "mentioned", "caption", "context" in the conversation. Instead,
refer to the information as being "in the image." \
3. Do not use your knowledge to interpret the image and keep the answers short. \
Now, please respond to the question below as if you were describing the image in the
style of a professional astronomer.
Question: %s
Answer:""" % (conversation,question)
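The `% (conversation, question)` at the end of the prompt fills the two `%s` slots in order. A small sketch of that step on its own, using a shortened stand-in template rather than the full prompt; `PROMPT_TEMPLATE`, `build_prompt`, and the sample conversation are hypothetical names and data for illustration:

```python
# Shortened stand-in for the full prompt; only the two %s slots matter here.
PROMPT_TEMPLATE = """Conversation:
------
%s
------
End of conversation.
Question: %s
Answer:"""


def build_prompt(conversation, question):
    """Fill the two %s slots in order: conversation first, then question."""
    return PROMPT_TEMPLATE % (conversation, question)


example = build_prompt(
    "user_a: is that a bar?\nuser_b: looks like a ring too",
    "Offer a succinct explanation of the picture presented.",
)
print(example)
```

A literal `%` inside the conversation text is safe, because substituted values are not re-interpreted. A literal `%` in the template itself would need to be written as `%%`.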
And here are a few examples of answers:
Example 1
Question: Analyze the image in a comprehensive and detailed manner.
Answer: Based on the image, there is a red object and a whitish object at 5 o'clock. The red object appears to be a foreground star from our galaxy, while the whitish object is also likely a local star. The image also shows a blue object that resembles an elliptical shape. However, due to its blue color, it suggests the possibility of star formation, indicating that it could be a spiral or a blue elliptical. Overall, the image showcases a variety of objects with different colors and characteristics.
Example 2
Question: Offer a succinct explanation of the picture presented.
Answer: The image shows a galaxy with a central bar structure and a ring-like feature. There are faint arms extending from the bar, indicating a spiral morphology. The image quality is not optimal, but the presence of enhanced brightness at the ends of the bar suggests resonance nodes. The image does not appear to show a lenticular galaxy or a barred lenticular galaxy. Overall, the image showcases a unique and interesting galactic structure.
Example 3
Question: Write a terse but informative summary of the following picture.
Answer: The image shows a galaxy with a distinct feature in its luminosity distribution. The feature appears as a relatively uniform or shallow zone with a sharp edge. It is commonly referred to as a "lens" and is often observed in S0 galaxies. It is important to note that in this context, the term "lens" does not refer to a gravitational lens.
Also tagging @mwalmsley because I think this already looks really cool as a training set for our LLaVA model. Curious to see if you have any additional ideas or input.