Closed Flock1 closed 4 years ago
Do you have an example of these files? I was unaware that SDO data came in npz files.
You should be able to load them using numpy.load and then create a sunpy map directly, but I am unsure about the header information for these files.
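For reference, a minimal sketch of the numpy.load step. It does not assume a particular key name inside the archive and instead reads whatever key is listed first. The file written here is a synthetic stand-in, and the names `demo_image.npz`, `x`, and `load_npz_image` are hypothetical, chosen only for illustration.

```python
import os
import tempfile

import numpy as np


def load_npz_image(path):
    """Return (key, array) for the first array stored in an .npz archive."""
    with np.load(path) as archive:
        key = archive.files[0]  # inspect archive.files to see every stored key
        return key, archive[key]


# ASSUMPTION: a synthetic 512x512 array stands in for a real image file.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo_image.npz")
np.savez_compressed(demo_path, x=np.random.rand(512, 512).astype(np.float32))

key, image = load_npz_image(demo_path)
print(key, image.shape, image.dtype)
```

With a real file, replace `demo_path` with its path and check `archive.files` first, since the key names depend on how the file was written.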
Are you referring to the ML dataset (e.g. https://purl.stanford.edu/vk217bh4910) as described in Galvez et al. (2019)?
Note that these data are not provided by any of the instrument teams and are meant as a resource specifically for machine learning. They shouldn't be regarded as official instrument data products. In the case of AIA, they have been heavily downsampled from their original resolution (4096-by-4096 to 512-by-512). Additionally, they contain no header information, so I do not think we could ever support these as valid inputs to Map directly.
As Nabil points out above, you can of course load these just using NumPy and then create a Map object "by hand." If you knew the original image from which your reduced image was created, you could grab the metadata from that file and then create a new header for the reduced image using make_fitswcs_header.
To follow on from Will and Nabil, you can find instructions on how to use this dataset at https://github.com/dfouhey/sdodemo.
@nabobalis, yeah, header information is definitely missing from that dataset. And I guess, as @wtbarnes mentioned, the data I have is different from the original data. I have images of shape 1024x1024 and I am trying to use these images for machine learning.
So if I am interested in using ML for this dataset, how do you recommend I use the sunpy library? One major application of this library is the grid functionality, which will definitely help in giving a sense of a sphere instead of a circle. My astronomy knowledge is very limited, so please excuse the naivety of my questions.
@Flock1 sunpy maps need metadata to be created; this normally comes from the data files that are used in astronomy. If the metadata is missing, you can create it manually (following https://docs.sunpy.org/en/stable/generated/gallery/map/map_from_numpy_array.html#sphx-glr-generated-gallery-map-map-from-numpy-array-py) and create sunpy maps, but I am not sure how useful that will be.
If you need to do ML (I do not have any experience in ML), don't you just want the raw data?
@Flock1, we excluded this header information by design.
I don't fully understand why these need to be converted into sunpy maps, but you can learn more about various coordinate systems here: https://fits.gsfc.nasa.gov/wcs/coordinates.pdf. Understanding what problem you're trying to tackle would be helpful, I think.
@nabobalis, thank you for the link. I will have a look at it.
Raw data is fine, but it hasn't been of much use to me so far, especially when it comes to solar flares. I think I'll need to read up on this field to get a good grasp of it.
@PaulJWright, thank you for replying. One problem I was thinking of is detecting solar flares through some form of unsupervised learning and predicting future solar flares. What do you suggest?
The data set you're referring to has co-aligned, co-temporal coronal (AIA), magnetic field (HMI) and integrated spectra (EVE). This data set does not include a flare catalog.
You would first need to query a catalog such as the Heliophysics Event Knowledgebase (HEK, https://docs.sunpy.org/en/stable/guide/acquiring_data/hek.html) to locate flares in this ML data set. Because there is 20 PB of SDO data, we reduced the cadence significantly. This may not be appropriate for flare prediction, but the code that created this ML data set is on GitHub (https://github.com/SDOML/SDOML) and in theory you can create it yourself at a higher cadence.
Personally, if you are interested in learning more about the field, or want to quickly apply ML to the problem, I would recommend trying one of the example cases listed in Galvez et al. (2019). For example, you could try to infer coronal images (AIA) from just the magnetic field data (obtained by HMI). The data in the ML data set is in the right format to be fed straight into a CNN, and would just need to be loaded with numpy.
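As a loose illustration of "loaded with numpy and fed into a CNN", the sketch below stacks 2-D images into input and target batches. The channels-first layout `(N, 1, H, W)` and the global normalization are assumptions, not something the dataset prescribes, and the random arrays stand in for HMI and AIA images that would really be read with numpy.load. The helper names `to_cnn_batch` and `normalize` are hypothetical.

```python
import numpy as np


def to_cnn_batch(images):
    """Stack equally sized 2-D images into a float32 array of shape (N, 1, H, W)."""
    batch = np.stack(images).astype(np.float32)
    return batch[:, np.newaxis, :, :]  # add a single channel axis


def normalize(batch, eps=1e-8):
    """Shift and scale a batch to roughly zero mean and unit variance."""
    return (batch - batch.mean()) / (batch.std() + eps)


# ASSUMPTION: random arrays stand in for co-aligned HMI and AIA images.
hmi_images = [np.random.rand(512, 512) for _ in range(4)]
aia_images = [np.random.rand(512, 512) for _ in range(4)]

inputs = normalize(to_cnn_batch(hmi_images))  # network input (magnetic field)
targets = to_cnn_batch(aia_images)            # network target (coronal image)
print(inputs.shape, targets.shape)
```

Frameworks that expect channels-last data would need the new axis at the end instead.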
@PaulJWright, this is great. Thank you for the detailed response. I will try the example you mentioned.
Description
A lot of data provided by SDO is in .npz format. As far as I can see (at least from what I have explored so far), there's no way to work with that file type. So is there some way I can import that data using sunpy? I didn't know where else to post this query since it's not a bug or an error. Please let me know.