PolymathicAI / AstroCLIP

Multimodal contrastive pretraining for astronomical data
MIT License

Dataset description #20

Open jere1882 opened 1 month ago

jere1882 commented 1 month ago

I am reproducing your experiments with your own data as well as data from a different survey.

I couldn't find any reference in the paper or in the code to the units in which the spectra are stored in your dataset. The data was supposedly obtained from the DESI early release, in which case the flux values are in units of 10^-17 erg s^-1 cm^-2 Å^-1. Is this the case, or have the scale or units of the data been changed in the exported dataset?
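To make the unit question concrete, here is a minimal sketch of what the conversion would look like if the stored values really are in units of 10^-17 erg s^-1 cm^-2 Å^-1. That scale is an assumption taken from the question above, not something confirmed for the exported dataset, and the function and constant names are hypothetical rather than taken from the AstroCLIP code.

```python
# Assumption: stored flux values are in units of 1e-17 erg s^-1 cm^-2 Å^-1.
ASSUMED_FLUX_SCALE = 1e-17

# 1 erg s^-1 cm^-2 Å^-1 = 1e-7 W * 1e4 m^-2 * 10 nm^-1 = 1e-2 W m^-2 nm^-1
ERG_S_CM2_ANGSTROM_TO_W_M2_NM = 1e-2


def to_cgs(flux_stored):
    """Convert stored values to erg s^-1 cm^-2 Å^-1 (hypothetical helper)."""
    return [f * ASSUMED_FLUX_SCALE for f in flux_stored]


def to_si(flux_stored):
    """Convert stored values to W m^-2 nm^-1 (hypothetical helper)."""
    return [f * ERG_S_CM2_ANGSTROM_TO_W_M2_NM for f in to_cgs(flux_stored)]
```

If the exported dataset has been rescaled or normalized, a conversion like this would no longer give physical fluxes, which is why the scale matters when mixing in data from another survey.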

Thank you very much, this project is amazing.

lsarra commented 1 month ago

Hi! Yes, I would say we are using the data as it comes from the release (i.e. exactly what you would get from DESI, see here https://github.com/lsarra/astrolit/blob/5d9c9fceba78d2c0be7f5c36e45cf97940a2e583/astro_utils.py#L188-L198)

But for a definitive answer let's see what @EiffL or @lhparker1 say!

jere1882 commented 2 weeks ago

Thank you very much for your prompt response. Actually, it looks like some kind of centering has been applied to the dataset you exported: the flux has negative values, whereas fluxes should always be positive. [image]
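One way to test the centering hypothesis is to look at the mean and the fraction of negative pixels in each spectrum. A minimal sketch, assuming one spectrum is available as a plain list of floats; the function name and the tolerance are hypothetical and not taken from the AstroCLIP code. A mean very close to zero would point to mean-centering, while a clearly positive mean with only a few negative pixels would suggest something else, for example noise left over after sky subtraction.

```python
from statistics import fmean


def negative_flux_summary(flux, tol=1e-6):
    """Summarize the negative values in one spectrum (hypothetical helper).

    `tol` is a relative tolerance on the mean, measured against the largest
    absolute flux value in the spectrum.
    """
    if len(flux) == 0:
        raise ValueError("flux must not be empty")
    mean = fmean(flux)
    frac_negative = sum(1 for f in flux if f < 0) / len(flux)
    # Fall back to 1.0 so an all-zero spectrum does not give a zero scale.
    scale = max(abs(f) for f in flux) or 1.0
    return {
        "mean": mean,
        "frac_negative": frac_negative,
        "looks_mean_centered": abs(mean) <= tol * scale,
    }


# Toy values for illustration only, not real DESI data.
raw = [1.0, 2.0, 3.0, -0.5]
centered = [f - fmean(raw) for f in raw]
print(negative_flux_summary(raw))
print(negative_flux_summary(centered))
```

Running this over a sample of spectra from the exported dataset would show whether the negative values are spread the way a centered signal would be or are confined to a few low-signal pixels.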