astrocatalogs / supernovae

Astrocats module for the Open Supernova Catalog (OSC). Please use the issue tracker on this repo for all OSC issues.
https://sne.space
MIT License

Task to pull in ZTF public data for TNS SNe #185

Open emirkmo opened 3 years ago

emirkmo commented 3 years ago

(Apologies if this is not the right place.)

It would be great if the OSC used the ZTF identifier from TNS to query one of the brokers for public ZTF lightcurves and pull in that data.

The reason I'm opening this issue is that I was pulling data on a bunch of SNe from OSC and had to supplement the data in this way myself. I used the ALeRCE broker's nice API for this.

Just as a random concrete example: OSC SN2020sbw has ZTF identifier ZTF20abwzqzo, which is already pulled in from TNS. Looking at this object in e.g. ALeRCE, the lightcurve is much more filled out with public ZTF points. So I supplemented the OSC photometry with the full public ZTF dataset.
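To make the idea concrete, here is a minimal sketch of the conversion step, not the actual FLOWS or astrocats code. It assumes the broker returns a list of detection dicts with the ZTF alert fields `mjd`, `fid`, `magpsf` and `sigmapsf`, and that OSC-style photometry uses the keys `time`, `u_time`, `band`, `magnitude`, `e_magnitude` and `telescope`. The function name `detections_to_photometry` is made up for illustration.

```python
# Assumed mapping from the ZTF filter id (`fid`) to a band name.
ZTF_BANDS = {1: "g", 2: "r", 3: "i"}


def detections_to_photometry(detections):
    """Convert broker-style detection dicts into OSC-style photometry dicts."""
    photometry = []
    for det in detections:
        band = ZTF_BANDS.get(det.get("fid"))
        if band is None or det.get("magpsf") is None:
            continue  # skip rows we cannot interpret
        photometry.append({
            "time": str(det["mjd"]),
            "u_time": "MJD",
            "band": band,
            "magnitude": str(det["magpsf"]),
            "e_magnitude": str(det.get("sigmapsf", "")),
            "telescope": "ZTF",
        })
    return photometry
```

The input list could come from whichever broker client is used; for ALeRCE I believe that would be the detections query for ZTF20abwzqzo, but the exact call should be checked against their documentation.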

We do something similar as part of our FLOWS candidate-marshal script, and all it's doing is using the ALeRCE API. Wouldn't it make sense for OSC to do this too?

One downside I see is of course photometry duplication since a few of these data points are also on TNS as discovery/classification data points. But it should be possible to remove these duplicates as you do for spectra and so on.
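A rough sketch of how the duplicates could be dropped when merging, assuming each photometry point is a dict with `time` (MJD), `band` and `magnitude`, and that band names have already been normalized between the two sources. This is not how astrocats handles duplicates internally; `merge_photometry` and the tolerance values are hypothetical.

```python
def merge_photometry(existing, new, time_tol=1e-3, mag_tol=0.01):
    """Append points from `new` that do not match a point already kept.

    Two points match if they share a band and agree in time (days) and
    magnitude within the given tolerances.
    """
    merged = list(existing)
    for point in new:
        is_duplicate = any(
            point["band"] == kept["band"]
            and abs(float(point["time"]) - float(kept["time"])) <= time_tol
            and abs(float(point["magnitude"]) - float(kept["magnitude"])) <= mag_tol
            for kept in merged
        )
        if not is_duplicate:
            merged.append(point)
    return merged
```

The comparison is quadratic in the number of points, which should be fine for a single lightcurve. The tolerances would need tuning, since TNS and the brokers may round times and magnitudes differently.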

Let me know what you think, and I can put together a mockup using astrocats, modeled on do_tns_photo in the tns.py task.

guillochon commented 3 years ago

Hi @emirkmo, I'd be happy if you added a new task that pulled in data from Alerce. I think the best way to do it would be as a new "task" that loops through all the supernovae in the DB and queries Alerce. Ideally we can do this with as few API calls as possible rather than having to do a separate call for each event. Do you know if that's possible to do with their API?

emirkmo commented 3 years ago

Hi, I forgot to respond. I looked into this, and it is very doable, but there didn't seem to be a bulk API that does exactly what we would want. We could of course query each object individually, and it is probably not a big deal to do so.

But in either case, we should contact the people hosting the data, because either the number of queries or their sizes will be large. As I am not a part of the ALeRCE group, I didn't think it was a good idea to do this without running it by them.

If you're interested in the version that queries individual objects, I can put that up. But I couldn't find a bulk solution. Maybe if there's a broker that allows SQL queries, that would be ideal.
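For reference, the rough shape of the per-object version would be something like the sketch below. The `fetch` argument stands in for whatever function actually queries the broker for one ZTF id; it is passed in so the loop itself makes no assumptions about a particular API. `collect_lightcurves` and the one-second default delay are placeholders, and a sensible rate would have to be agreed with the broker.

```python
import time


def collect_lightcurves(ztf_ids, fetch, delay=1.0, sleep=time.sleep):
    """Call `fetch(ztf_id)` once per object, pausing between calls.

    Returns (results, failures): a dict of successful responses and a
    dict mapping each failed id to its error message.
    """
    results, failures = {}, {}
    for index, ztf_id in enumerate(ztf_ids):
        if index > 0:
            sleep(delay)  # be polite to the broker between requests
        try:
            results[ztf_id] = fetch(ztf_id)
        except Exception as exc:  # keep going if one object fails
            failures[ztf_id] = str(exc)
    return results, failures
```

Collecting failures separately means one missing or renamed object does not abort the whole task, and the failed ids can be retried later.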