WD-planets / AstroToolkit

AstroToolkit Package

Save light curve (and other) data #2

Closed BGaensicke closed 10 months ago

BGaensicke commented 10 months ago

It would be good to save the data that the toolkit retrieves, so one can e.g. make another custom-style plot, or dig into the analysis. At the moment, running e.g.

python datapage_creation.py 159891122647000192

(I replaced the fixed source_id with a command-line parameter) creates a file "lightcurve_data.csv" which, I think, contains the ZTF data. However, creating several HTML pages in a row overwrites this file each time.

We'd need to invest a bit of thought into the file name convention. The simplest thing would be "source_id"_ztf.csv or something like that. Personally, however, I really struggle with the source_ids, as they are just long strings of seemingly random digits. One way around this would be to use a coordinate-based string, JHHMMSS.SS+DDMMSS.SS, following the convention that Nicola defined in his white dwarf catalogue paper (https://ui.adsabs.harvard.edu/abs/2019MNRAS.482.5222T):

[Image: Screenshot_select-area_20240102130513]

To cater for all use cases, the root file name could be JHHMMSS.SS+DDMMSS.SS_"source_id"_ztf.csv.
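A minimal sketch of how such a root file name could be built from coordinates in degrees. The function name `j_name` is hypothetical and is not part of the Toolkit. The sketch assumes J2000 coordinates are supplied and that the seconds fields are truncated rather than rounded, which the catalogue paper should be checked for. The coordinates in the usage lines are made up for illustration and are not those of the source_id shown.

```python
import math


def j_name(ra_deg, dec_deg):
    """Format coordinates in degrees as JHHMMSS.SS+DDMMSS.SS.

    Assumption: fields are truncated, not rounded.
    """
    # Work in integer hundredths of a second so a value such as
    # 59.999 s can never be printed as 60.00.
    ra_cs = int(math.floor((ra_deg % 360.0) * 24000.0 + 1e-6)) % 8640000
    dec_cs = int(math.floor(abs(dec_deg) * 360000.0 + 1e-6))
    sign = "-" if dec_deg < 0 else "+"

    # Split hundredths of a second into (hours or degrees, minutes, remainder).
    hh, rem = divmod(ra_cs, 360000)
    mm, cs = divmod(rem, 6000)
    dd, rem = divmod(dec_cs, 360000)
    dm, dcs = divmod(rem, 6000)

    return (
        f"J{hh:02d}{mm:02d}{cs // 100:02d}.{cs % 100:02d}"
        f"{sign}{dd:02d}{dm:02d}{dcs // 100:02d}.{dcs % 100:02d}"
    )


# Hypothetical coordinates, paired with the source_id from the command above.
source_id = 159891122647000192
root = f"{j_name(10.5, -30.25)}_{source_id}_ztf.csv"
print(root)
```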

EthanJMoorfield commented 10 months ago

Hi Boris,

I will take a look at this - I think the best way to implement this would be to separately include this as an option in all query tools. The file you mention is currently only intended to be a temporary file for use in period analysis (which is then removed afterwards).

You can of course just grab the raw ZTF data using the 'ztfquery' tool and then save it to a file manually with pandas' .to_csv(), but it would definitely be nice to have an option that does this automatically.
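The manual route could look roughly like the sketch below. The DataFrame is a stand-in with hypothetical column names, since the exact call and return value of the Toolkit's 'ztfquery' tool are not shown in this thread. Only the pandas calls (`DataFrame.to_csv`, `read_csv`) are real API.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in for the data a ZTF query would return.
# The column names here are hypothetical.
lightcurve = pd.DataFrame(
    {
        "mjd": [59000.1, 59001.2],
        "mag": [17.31, 17.28],
        "magerr": [0.02, 0.03],
    }
)

# Put the source_id in the file name so that repeated runs for
# different sources do not overwrite each other.
source_id = 159891122647000192
out_dir = Path(tempfile.mkdtemp())
out_path = out_dir / f"{source_id}_ztf.csv"

# index=False keeps the pandas row index out of the CSV.
lightcurve.to_csv(out_path, index=False)

# Read the file back to confirm it round-trips.
reloaded = pd.read_csv(out_path)
print(reloaded)
```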

As for the naming convention, I agree that we should cater for all use cases. Since the Toolkit's accepted input formats are coordinates and Gaia source_ids, I think whichever one was used needs to stay in the filename: it then also indicates the input used to produce the data, and could be fed back into the Toolkit without any transformation.

JHHMMSS.SS+DDMMSS.SS_"source_id"_ztf.csv (or JHHMMSS.SS+DDMMSS.SS_"ra"_"dec"_ztf.csv) sounds good to me, and should be easy enough to implement, as the transformation functions already exist within the Toolkit, so it would just be a matter of stringing them together.

Thanks,

Ethan

EthanJMoorfield commented 10 months ago

I have now added this. You can save basically any data by adding 'save_data=True' to the tool you are using. For example, to save light curve data, you can use:

getztflc(source=...,save_data=True)

The naming convention is as follows:

for pos input, files are named:

ra_dec_identifier.csv

where identifier is e.g. 'ztflc' for ztf light curves.

for source input, files are named:

J..._source_identifier.csv

where J... is the convention above.

I didn't think it made sense to use the J... convention for position input: a bare position can't be proper-motion corrected, and the resulting string wouldn't really be an identifier, since one could be built for any point on the sky.
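The two cases can be sketched as a small helper. The function `build_filename` and its parameters are hypothetical and do not reflect the Toolkit's internal code. In particular, how the Toolkit formats ra and dec in the file name is not stated here, so the sketch simply uses Python's default float formatting.

```python
def build_filename(identifier, *, pos=None, source=None, j_name=None):
    """Return a CSV file name for either position or source input.

    pos      -- (ra, dec) tuple, for position input
    source   -- Gaia source_id, for source input
    j_name   -- J... string for the source (required with source input)
    """
    # Exactly one of pos / source must be given.
    if (pos is None) == (source is None):
        raise ValueError("give exactly one of pos or source")

    if pos is not None:
        # Position input: ra_dec_identifier.csv
        ra, dec = pos
        return f"{ra}_{dec}_{identifier}.csv"

    if j_name is None:
        raise ValueError("source input needs a J... string")

    # Source input: J..._source_identifier.csv
    return f"{j_name}_{source}_{identifier}.csv"


# Hypothetical inputs for illustration.
print(build_filename("ztflc", pos=(10.5, -30.25)))
print(build_filename("ztflc", source=159891122647000192,
                     j_name="J004200.00-301500.00"))
```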

BGaensicke commented 10 months ago

The naming convention sounds OK. I upgraded the toolkit, but could not get saving the light curve to work. I had a look at ZTF.py, and it still has "return_raw":

def getLightCurve(ra,dec,radius=3,return_raw=False):

Maybe this is not yet the latest version?

EthanJMoorfield commented 10 months ago

return_raw just tells the ZTF routines to return the 3 plots as separate figure objects (i.e. as [g,r,i], mostly used for creating grids for data pages) without combining them into a single figure, so this is fine. The actual tools you access are all stored within Tools.py, which basically acts as a map to all the sub-functions.

I'm not sure why save_data isn't working for ZTF. I must have missed something somewhere and forgotten to test it; in all the cases I did test, save_data works. I'll fix it later today once I arrive at uni.

Thanks for letting me know

EthanJMoorfield commented 10 months ago

I have fixed it. I had forgotten that the ZTF tools return multiple dataframes.

BGaensicke commented 10 months ago

OK, saving the light curve works now.