variability v. source flux plots

gijzelaerr commented 9 years ago

In a paper we discussed at our astro-ph meeting at API this morning, there was a nice plot of variability as a function of source flux. The original plot uses WISE data for field stars to determine the noise; the truly variable sources are then offset from the field stars (relevant figure attached to this page). This appears to be easily adaptable into a plot that can be used in conjunction with the variability parameters for each source. Ultimately, I think we could use this for all known LOFAR sources in the catalogue and get some nice statistics, helping to identify the real variable sources. However, as a first step it would be very nice to see this information for individual data sets.

On the dataset overview page can we add 2 new quality control plots showing a scatter plot with the variability values for all unique sources? I believe all the required information is in the catalogue...

Source integrated flux v. η_ν
Source integrated flux v. V_ν

For added bonus points, can this also have a different colour/symbol for the identified transients? Obviously, there will also need to be individual plots for the variability in each observing band when we have the variability information for each observing band.
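A minimal sketch of the two requested scatter plots, with identified transients drawn in a different colour and symbol. The record layout (dicts with the keys `flux`, `eta`, `v` and `is_transient`) and both function names are assumptions made for illustration; they are not Banana's schema or API. The plotting function only calls standard matplotlib `Axes` methods on the axes it is given.

```python
def split_by_transient(sources):
    """Separate source records into (ordinary sources, identified transients)."""
    field = [s for s in sources if not s["is_transient"]]
    transients = [s for s in sources if s["is_transient"]]
    return field, transients


def plot_variability_vs_flux(sources, param, ax):
    """Scatter one variability parameter ("eta" or "v") against integrated flux.

    `ax` is a matplotlib Axes. The keys used on each record are hypothetical.
    """
    field, transients = split_by_transient(sources)
    ax.scatter([s["flux"] for s in field], [s[param] for s in field],
               marker="o", color="tab:blue", label="sources")
    ax.scatter([s["flux"] for s in transients], [s[param] for s in transients],
               marker="*", color="tab:red", label="identified transients")
    # Both quantities typically span orders of magnitude, so use log axes.
    ax.set_xscale("log")
    ax.set_yscale("log")
    ax.set_xlabel("Integrated flux")
    ax.set_ylabel(param)
    ax.legend()
```

In a notebook this could be driven with `fig, axes = plt.subplots(1, 2)` followed by one call per parameter, e.g. `plot_variability_vs_flux(sources, "eta", axes[0])`. Per-band plots would repeat the same call on records filtered by observing band.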

I'm a little out of the loop with how work is progressing on Banana -- I'll try to get up to speed next week. But as a general principle, can I suggest that a really awesome way to start would be to have a brainstorming session where you (= developers + users, so at least Gijs + Antonia) sit down together with a blank sheet of paper (or whiteboard) and sketch out an overview of the design of the site and what information it will provide.

I'm not suggesting you need to write a massive design document (heaven forbid!), but I think having a collection of user stories to inform the design would definitely be helpful in avoiding some of the problems with the current site.

Thanks for the reminder about this issue! I think it would be really great if we could adapt the diagnostic plots from my paper into TraP, as that would tie together the methods I've developed for TraP and Banana. If everyone is happy with this, it also has the added advantage that the plot design and code are already there; they would just need adapting for Banana. It would also be awesome to make them interactive with the highcharts options (i.e. selection of frequencies).

original issue:

https://support.astron.nl/lofar_issuetracker/issues/3765

gijzelaerr commented 8 years ago

I think the time is right to let @AntoniaR build the plot herself using an IPython notebook, SQLAlchemy and matplotlib or bokeh.

gijzelaerr commented 8 years ago

@AntoniaR ever had a look at this?

AntoniaR commented 8 years ago

Yes, there are 2 plots which I make that would be very useful for Banana. This is on a list of additions for TraP/Banana which is in draft form (we had the first discussion a couple of weeks ago when you missed the pipeline meeting). So this is still very much on the to-do list.

AntoniaR commented 8 years ago

So the two plots I am thinking of for Banana are relatively easy to make with the numbers that are in the database. Attached are examples from the RSM dataset.

rsm_scatter_hist (attached image): This is simply a scatter plot of the two variability parameters with 1D histograms of each. An additional bonus would be the ability to draw lines at e.g. +2 or +3 sigma from the mean (user-defined would be awesome), using the Gaussian fits in the top plot, and to state what the threshold is.
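The sigma-threshold idea can be sketched as below. This is an illustration under stated assumptions, not the code in plotting_tools.py: it assumes the Gaussian describes the distribution of log10 of the parameter, and it substitutes the sample mean and standard deviation for a proper fit to the histogram. The function name `sigma_threshold` is hypothetical.

```python
import math
import statistics


def sigma_threshold(values, n_sigma):
    """Return the parameter value n_sigma standard deviations above the mean.

    Assumption: the distribution is treated as Gaussian in log10 space, and the
    sample mean and standard deviation stand in for a fitted Gaussian.
    Non-positive values are skipped because they have no logarithm.
    """
    logs = [math.log10(v) for v in values if v > 0]
    mu = statistics.mean(logs)
    sigma = statistics.stdev(logs)  # needs at least two positive values
    return 10 ** (mu + n_sigma * sigma)
```

The returned value could be drawn on the scatter plot with `ax.axvline(...)` or `ax.axhline(...)` and quoted in the legend, with `n_sigma` supplied by the user.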

rsm_diagnostic_plots (attached image): This plot is simply a scatter plot of the two variability parameters against the average flux of the source and the maximum flux that the source attains in its lightcurve.
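As a rough sketch of the data behind this four-panel plot: if the average and maximum flux are not already available, they could be derived from each source's lightcurve and paired with the two variability parameters. The record layout (`lightcurve`, `eta`, `v`) and the function names are assumptions for illustration only.

```python
def flux_summary(lightcurve):
    """Return (average flux, maximum flux) for one source's flux measurements."""
    return sum(lightcurve) / len(lightcurve), max(lightcurve)


def diagnostic_panels(sources):
    """Build the x/y series for the four panels of the diagnostic plot.

    Each key is (x quantity, y quantity); each value is (x values, y values).
    The record keys used here are hypothetical.
    """
    summaries = [flux_summary(s["lightcurve"]) for s in sources]
    avg = [a for a, _ in summaries]
    peak = [m for _, m in summaries]
    eta = [s["eta"] for s in sources]
    v = [s["v"] for s in sources]
    return {
        ("avg_flux", "eta"): (avg, eta),
        ("max_flux", "eta"): (peak, eta),
        ("avg_flux", "v"): (avg, v),
        ("max_flux", "v"): (peak, v),
    }
```

Each entry maps directly onto one scatter call, so the four panels can be filled by looping over the dictionary and a 2x2 grid of axes.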

The code that I have written to produce these plots is part of a larger set of scripts, but the plotting tools specifically are here: https://github.com/AntoniaR/TraP_trans_tools/blob/master/plotting_tools.py

We only need the "else" part of the if statement here, and in subsequent places, because this code does a few other things which we don't need in Banana: https://github.com/AntoniaR/TraP_trans_tools/blob/master/plotting_tools.py#L26

I grab the data from the database and put it into the right format using the script here https://github.com/AntoniaR/TraP_trans_tools/blob/master/process_TraP.py But I'm sure all of this can be done in a much tidier fashion...

gijzelaerr commented 8 years ago

Nice, thanks. Having a reference implementation makes this much easier to implement; don't worry about the tidiness.

gijzelaerr commented 8 years ago

Which function actually creates the plot? You say we only need the "else" part of line 26, which is the create_scatter_hist() function. But when I look at your plot in this issue I see, for example, 'Max Flux' as a label, which is defined in create_diagnostic(). Also, can you describe what the trans_data and data structures look like? I've tried to backtrack what the data looks like, but it is not trivial, since you do manual manipulations on the queried data, write it to disk and read it back in another part of your program.

AntoniaR commented 8 years ago

The 2 plots are made with 2 separate functions in plotting_tools.py:

create_scatter_hist() creates the top plot with v_nu and eta_nu and the two histograms fitted with Gaussians.
create_diagnostic() makes the second scatter plot with the four planes

Again, this is pre-SQLAlchemy and uses some code I was given, so it is not the most efficient way of doing things. Also, some of the writing to disk is not essential to the code: it is simply there to speed things up when I wanted to replot the graphs after changing the plotting code, as the queries and reformatting took time.

It'll take me a bit more time to describe the data structures as I want to improve the comments in the code. I will try to upload the improved comments in the next few days.

transientskp / banana

variability v. source flux plots #46