tholoien closed 8 years ago
Demo notebook has been updated substantially. Please take a look and make edits as you see fit.
Two notes:
Tom, this is great! You are generating supernovae at realistic positions with realistic SALT2 parameters :-) @rbiswas4 will be delighted! I see that only light editing of the text is required; I'll do that.
This is awesome!
I tried running it but got stuck at the corners ... probably some kind of version incompatibility that I have not sorted out. You should add your XDGMM package as a requirement somewhere.
This is also timely: aside from trying to use this in SN simulations at a catalog level, we would also like to use this for images. And as @drphilmarshall might have told you, we are at a point where we are discussing plans for a new Twinkles simulation. Do you think it would be worth discussing (a) whether we could use empiriciSN for that purpose and (b) how to make sure we have all the 'requirements' covered? Maybe over a telecon at some point?
Here are a few questions/comments:
Thanks Rahul! I'd better leave the technical questions to @tholoien. I think getting some empiriciSN into Twinkles is an excellent idea - let's do it! I'm sure Tom would love to help out if we get stuck (as we almost certainly will :-).
Hi guys, sorry for the slow response, been traveling. Thanks for the detailed comments Rahul. My responses to your questions are below:
I would be happy to discuss incorporating empiriciSN into the next Twinkles simulation via Skype or phone at some point. I am going to be fairly busy in the coming weeks catching up on some things related to my thesis that have been waiting over the summer, and I am going to be applying for jobs in the Fall, so I think I would prefer to keep my involvement to a minimum, but I am definitely willing to work with you guys to make this work for you---that was one of our primary goals in making it! I think probably the best solution is going to be discussing what exactly you would need the tool to produce for you, and then we can tweak things as necessary to make it work.
The sample we trained the model on has ~1400 SNe ranging out to a bit beyond redshifts of 1 (from SNLS and SDSS), and there is definitely a correlation between redshift and the x0 parameter. I trained on x0 directly by design, as hosts with different redshifts definitely do give different x0 values. My thinking on this was that we want to be able to get x0, x1, and c given the host parameters (redshift, separation, color, and local surface brightness), and redshift allows the model to narrow down the acceptable x0 values by quite a bit. We could certainly try training it using a different quantity though.
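To illustrate how a known redshift can narrow down the acceptable x0 values, here is a minimal sketch of conditioning a single multivariate Gaussian on one of its dimensions. It does not use the XDGMM package; the function name `condition_gaussian` and the toy numbers are hypothetical, and a mixture model would apply the same formulas to each component and reweight the components.

```python
import numpy as np

def condition_gaussian(mean, cov, known_idx, known_values):
    """Condition a multivariate Gaussian on some of its dimensions.

    Returns the mean and covariance of the remaining (unknown) dimensions.
    """
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    known_idx = np.asarray(known_idx)
    unknown_idx = np.setdiff1d(np.arange(mean.size), known_idx)

    # Partition the covariance into unknown (u) and known (k) blocks.
    cov_uu = cov[np.ix_(unknown_idx, unknown_idx)]
    cov_uk = cov[np.ix_(unknown_idx, known_idx)]
    cov_kk = cov[np.ix_(known_idx, known_idx)]

    # Standard Gaussian conditioning formulas.
    gain = cov_uk @ np.linalg.inv(cov_kk)
    residual = np.asarray(known_values, dtype=float) - mean[known_idx]
    cond_mean = mean[unknown_idx] + gain @ residual
    cond_cov = cov_uu - gain @ cov_uk.T
    return cond_mean, cond_cov

# Toy example with made-up numbers: dimensions are (redshift, log10 x0).
toy_mean = [0.5, -4.5]
toy_cov = [[0.09, -0.12], [-0.12, 0.25]]
cond_mean, cond_cov = condition_gaussian(
    toy_mean, toy_cov, known_idx=[0], known_values=[0.8]
)
```

In this toy case the variance of log10 x0 shrinks once the redshift is fixed, which is the narrowing effect described above.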
I am enthusiastic about getting SN properties from this method. But the question is whether to train on intrinsic properties of the SN, or on observed properties that reflect the intrinsic properties plus cosmology, which effectively gives you the much harder problem of learning both the distribution of intrinsic properties and the cosmology. Providing a cosmology will bias your results (if the cosmology is wrong), but in what we will be using this for (i.e., simulation), everything will be wrong if the cosmology does not make sense! So I am not too worried about the possibility of bias. I would worry about whether the method would reliably explore the distributions without this additional prior.
But, we could test out these statements: I think what we would need to have is a sample of test galaxies spanning a redshift range of 0-1.2 (say).
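To make the intrinsic-versus-observed distinction concrete, here is a rough sketch of how an assumed cosmology could be used to turn x0 into a redshift-independent quantity before training. Everything here is an assumption for illustration: the flat Lambda-CDM parameters, the function names, and the zero point in m_B = -2.5 log10(x0) + zero_point (10.635 is a commonly quoted value, but it depends on the x0 convention in use).

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s

def distance_modulus(z, h0=70.0, omega_m=0.3, n_steps=2000):
    """Distance modulus for a flat Lambda-CDM cosmology (assumed parameters)."""
    zs = np.linspace(0.0, z, n_steps)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zs) ** 3 + (1.0 - omega_m))
    # Trapezoidal integral of 1/E(z): comoving distance in units of c/H0.
    integral = np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zs))
    d_lum_mpc = (1.0 + z) * (C_KM_S / h0) * integral
    return 5.0 * np.log10(d_lum_mpc) + 25.0

def x0_to_absolute_mag(x0, z, zero_point=10.635):
    """Convert x0 to a peak absolute magnitude; zero_point is an assumption."""
    m_b = -2.5 * np.log10(x0) + zero_point
    return m_b - distance_modulus(z)

# Hypothetical supernova: x0 = 1e-4 at z = 0.3.
example_abs_mag = x0_to_absolute_mag(1e-4, 0.3)
```

Training on a quantity like this would remove most of the redshift dependence from x0, at the cost of building the assumed cosmology into the model.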
The SDSS host photometry, which is what I used for all the host information, provides only information for an exponential or de Vaucouleurs profile fit to the host photometry, and this is what we used to train the model. So all the surface brightnesses and radii come from those fits. The SDSS profile fits do give B/A ratios and position angles (rotation in the plane of the sky), so we could try to incorporate those into selecting a proper position. I'm not sure what the best way to do this would be...I don't think we want to include those quantities in the model fit necessarily, but perhaps we could do something like select a radius in the way we currently do it, and then use the angle and axis ratio to somehow select an actual position in the host. I'll have to think on that for a bit.
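One possible way to turn a selected radius plus the axis ratio and position angle into an actual position is sketched below. This is only an illustration of the idea, not something implemented in empiriciSN: the function name, the uniform draw in azimuth, and the angle convention (counter-clockwise from the +x axis) are all assumptions.

```python
import numpy as np

def offset_in_host(radius, axis_ratio, position_angle_deg, rng):
    """Draw an offset on the ellipse whose semi-major axis equals `radius`."""
    # Azimuth within the host frame, drawn uniformly (an assumed choice).
    phi = rng.uniform(0.0, 2.0 * np.pi)
    # Point on an ellipse aligned with the host major axis.
    x_host = radius * np.cos(phi)
    y_host = radius * axis_ratio * np.sin(phi)
    # Rotate into the sky frame by the position angle
    # (counter-clockwise from +x; an assumed convention).
    theta = np.deg2rad(position_angle_deg)
    dx = x_host * np.cos(theta) - y_host * np.sin(theta)
    dy = x_host * np.sin(theta) + y_host * np.cos(theta)
    return dx, dy

# Hypothetical host: radius 2.0 (arbitrary units), B/A = 0.6, angle 35 degrees.
rng = np.random.default_rng(42)
dx, dy = offset_in_host(2.0, 0.6, 35.0, rng)
```

Note that a uniform draw in `phi` is not uniform in arc length along the ellipse, so a different azimuthal distribution may be preferable.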
That thinking sounds good to me.
For how sensitive the model is to changes in the SALT parameters, I'm not sure, since I've only ever trained it on the existing ones. It would be very easy to swap out the Sullivan et al. ones with the JLA ones and redo the fit to compare, so if you can point me to a good place to get those, I can do that. Would there be SALT parameters for both the SDSS and SNLS samples?
Yes, it should be easy. This is the set I would recommend: http://cdsarc.u-strasbg.fr/vizier/ftp/cats/J/A+A/568/A22/tablef3.dat This has parameters for all the SNLS supernovae used in cosmology fits which seems to be what you were using. It has parameters for SDSS SNIA supernovae used in JLA (a total of ~500 if I recall correctly), but not the ~1400 SN that you have (Are those photometrically identified?).
Note: Another thing is that SNLS and SDSS often use different conventions for x0. An easy way to check would be to see whether the x0 values of the SDSS supernovae in the above link (what I would call the SNLS/SALT convention) are systematically different, by a constant factor, from the ones you were using for SDSS. You should then change the x0 values of all SDSS supernovae to account for this convention difference.
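The check described above could be sketched as follows, assuming the two x0 lists have already been matched supernova by supernova. The function name and the scatter threshold are hypothetical; the idea is that a pure convention difference shows up as a constant ratio with very little scatter.

```python
import numpy as np

def x0_convention_factor(x0_reference, x0_other, max_scatter_dex=0.05):
    """Estimate a constant factor between two x0 conventions for matched SNe.

    Returns (factor, scatter_dex, is_consistent), where `factor` multiplies
    x0_other to bring it onto the x0_reference convention.
    """
    x0_reference = np.asarray(x0_reference, dtype=float)
    x0_other = np.asarray(x0_other, dtype=float)
    log_ratio = np.log10(x0_reference / x0_other)
    median = np.median(log_ratio)
    # Robust scatter estimate from the median absolute deviation.
    scatter = 1.4826 * np.median(np.abs(log_ratio - median))
    return 10.0 ** median, scatter, scatter < max_scatter_dex
```

If `is_consistent` comes back true, multiplying the second set of x0 values by `factor` puts both samples on the same convention; a large scatter would suggest the difference is not just a convention.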
I could definitely produce plots showing the distributions of any of the host or SN parameters used in the fit with respect to each other. If you take a look at the PlotCorr notebook in the repo, that contains plots of the SN parameters vs. all the host properties we used to train the model, taken directly from the data. It would be easy to sample a few thousand data points from the trained model and plot those results too, if that's more what you're looking for. In theory, the XDGMM model should recover the underlying "true" distribution from the noisy data used in the fit, so it could give a better sense of what the actual distribution looks like.
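As a sketch of what sampling a few thousand points from a trained model involves, the snippet below draws from a Gaussian mixture with plain NumPy. It does not use the XDGMM class itself; `sample_mixture` and the toy weights, means, and covariances are hypothetical stand-ins for a fitted model.

```python
import numpy as np

def sample_mixture(weights, means, covariances, size, rng):
    """Draw `size` samples from a Gaussian mixture model."""
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covariances = np.asarray(covariances, dtype=float)
    # Choose a component for every draw, then sample within that component.
    component = rng.choice(weights.size, size=size, p=weights / weights.sum())
    samples = np.empty((size, means.shape[1]))
    for k in range(weights.size):
        mask = component == k
        samples[mask] = rng.multivariate_normal(
            means[k], covariances[k], size=mask.sum()
        )
    return samples

# Toy two-component model in two dimensions (made-up numbers).
rng = np.random.default_rng(7)
toy_samples = sample_mixture(
    weights=[0.7, 0.3],
    means=[[0.0, 0.0], [1.0, -0.5]],
    covariances=[np.eye(2) * 0.5, np.eye(2) * 0.2],
    size=3000,
    rng=rng,
)
```

The resulting array can be plotted column against column in the same way as the data in the PlotCorr notebook, which makes the model and the training data easy to compare side by side.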
I have looked at that notebook, and I think the essential additions that I am interested in are:
I would be happy to discuss incorporating empiriciSN into the next Twinkles simulation via Skype or phone at some point. I am going to be fairly busy in the coming weeks catching up on some things related to my thesis that have been waiting over the summer, and I am going to be applying for jobs in the Fall, so I think I would prefer to keep my involvement to a minimum, but I am definitely willing to work with you guys to make this work for you---that was one of our primary goals in making it! I think probably the best solution is going to be discussing what exactly you would need the tool to produce for you, and then we can tweak things as necessary to make it work.
OK. Maybe @drphilmarshall and I should try settling on what we want and give your model a shot and get back to you when we are stuck (Of course we will keep you informed of our attempts!)
Hi Rahul (and Phil),
I apologize for being slow to respond; I've had some family matters come up in the last week and had to travel unexpectedly.
I have been making some edits to finalize the XDGMM class for a paper we are writing up on it, and I want to do the same for EmpiriciSN now. I know the next Twinkles simulation is happening, so I wanted to get in touch to see if we can make this work and get it incorporated before it's too late. (If possible.)
In response to your message, is there a column description for the table you linked to of SALT2 parameters? I tried digging around on Vizier but couldn't find it. I am using the SDSS supernovae classified as "SNIa" (spectroscopically confirmed) or "zSNIa" (photometrically identified with a host redshift). I am hesitant to change the SALT parameters if it means drastically reducing the size of the dataset, since it's already a little on the small side, and I want the SALT parameters all coming from the same source for consistency, so my inclination would be to not change them if I can't find them for the whole sample.
For the x0 parameter, the SDSS ones come from the Sako et al. dataset, while I actually calculated the SNLS ones myself. The SNLS source I used provided x1, c, a redshift, and a peak rest-frame B-band magnitude, so I used SNCosmo to calculate the x0 parameter from those.
Anyway, my question now is what are the key things you would need changed/updated to incorporate EmpiriciSN into Twinkles at this point? (e.g., what needs to be done now vs. what tests/etc. would need to be run at some point, but aren't needed right now?) Depending on how much needs to be done, maybe it won't be possible to incorporate it this time, but I would like to get EmpiriciSN into a somewhat final "production state" so that I can write about it in our paper. That basically means I want to have all the necessary functions (fitting a model, choosing a radius, sampling SN parameters) working; the datasets can always be changed later.
@tholoien
Thanks for getting in touch ... everyone is busy with things that have to be taken care of, so that is perfectly understandable.
Let us split this issue into three threads to keep track of it.
I will start the Twinkles thread, and we can start discussing the other ones.
Add a demo notebook to demonstrate the capabilities of the class.