Closed drphilmarshall closed 6 years ago
I've run the notebook and all looks great! As I mentioned in a comment, I think `make_training_set` may need a helper function (called something like `_preprocess_for_training`) that gets rid of null values, etc.
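For illustration, here is a minimal sketch of what such a helper might look like, assuming `X` and `y` are pandas DataFrames that share an index. Only the name `_preprocess_for_training` comes from the comment above; the body, the toy column names, and the choice to treat infinities as missing are all assumptions, not the project's actual implementation.

```python
import numpy as np
import pandas as pd


def _preprocess_for_training(X, y):
    """Drop rows with null or non-finite values in either X or y.

    Hypothetical helper: the name is taken from the discussion, but the
    body is only one guess at what "gets rid of null values" could mean.
    """
    # Treat +/-inf as missing so a single mask catches both cases.
    X_clean = X.replace([np.inf, -np.inf], np.nan)
    y_clean = y.replace([np.inf, -np.inf], np.nan)
    # Keep only rows that are complete in both features and responses.
    keep = X_clean.notna().all(axis=1) & y_clean.notna().all(axis=1)
    return X_clean.loc[keep], y_clean.loc[keep]


# Toy data with made-up column names, just to exercise the helper.
X = pd.DataFrame({"mag_r": [21.0, np.nan, 23.5, 22.1],
                  "psf_fwhm": [0.7, 0.8, np.inf, 0.9]})
y = pd.DataFrame({"fracDev": [0.2, 0.5, 0.1, np.nan]})
X_train, y_train = _preprocess_for_training(X, y)
```

Filtering `X` and `y` with the same mask keeps their rows aligned, which matters if the IDs are carried around separately and joined back later.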
Also, since this is one "chunk" of the sky, eventually we'll need to wrap `make_training_set` in another function or loop that points at a list of regions, right? We may need to parallelize this (if we want a few million examples) but this should be straightforward!
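A rough sketch of that wrapping, using only the standard library. Everything here is hypothetical: `make_training_set_for_region` is a stand-in that returns toy rows rather than querying a catalog, and the `(tract, patch)` pairs are made up. A thread pool suits I/O-bound catalog reads; for CPU-bound work a process pool might be the better choice.

```python
from concurrent.futures import ThreadPoolExecutor


def make_training_set_for_region(region):
    """Stand-in for a per-region call to make_training_set (hypothetical).

    Returns toy (X_rows, y_rows) lists so the sketch runs without a catalog.
    """
    tract, patch = region
    X_rows = [[tract, patch, i] for i in range(3)]
    y_rows = [[tract + patch + i] for i in range(3)]
    return X_rows, y_rows


def make_training_set_for_regions(regions, max_workers=4):
    """Build per-region training sets concurrently and concatenate them."""
    X_all, y_all = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `regions`.
        for X_rows, y_rows in pool.map(make_training_set_for_region, regions):
            X_all.extend(X_rows)
            y_all.extend(y_rows)
    return X_all, y_all


regions = [(4850, 0), (4850, 1), (4851, 0)]  # made-up (tract, patch) pairs
X_all, y_all = make_training_set_for_regions(regions)
```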
Out of curiosity, would you want to keep working with a single-tract CoAdd? I'm wondering how much variability in optical PSF there is across tracts in `minion_1016`, if any. Will this tract be representative of other tracts?
Thanks @jiwoncpark !
> Maybe `object_id` shouldn't be in `X`. You'll need it for joining with the extragalactic table though. I'll issue a helper function we might need to preprocess the `X` and `Y` further for training, and the `object_id` deletion can go in there if you'd like!
That sounds good: `X` and `y` should be training-ready, I think - so I guess we just need to carry the IDs around as well, but as separate dataframes. I'm not sure we need another function, but let's see! Thanks for issuing that separately in #5.
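One way to carry the IDs separately can be sketched as follows, assuming everything is a pandas DataFrame sliced from the same catalog so the row index stays shared. The catalog, the `extragalactic` table, and every column name except `object_id` are toy values invented for this example.

```python
import pandas as pd

# Toy catalog; every column name except object_id is made up.
catalog = pd.DataFrame({
    "object_id": [101, 102, 103],
    "mag_r": [21.0, 22.5, 23.1],
    "fracDev": [0.2, 0.6, 0.1],
})

# Slice three frames that share the catalog's row index.
ids = catalog[["object_id"]]  # kept aside, never fed to the model
X = catalog[["mag_r"]]        # design matrix without the ID column
y = catalog[["fracDev"]]      # response variable(s)

# Toy stand-in for the extragalactic table, deliberately in another order.
extragalactic = pd.DataFrame({
    "object_id": [103, 101, 102],
    "redshift": [0.9, 0.3, 0.5],
})

# Re-attach the IDs by index, then join to the other table on object_id.
joined = ids.join(X).merge(extragalactic, on="object_id", how="left")
```

Because `ids`, `X`, and `y` share an index, any row filtering applied to `X` and `y` can be applied to `ids` with the same mask before joining.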
> Separately, is `<band>_modelfit_CModel_fracDev` the flux ratio of bulge to total?
Yes, I think so. I was just looking for some familiar properties that might be predictable by an ANN.
> Also, since this is one "chunk" of the sky, eventually we'll need to wrap `make_training_set` in another function or loop that points at a list of regions, right? We may need to parallelize this (if we want a few million examples) but this should be straightforward!
Yes - I had not got that far... I was assuming we'd be able to refactor and it'd be easy :-)
> Out of curiosity, would you want to keep working with a single-tract CoAdd? I'm wondering how much variability in optical PSF there is across tracts in `minion_1016`, if any. Will this tract be representative of other tracts?
Oh! Yes, I was thinking to try working with the whole catalog, not just a single tract - that's why I started `%%time`-ing things, in fact... I'll try this.
Thanks very much, @jiwoncpark ! Next up: training a model :-)
Here's a first go at a `derp.Emulator` class, and a notebook that derives and demos its `make_training_set` method. Not much to look at, but just following @danielsf and @yymao's DC2 tutorials gets us a useful-looking design matrix `X` and corresponding (multivariate) response variables `y`. Can you take a quick look at this please @jiwoncpark, and let me know if I have made the right things for the `pytorch` models to ingest? All comments welcome - it's a bit rough round the edges, but then it is sprint week.

This closes #2, when merged.