Melanie Beck (@melaniebeck) at the University of Minnesota has been thinking, together with Lucy Fortson (@lfortson), Kyle Willett (@willettk) and Hugh Dickinson (@hughdickinson), about applying the SWAP code to analyze Galaxy Zoo classifications. Their idea is to have future instances of the GZ project just ask the volunteers a sequence of yes/no questions (e.g. "Is this galaxy smooth or not?", "Does this galaxy have a bulge or not?"), and so they are working on extending the SWAP code to deal with this multi-question scenario. Another of their ideas is to train various machine learning routines to emulate human classifiers, by providing them with small training sets and the Zoo-generated labels.
Melanie forked the repo back in October, and has started experimenting with these things, which is great - especially as we have also started thinking about how to work with both crowds of volunteers and ML algorithms at the same time. Our goal should be to work together to improve the SWAP code so that it works on both "GZ Express" (this proposed future GZ binary classification project) and Space Warps DES.
There are a number of things we need to do!
[ ] Discuss the GZ and SW applications together. We can use this thread for that conversation - all thoughts welcome :-) One question for us could be, do we want to spin out SWAP into its own repo, separate from the SW and GZ papers, for example? This repo might get a bit cumbersome. You can browse Melanie's fork to get a sense of what changes she has made by reading her README file here.
[ ] @melaniebeck is 40 commits ahead, so we need to work to get her code merged back into the base repo. There are a few things that we need to do before this happens:
    [ ] SWAP must still work on the Space Warps dataset. This means that we need to enable some tests that run on a small test SW MongoDB database (downloadable from somewhere, to enable auto-testing with Travis CI) and that can be used to make sure that the code is not broken.
    [ ] The config and Python files must also be set up with appropriate options etc. so that the SW analysis still works.
    [ ] SWAP must also work on a small test GZ Express database (also downloadable from somewhere, for Travis CI) #229
    [ ] The config and Python files must also be set up with appropriate options etc. so that the GZX analysis works.
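To make the testing items above a bit more concrete, here is a rough sketch of what a two-project smoke test could look like. Everything in it is an assumption for illustration: the config keys, the database names, and the `run_analysis` callable are hypothetical placeholders, not part of the existing SWAP code, and the real entry point and options would need to be substituted in.

```python
# Hypothetical sketch only: none of these names come from the SWAP code base.
from typing import Callable, Dict, List

# One entry per project. Keys and values are illustrative assumptions.
TEST_CONFIGS: Dict[str, Dict[str, str]] = {
    "spacewarps": {"survey": "SW", "database": "test_sw_db"},
    "gz_express": {"survey": "GZX", "database": "test_gzx_db"},
}

# Options every test config is assumed to need.
REQUIRED_KEYS = ("survey", "database")


def check_config(config: Dict[str, str]) -> List[str]:
    """Return the required options that are missing from a config."""
    return [key for key in REQUIRED_KEYS if key not in config]


def run_smoke_tests(run_analysis: Callable[[Dict[str, str]], None]) -> Dict[str, bool]:
    """Run the analysis once per project and record whether it succeeded.

    `run_analysis` stands in for whatever entry point SWAP ends up
    exposing; it is expected to raise an exception on failure.
    """
    results: Dict[str, bool] = {}
    for name, config in TEST_CONFIGS.items():
        if check_config(config):
            # Incomplete config: count as a failure without running anything.
            results[name] = False
            continue
        try:
            run_analysis(config)
            results[name] = True
        except Exception:
            results[name] = False
    return results


def fake_analysis(config: Dict[str, str]) -> None:
    """Placeholder analysis that only accepts test databases."""
    if not config["database"].startswith("test_"):
        raise ValueError("refusing to run on a non-test database")


if __name__ == "__main__":
    for project, passed in run_smoke_tests(fake_analysis).items():
        print(project, "ok" if passed else "FAILED")
```

The point of keeping both projects in one table is that a change which breaks either SW or GZX shows up in the same CI run.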
We can make further issues based on this list, and then link them back here. Thanks for all your efforts so far, Melanie! It'll be great to all be pushing the same code forward for both projects :-)
cc @aprajita @anupreeta27 @cpadavis