cmateu / galstreams

Milky Way Streams Footprint Library and Toolkit for Python
BSD 3-Clause "New" or "Revised" License

[WIP] Use astropy and restructure #18

Closed adrn closed 3 years ago

adrn commented 5 years ago

Hey @cmateu! This is nowhere near done, but I wanted to open this to show that I have at least been working on some aspects of an update. But before I do more (since this is already a pretty big change to the code!), I wanted to bring up a few ideas I had to hear your thoughts.

I started thinking about what information I want for a given stream, and decided that what I really want for each stream is (1) a mean track (along the sky, possibly with width, and possibly also with distance, proper motion, RV information when that exists), (2) a heliocentric sky footprint, and (3) a great circle coordinate frame that tries to put the track at lat=0. So I'm now thinking that instead of providing footprints based on great circles or lon/lat ranges (lesson learned from the old Pal 5 and GD-1 tracks?), perhaps we could go back to each stream and produce an approximate sky track and associated footprint. Then, for each stream, if it has a defined coordinate system already (e.g., GD-1), we use that, otherwise we determine a GC frame based on the track.

So, with those ideas in mind, my plan was to:

  1. Restructure the (Stream)Footprint class to support doing operations with the footprint (like, testing whether coordinates lie within a stream footprint, etc.).
  2. Go stream-by-stream and extract tracks from papers by either using provided tracks or doing it by eye from plots. As a test, I did this for Phoenix and Carl's 2009 streams (Acheron, Lethe, ...): for Phoenix (top plot), the orange points are the reported over-densities, and the blue curve is a polynomial fit to those over-densities. For Carl's streams (bottom plot), I picked out a few locations along the streams and fit a polynomial track. [Two images: Phoenix over-densities with polynomial fit (top); polynomial tracks for the 2009 streams (bottom)]
  3. Change the data files so that each stream has a track instead of a specification of a footprint or coordinate frame.
  4. Produce a master table with metadata for each stream, e.g., mean distance, metallicity, age, etc. (maybe you have this already but I started a google sheet in case not).
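
As a rough illustration of items 1 and 2 above, here is a minimal sketch of a footprint built from a polynomial track, with a method that tests whether coordinates fall inside it. The class name `PolynomialTrackFootprint`, its parameters, and the sample points are all hypothetical and are not part of galstreams. It also makes two simplifying assumptions: the offset from the track is measured in Dec at fixed RA (not as a true angular distance), and RA wrap-around at 0/360 deg is ignored.

```python
import numpy as np


class PolynomialTrackFootprint:
    """Hypothetical footprint: a polynomial track dec(ra) plus a fixed half-width."""

    def __init__(self, ra_deg, dec_deg, degree=2, half_width_deg=1.0):
        ra_deg = np.asarray(ra_deg, dtype=float)
        dec_deg = np.asarray(dec_deg, dtype=float)
        # Least-squares polynomial fit to the points picked along the stream.
        self.coeffs = np.polyfit(ra_deg, dec_deg, degree)
        # Only trust the track inside the RA range covered by the points.
        self.ra_min, self.ra_max = ra_deg.min(), ra_deg.max()
        self.half_width_deg = half_width_deg

    def track(self, ra_deg):
        """Evaluate the fitted track dec(ra) in degrees."""
        return np.polyval(self.coeffs, ra_deg)

    def contains(self, ra_deg, dec_deg):
        """Boolean mask: True where (ra, dec) lies within the footprint."""
        ra_deg = np.asarray(ra_deg, dtype=float)
        dec_deg = np.asarray(dec_deg, dtype=float)
        in_range = (ra_deg >= self.ra_min) & (ra_deg <= self.ra_max)
        near_track = np.abs(dec_deg - self.track(ra_deg)) <= self.half_width_deg
        return in_range & near_track


# Made-up points lying on dec = ra**2, purely for illustration.
fp = PolynomialTrackFootprint([0, 1, 2, 3, 4], [0, 1, 4, 9, 16],
                              degree=2, half_width_deg=1.0)
# First point is 0.25 deg from the track, second is 1.75 deg away,
# third is outside the fitted RA range: expect True, False, False.
print(fp.contains([2.5, 2.5, 5.0], [6.5, 8.0, 25.0]))
```

A real implementation would more likely take a `SkyCoord` as input and measure the offset in the stream's own coordinate frame, where the track sits near lat=0.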

From the above, we could easily generate the data files you like to have for topcat, so I think it will be "backwards-compatible" in that sense, even if the code structure changes.

Thoughts? This feels like a lot of work, but it is what I've always wanted. Let me know also if you disagree on these ideas :)

cmateu commented 5 years ago

Hey @adrn ,

Wow, this is beyond great! All right, yes, it’s a major overhaul, but bring it on!!! I’m on board :)

So, for the wish list (1) to (3), agreed. This is more or less what I had in mind. It would be great if we could do this in a structure that would support adding extra along-track information, such as [Fe/H] and age, as it becomes available. Points 1-4 of the to-do list, agreed. For 6 (Produce a master table with metadata for each stream), yes, I have this internally stored in the library, but I’ll have to go back to the papers to get age and metallicity. I can put this info in the GoogleSheet. For 5 (Change the data files so that each stream has a track instead of a specification of a footprint or coordinate frame), the data files are an output of the library, not an input. They’re created after the streams are defined.

Right now the footprints are defined either as GC-based (pole or end-points, 48 streams), Lon-lat ranges (9, mostly clouds), or by a track or fake-star list (23 streams). In the last case, for most streams I already have the information for the track in my notes. So that’ll save us some work. Also, many of the ones defined by poles or end-points, like Jet, are done so because that’s what the authors report.
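
For reference, the pole and end-point definitions are two views of the same great circle: the pole is the normalized cross product of the two end-point directions, and a point's latitude in the stream frame is its signed angular distance from that circle. The sketch below uses only the standard library; the function names are hypothetical and this is not the code galstreams uses.

```python
import math


def unit_vector(ra_deg, dec_deg):
    """Cartesian unit vector for a sky position given in degrees."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra),
            math.cos(dec) * math.sin(ra),
            math.sin(dec))


def pole_from_end_points(end1, end2):
    """Unit pole of the great circle through two (ra, dec) end-points in degrees."""
    a, b = unit_vector(*end1), unit_vector(*end2)
    cross = (a[1] * b[2] - a[2] * b[1],
             a[2] * b[0] - a[0] * b[2],
             a[0] * b[1] - a[1] * b[0])
    norm = math.sqrt(sum(c * c for c in cross))
    if norm < 1e-12:
        raise ValueError("end-points are identical or antipodal; pole is undefined")
    return tuple(c / norm for c in cross)


def latitude_from_great_circle(ra_deg, dec_deg, pole):
    """Signed angular distance (deg) of a point from the great circle with this pole."""
    p = unit_vector(ra_deg, dec_deg)
    dot = sum(pi * qi for pi, qi in zip(p, pole))
    # Clamp against round-off before taking the arcsine.
    return math.degrees(math.asin(max(-1.0, min(1.0, dot))))


# Two end-points on the celestial equator give the north celestial pole,
# so the "stream latitude" of any point is simply its declination.
pole = pole_from_end_points((0.0, 0.0), (90.0, 0.0))
print(latitude_from_great_circle(45.0, 10.0, pole))  # approximately 10.0
```

The sign of the pole depends on the order of the end-points, so swapping them flips the sign of the latitude.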

The reason I did it like this was so that it would be easy to include new streams as they were published, without touching the library’s code, and so that minimal (or no) computations were needed before entering them in the library (of which I keep a record in a python notebook). A user could also initialize their own streams with any of these methods and add them to a MWStreams library object in their own python code.

About how to “realize” the track, what do you have in mind? Two options seem useful: a set of points and something that could be interpolated, like the polynomial fit. The first, because it is straightforward. The second is non-trivial, since we’ll have to be able to define which coordinate is the independent variable, whether it’s DEC(RA) or RA(DEC) or alternatively l(b), but I think this can be “easily” managed internally with the input being a SkyCoord object. The only caveat I find for this is that, for now, I don’t see how we could handle something like Sgr with multiple wraps, because the track will be naturally multi-valued.
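
To make the multi-valued issue concrete: one generic workaround, offered here only as a sketch and not as a method proposed in this thread, is to store the track as an ordered list of points and use the cumulative path length along it as the independent variable. The track is then single-valued in that parameter even when it wraps around the sky. The function names are hypothetical, the sample points are made up, and the interpolation (a renormalized linear blend of unit vectors) is an approximation that assumes no two consecutive points coincide.

```python
import bisect
import math


def unit_vector(ra_deg, dec_deg):
    """Cartesian unit vector for a sky position given in degrees."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra),
            math.cos(dec) * math.sin(ra),
            math.sin(dec))


def angular_separation_deg(p1, p2):
    """Great-circle separation in degrees between two (ra, dec) points (haversine)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (*p1, *p2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def arc_length_parameter(points):
    """Cumulative path length (deg) at each point of an ordered track."""
    s = [0.0]
    for p1, p2 in zip(points, points[1:]):
        s.append(s[-1] + angular_separation_deg(p1, p2))
    return s


def position_at(points, s_nodes, s):
    """Approximate (ra, dec) in degrees at path length s along the track."""
    i = bisect.bisect_right(s_nodes, s) - 1
    i = max(0, min(i, len(points) - 2))  # clamp to a valid segment
    t = (s - s_nodes[i]) / (s_nodes[i + 1] - s_nodes[i])
    a, b = unit_vector(*points[i]), unit_vector(*points[i + 1])
    v = [(1 - t) * ai + t * bi for ai, bi in zip(a, b)]
    norm = math.sqrt(sum(c * c for c in v))
    ra = math.degrees(math.atan2(v[1], v[0])) % 360.0
    dec = math.degrees(math.asin(max(-1.0, min(1.0, v[2] / norm))))
    return ra, dec


# Made-up track that goes once around the equator and then a quarter turn more.
track = [(0, 0), (90, 0), (180, 0), (270, 0), (0, 0), (90, 0)]
s_nodes = arc_length_parameter(track)
# s = 45 and s = 405 map to the same sky position (RA near 45 deg) on
# different wraps, yet remain distinct values of the parameter.
print(position_at(track, s_nodes, 45.0), position_at(track, s_nodes, 405.0))
```

This only sidesteps the choice of independent variable; it still relies on the points being supplied in order along the stream.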

Finally, we might want to deal with clouds/overdensities separately.

So, I’m on board with the overhaul, and my wish list (open for discussion) would be:

Well, I think that’s it from me so far. I had a look at the files you created in the pull request, they look great but I’m not sure in this structure how or where to start storing the info for the tracks. Perhaps I can start by doing this in a separate notebook, so I can recycle the tracks for the streams that already have them.

Let me know what you think.

Thanks, this is looking very promising!

nstarman commented 3 years ago

I've had the opportunity to use galstreams recently and I like the look of this PR. Are there any plans to implement it?

> we’ll have to be able to define which coordinate is the independent variable, if its DEC(RA) or RA(DEC) or alternatively with l(b), but I think this can be “easily” managed internally with the input being a SkyCoord object. The only caveat I find for this is that, for now, I don’t see how we could handle something like Sgr with multiple wraps, because the track will be naturally multi-valued.

I actually have a solution for this that produces stream parameterizations independent of sky coordinates and can handle multiple wraps. I have a poster at STScI on the subject. While this will be described in a separate paper (in prep), I would be more than happy to work with you to implement a version within galstreams.

cmateu commented 3 years ago

Hi Nathaniel, sorry for the long delay in answering. I'm planning to start working on this soon, so I'm very interested in finding out about your stream parameterization. Can you share more about this?