sdss / marvin

Data access and visualization for MaNGA. http://sdss-marvin.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License
55 stars 32 forks source link

cube tools should grab the rest of the NSA info #108

Closed havok2063 closed 7 years ago

havok2063 commented 7 years ago

Marvin Cubes should grab the remainder of the NSA parameters, or at least some pre-defined set. Files will have to use the drpall file, while Remote and DB modes can use the cube.target.NSA_objects db object, similar to how we grab the redshift currently.

albireox commented 7 years ago

Agreed. I already have a half-finished branch implementing this. My current implementation tries to run a remote request to the API server even if the file has been opened from files. If that fails, it will use the drpall file, if available. I'm implementing this as a utils.general function, so we could include the NSA information not only for cubes, but for Maps and ModelCubes as well.

havok2063 commented 7 years ago

When you say "tries to run a remote request to the API server even if the file has been opened from files", does this mean it uses the DB on the server side to retrieve the info? I think we should keep it separate. I think if the user opens a file, Cube, MAPS, etc, and wants to access the NSA info, it should do so via the drpall file. If the user doesn't have it, then the remote API call should essentially just grab the drpall file using sdss_access and download it.

We want to try to make the local file mode as usable as possible even if they are in an offline state.

albireox commented 7 years ago

Yes, that's what I mean.

I see your point but I'm not completely sure. The drpall file is only a subset of the data in the NSA (arguably the most important, but not necessarily). That means that a script that works on remote may not work with file or vice versa. I don't love the idea of giving different answers depending on the mode used.

One possibility is to add this as an kwarg in MarvinCube.init.

havok2063 commented 7 years ago

Ahh yeah, that's right, it's only a subset of the info. I agree with you that we should be consistent on the information returned across different modes, but this does, to me, essentially break the workflow mode when working with local files. I guess we're breaking a seamless workflow no matter which direction we go in.

What about we have the default in all modes be to only grab the NSA columns used in the drpall file, since that's what people have been using before now anyways, and we add an option when in remote mode to grab the remaining NSA info if desired?

albireox commented 7 years ago

Ok, that sounds good to me. I'll make it use the drpall by default, and the full NSA if a keyword is set.

havok2063 commented 7 years ago

I just realized that the NSA columns in the db, pulled from the NSA catalog, are different than in the drpall file. Are we creating a mapping between the two somewhere? How are you dealing with that? I'm also now realizing that I already want more default NSA parameters than what is in the drpall file. Like colors.

This notice arose with me trying to compare a color pulled from the NSA db object and calculated from the drpall file, and finding discrepancies.

albireox commented 7 years ago

Do you have an example of parameters that are the same in drpall and NSA but with different column names? I thought the drpall NSA columns had the same name as the NSA ones but with nsa_ in front.

On a related note, I found out (actually I knew it, but I had tried very hard to forget it) that the NSA column names in the drpall for MPL-4 are different than for MPL-5. I suggest not trying to do any kind of mapping.

You can check what I have done for this in PR #116.

havok2063 commented 7 years ago

I thought they did. I thought we had changed some in the flow of NSA to plateTargets to drpAll? Maybe I'm remembering from the olden times, when they were different. In particular, I'm trying to find the equivalent color information in the drpall file, so I've only focused on one parameter.

In the db, you're generating colors using the nsa.petroth50_el parameter. What is the equivalent column in the drpall file for this parameter? According to the targeting catalog documentation, https://trac.sdss.org/wiki/MANGA/Survey/CatInfo, this is the half-light radius along the major axis from the elliptical Petrosian fit. According to the the plateTargets and drpAll metadata for MPL-5, https://trac.sdss.org/wiki/MANGA/TRM/TRM_MPL-5/metadata#plateTargets, this parameter is called nsa_elpetro_th50_r. This parameter is also not an array according to the documentation.

For example, printing nsa.petroth50_el for 8485-1901 gives FNugriz of

[2.77706241607666,
 2.47799253463745,
 1.43640887737274,
 1.37491261959076,
 1.3306667804718,
 1.28697717189789,
 1.20772969722748] 

Computing g-r gives 0.0442 while printing nsa.petroth50_el_g_r gives -0.0442. This array does not exist in the drpall file. If instead this parameter is supposed to be based on db nsa.petroflux_el = drpall file nsa_elpetro_flux, then I get a rough estimate for g-r as

a=22.5-2.5*np.log10(drp['nsa_elpetro_flux'][1791])
g_r = a[3] - a[4]
print(g_r)
0.64608738584667691

Am I doing something wrong or going crazy?

Update: Ok I found the equivalent of this array in the drpall file.

print(drp['nsa_elpetro_th50_r'][1791])
1.33067

We only keep the r-band half-light radii in the drpall file, rather than the full array. But is this the right parameter to be calculating colors with?

albireox commented 7 years ago

Ok, this is obviously wrong and I'm not completely sure what I was thinking when I wrote that. Evidently you cannot calculate colours with half light radii. At some point I must have gotten confused and thought that petroth50_el were magnitudes. I'll fix that in SampleModelClasses. Science vetting, I keep saying.

Regarding the difference between NSA and drpall, I think that's something we have to live with. Yes, we only keep the half-light radius in the r-band (to make things funnier we have nsa_elpetro_th50_r for elliptical, but nsa_petro_th50 for circular, even if both refer to the r-band). But I think your method for calculating the colour from the Petrosian elliptical flux is correct.