Closed havok2063 closed 7 years ago
Agreed. I already have a half-finished branch implementing this. My current implementation tries to run a remote request to the API server even if the file has been opened from files. If that fails, it will use the drpall file, if available. I'm implementing this as a utils.general
function, so we could include the NSA information not only for cubes, but for Maps and ModelCubes as well.
When you say "tries to run a remote request to the API server even if the file has been opened from files", does this mean it uses the DB on the server side to retrieve the info? I think we should keep it separate. I think if the user opens a file, Cube, MAPS, etc, and wants to access the NSA info, it should do so via the drpall file. If the user doesn't have it, then the remote API call should essentially just grab the drpall file using sdss_access and download it.
We want to try to make the local file mode as usable as possible even if they are in an offline state.
Yes, that's what I mean.
I see your point but I'm not completely sure. The drpall file is only a subset of the data in the NSA (arguably the most important, but not necessarily). That means that a script that works on remote may not work with file or vice versa. I don't love the idea of giving different answers depending on the mode used.
One possibility is to add this as an kwarg in MarvinCube.init.
Ahh yeah, that's right, it's only a subset of the info. I agree with you that we should be consistent on the information returned across different modes, but this does, to me, essentially break the workflow mode when working with local files. I guess we're breaking a seamless workflow no matter which direction we go in.
What about we have the default in all modes be to only grab the NSA columns used in the drpall file, since that's what people have been using before now anyways, and we add an option when in remote mode to grab the remaining NSA info if desired?
Ok, that sounds good to me. I'll make it use the drpall by default, and the full NSA if a keyword is set.
I just realized that the NSA columns in the db, pulled from the NSA catalog, are different than in the drpall file. Are we creating a mapping between the two somewhere? How are you dealing with that? I'm also now realizing that I already want more default NSA parameters than what is in the drpall file. Like colors.
This notice arose with me trying to compare a color pulled from the NSA db object and calculated from the drpall file, and finding discrepancies.
Do you have an example of parameters that are the same in drpall and NSA but with different column names? I thought the drpall NSA columns had the same name as the NSA ones but with nsa_
in front.
On a related note, I found out (actually I knew it, but I had tried very hard to forget it) that the NSA column names in the drpall for MPL-4 are different than for MPL-5. I suggest not trying to do any kind of mapping.
You can check what I have done for this in PR #116.
I thought they did. I thought we had changed some in the flow of NSA to plateTargets to drpAll? Maybe I'm remembering from the olden times, when they were different. In particular, I'm trying to find the equivalent color information in the drpall file, so I've only focused on one parameter.
In the db, you're generating colors using the nsa.petroth50_el
parameter. What is the equivalent column in the drpall file for this parameter? According to the targeting catalog documentation, https://trac.sdss.org/wiki/MANGA/Survey/CatInfo, this is the half-light radius along the major axis from the elliptical Petrosian fit. According to the the plateTargets and drpAll metadata for MPL-5, https://trac.sdss.org/wiki/MANGA/TRM/TRM_MPL-5/metadata#plateTargets, this parameter is called nsa_elpetro_th50_r
. This parameter is also not an array according to the documentation.
For example, printing nsa.petroth50_el
for 8485-1901 gives FNugriz of
[2.77706241607666,
2.47799253463745,
1.43640887737274,
1.37491261959076,
1.3306667804718,
1.28697717189789,
1.20772969722748]
Computing g-r gives 0.0442 while printing nsa.petroth50_el_g_r
gives -0.0442
. This array does not exist in the drpall file. If instead this parameter is supposed to be based on db nsa.petroflux_el
= drpall file nsa_elpetro_flux
, then I get a rough estimate for g-r as
a=22.5-2.5*np.log10(drp['nsa_elpetro_flux'][1791])
g_r = a[3] - a[4]
print(g_r)
0.64608738584667691
Am I doing something wrong or going crazy?
Update: Ok I found the equivalent of this array in the drpall file.
print(drp['nsa_elpetro_th50_r'][1791])
1.33067
We only keep the r-band half-light radii in the drpall file, rather than the full array. But is this the right parameter to be calculating colors with?
Ok, this is obviously wrong and I'm not completely sure what I was thinking when I wrote that. Evidently you cannot calculate colours with half light radii. At some point I must have gotten confused and thought that petroth50_el were magnitudes. I'll fix that in SampleModelClasses. Science vetting, I keep saying.
Regarding the difference between NSA and drpall, I think that's something we have to live with. Yes, we only keep the half-light radius in the r-band (to make things funnier we have nsa_elpetro_th50_r for elliptical, but nsa_petro_th50 for circular, even if both refer to the r-band). But I think your method for calculating the colour from the Petrosian elliptical flux is correct.
Marvin Cubes should grab the remainder of the NSA parameters, or at least some pre-defined set. Files will have to use the drpall file, while Remote and DB modes can use the cube.target.NSA_objects db object, similar to how we grab the redshift currently.