Closed dlebauer closed 4 years ago
What will be most used? lat, lon or x, y in meters? @remotesensinglab?
I think it's fine to have local coordinates in x,y, but you definitely need to add variables for the lat/lon of the reference point for that x,y grid. That said, I'm not sure how things like ncview and Panoply will load the image in that case. It is probably worth making example files for the different options and trying to load them; that might be the best way to decide whether to go with x,y vs lat,lon vs both.
Also, using the southeast as the reference and having positive x numbers go west is unintuitive and counter to any other standard I've seen (which use the southwest as the reference, with positive x going east).
In addition, the designation of the reference point as being the corner of the 'field' is ambiguous. Are you referencing a specific real-world field or the field of view (i.e. the corner of the image or the corner of the farm)? Referencing relative to the image corner makes the most sense to me, because not all hyperspectral data is for fields.
Finally, it's ambiguous whether the x,y coordinates are the center of a pixel or the corner of the pixel.
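To make the two ambiguities concrete, here is a minimal sketch of how the coordinate of a pixel differs under the center and corner conventions, and how a west-positive x could be flipped to the more common east-positive form. The function names and the `field_width` parameter are hypothetical and do not come from the existing pipeline.

```python
def pixel_coordinate(index, pixel_size, origin=0.0, reference="center"):
    """Return the coordinate of pixel `index` along one axis.

    reference="corner": coordinate of the pixel's lower edge
    reference="center": coordinate of the pixel's midpoint
    """
    if reference == "corner":
        offset = 0.0
    elif reference == "center":
        offset = 0.5
    else:
        raise ValueError("reference must be 'center' or 'corner'")
    return origin + (index + offset) * pixel_size


def west_positive_to_east_positive(x_west, field_width):
    """Convert x measured westward from the southeast corner into
    x measured eastward from the southwest corner of the same extent.

    `field_width` is the east-west extent in the same units as x_west
    (a hypothetical parameter for this sketch).
    """
    return field_width - x_west
```

Under these assumptions the two conventions differ by half a pixel, so whichever one is chosen should be stated in the variable attributes.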
Hi Mike, Thanks for your feedback.
It does make sense to have latitude, longitude as dimensions for consistency, and to retain x,y as vectors. Henry, does that make sense? For this 200x20m field and in the context of raster data products we can assume the grid mapping is square (but see also https://terraref.gitbooks.io/terraref-documentation/content/user/geospatial-information.html).
This is a special case where x,y are useful. The x heading west is indeed non-intuitive unless you are an engineer designing a field scanner. The pixels are 1mm x 1mm and the coordinates in this particular example are in x,y and are defined by the equipment manufacturer.
The current full metadata (which we plan to update in this issue) is here: https://gist.github.com/dlebauer/4ca36eeae00586bcde36f97579d6fcdf#file-hyperspectral_metadata-c There is a lot of extra metadata below, which is a good idea, but I want to focus on the key dimensions and variables here.
@dlebauer If you're trying to store info for one specific instrument that's fine, but the email you sent to the PEcAn team suggested you were trying to generate a standard for storing hyperspectral data, which to me implies a more general standard. For the general case, it would make sense to flip the data into an order that makes sense to end users and can be loaded using existing tools.
@mdietze
I think we agree, though that wasn't clear in the examples (I'm in transit, apologies!). My statement was:
"It does make sense to have latitude, longitude as dimensions for consistency, and to retain x,y as vectors."
So: latitude and longitude will be dimensions, as with the PEcAn met and model output. I'll update this above.
In addition to the required PEcAn structure, other variables can be added. In this case, we can add x and y as vectors with the same length as the lat and lon dimensions.
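As a rough illustration of keeping x and y alongside latitude and longitude, the sketch below derives latitude and longitude vectors of the same length as the y and x vectors from a single reference point. It assumes x is already east-positive and y north-positive in meters (the scanner's native axes may differ), uses a spherical small-offset approximation, and all function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation


def local_to_latlon(x_east_m, y_north_m, ref_lat_deg, ref_lon_deg):
    """Approximate lat/lon of a point offset (x, y) meters from a reference.

    Valid only for small offsets (a field-sized grid), since it treats the
    Earth as a sphere and ignores curvature across the grid.
    """
    dlat = math.degrees(y_north_m / EARTH_RADIUS_M)
    dlon = math.degrees(
        x_east_m / (EARTH_RADIUS_M * math.cos(math.radians(ref_lat_deg)))
    )
    return ref_lat_deg + dlat, ref_lon_deg + dlon


def coordinate_vectors(x_m, y_m, ref_lat_deg, ref_lon_deg):
    """Return (latitude, longitude) vectors matching the lengths of y_m and x_m."""
    latitude = [local_to_latlon(0.0, y, ref_lat_deg, ref_lon_deg)[0] for y in y_m]
    longitude = [local_to_latlon(x, 0.0, ref_lat_deg, ref_lon_deg)[1] for x in x_m]
    return latitude, longitude


# Placeholder reference point and offsets, for illustration only.
lat, lon = coordinate_vectors([0.0, 1.0, 2.0], [0.0, 10.0], 33.0, -112.0)
```

A projected coordinate system with a proper library would be more accurate; this sketch only shows that the x,y vectors and the lat,lon dimensions can carry the same information.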
@hmb1 I am closing this because I think it's implemented in both the level 1 and indices products.
@FlyingWithJerome, @hmb1 @czender it does not look like the level 1 hyperspectral metadata has been updated to this format. e.g. ncdump -h /data/terraref/sites/ua-mac/Level_1/hyperspectral/2017-05-01/2017-05-01_16-47-24-400/e9673701-4a6b-4b6d-b334-f5401dc98213.nc produces level_1_ncdump_json.txt
Is this still in progress?
@dlebauer the current level 1 metadata contains x(x), latitude(x), y(y), longitude(y), etc. All the variables have long_name and units, and most have standard_name. What would you like to see changed? Do you want fewer/no corner coordinates listed? A flat hierarchy? Full-length names instead of short names, e.g., reflectance_image instead of rfl_img? We will add the GeoJSON bounding box as suggested above but are not sure what other formatting changes to make.
@czender in general, I'd like to clean up the metadata so that the most important data (reflectance + dimensions) are easy to find. This could be done by both organizing the information and removing redundant / extraneous metadata.
So, if I do ncdump -h, the reflectance + dimensions should be easy to find. Currently the first variables are xps_img and Google_Map_View, while reflectance (surface_albedo) is listed way at the end.
I didn't realize how you were using the groups to store the Lemnatec metadata. But now that I see how groups can be used, would it make sense to have additional groups for the geospatial information (corners and reference points) and the calibration information (exposures, calibration data, etc)? I'll leave it to you to decide what is worth keeping around. I think if it is organized, even if it is just ordered correctly, it will not be as distracting.
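As far as I know, ncdump -h lists variables in the order they were defined, so the ordering concern could be handled when the file is written. A minimal sketch, assuming the priority list below (my guess at the "most important" variables, not an agreed list) and a hypothetical helper name:

```python
# Variables that should appear first in the header; an assumed priority list.
PRIMARY = ("rfl_img", "x", "y", "wavelength")


def order_variables(names, primary=PRIMARY):
    """Return variable names with the primary ones first.

    Primary variables follow the order given in `primary`; all others keep
    their original relative order, because Python's sort is stable.
    """
    rank = {name: i for i, name in enumerate(primary)}
    return sorted(names, key=lambda name: rank.get(name, len(primary)))


current = ["xps_img", "Google_Map_View", "y", "rfl_img", "wavelength", "x"]
print(order_variables(current))
```

Defining the variables in this order, and moving the rest into groups, would put reflectance and its dimensions at the top of the header.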
Although there are issues with the Lemnatec metadata, let's wait until @max-zilla and @craig-willis are done with the metadata cleaner / standardizer tool before touching this.
@czender will move forward with removing all unimportant metadata info
@hmb1 and @czender what is the status on this?
wait for charlie on this one
@FlyingWithJerome what is the status of this? We agreed that you would move all non-essential root group variables to a new and separate group in output.
@hmb1 to follow up
Can xps_img be recovered from rfl_img and some other field(s) (like rfl_rfr_fct)? If so, can this variable also be dropped from the Level 1 data product to cut the file size in half?
@dlebauer that makes sense. will do these changes first thing tomorrow
xps_img is identical to the raw counts recorded in the Level 0 .bil file, and is recoverable from the raw imagery (which, unlike the Level 1 data, is not in netCDF format). We have asked whether to retain it on a few occasions previously, and the answer has been yes. Would you like to eliminate it from Level 1 always, sometimes (e.g., a switch) or never?
I believe that I have at times confused the raw exposure counts with calibrated radiances. If I recall, the idea behind keeping the xps_img was to facilitate re-calibration by our team or outside users. I think as a group we have shifted from a packrat to a more carefully curated approach to developing data and metadata products.
I am not sure what would be feasible for the Nov. release - but if we were to follow the NASA / MODIS data levels, I would expect to sort data products in the following way.
The most important derived data products are the reflectances and indices. I am assuming that it won't be too difficult to separate out xps_img into a separate data product; that might be the best approach. This may take a few edits to the terrautils package: https://github.com/terraref/terrautils/blob/master/terrautils/sensors.py#L141
@dlebauer @czender I have removed the excessive intermediate calibration data for default operation.
1) The calibrated radiance file (default) consists of
A) rfl_img
B) the coordinate vars "x", "y", and "wavelength"
C) the converted JSON metadata in the root and associated groups
2) The indices file (default) now contains
A) each index in its regular and '_pxl'-postfixed form
B) the coordinate vars "x", "y", and "wavelength"
C) if the specific reflectances are required, they can be obtained from 1)
I will add a command line switch so that, if required, the intermediate variables will be available in 1).
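For illustration only, such a switch might look like the sketch below. The flag name `--keep-intermediate`, the function names, and the contents of the intermediate list beyond xps_img are hypothetical; the real workflow may expose this differently.

```python
import argparse

# Default contents of the calibrated file, as listed above.
DEFAULT_VARIABLES = ("rfl_img", "x", "y", "wavelength")
# Intermediate variables kept only on request; xps_img is from the thread,
# any others would be added here (hypothetical placeholder list).
INTERMEDIATE_VARIABLES = ("xps_img",)


def build_parser():
    """Build a parser with a hypothetical flag for retaining intermediates."""
    parser = argparse.ArgumentParser(
        description="Select which variables to write to the Level 1 file."
    )
    parser.add_argument(
        "--keep-intermediate",
        action="store_true",
        help="also write intermediate calibration variables such as xps_img",
    )
    return parser


def variables_to_write(args):
    """Return the list of variable names to write for the parsed arguments."""
    names = list(DEFAULT_VARIABLES)
    if args.keep_intermediate:
        names.extend(INTERMEDIATE_VARIABLES)
    return names
```

With no flag the output stays small by default, and users who want to re-calibrate can opt in to the larger file.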
I have pushed the changes to hmb1-patch13, and will do some more debugging and push to master tomorrow, my time.
TODO:
The goal is to have easy-to-interpret metadata that is as consistent as possible across file types (e.g. can be stored and imported / exported among geoTIFF, las, json, and other data types (exif?)).
I am open to feedback - I don't necessarily understand all of the rationale behind the previous decisions.
Some questions
Completion Criteria