terraref / reference-data

Coordination of Data Products and Standards for TERRA reference data
https://terraref.org
BSD 3-Clause "New" or "Revised" License

Review "cleaned" LemnaTec Field Scanalyzer metadata #176

Closed craig-willis closed 6 years ago

craig-willis commented 6 years ago

As part of the terrautils implementation we've implemented a process to "clean" or standardize the metadata from the LemnaTec Field Scanalyzer. The "raw" metadata is now stored as a JSON file on ROGER and Clowder, but the Clowder metadata endpoint contains the cleaned metadata, which is now also used downstream by extractors.

Examples are available in the Data Release Trial space for the flirIR and stereoTop sensors.

FLIR:

StereoTop:

As noted in the PR, the primary goal was to clean up many of the inconsistencies in the raw metadata (e.g., Time vs. time vs. timestamp vs. Timestamp).
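To illustrate the kind of key standardization described above, here is a minimal sketch. It is not the terrautils implementation: the `ALIASES` table and the `clean_keys` function are hypothetical names, and only the Time/time/timestamp/Timestamp variants come from this thread.

```python
# Hypothetical alias table: lower-cased raw key -> canonical key.
# Only the time/timestamp variants are taken from this issue.
ALIASES = {
    "time": "timestamp",
    "timestamp": "timestamp",
}


def clean_keys(raw):
    """Return a copy of `raw` with known key variants collapsed to one name.

    Keys that are not in ALIASES are passed through unchanged.
    """
    cleaned = {}
    for key, value in raw.items():
        canonical = ALIASES.get(key.strip().lower(), key)
        # Refuse to silently pick a winner if two variants disagree.
        if canonical in cleaned and cleaned[canonical] != value:
            raise ValueError(f"conflicting values for {canonical!r}")
        cleaned[canonical] = value
    return cleaned
```

Raising on conflicting variants, rather than keeping the last one seen, is a design choice for this sketch; the actual cleaning step may resolve conflicts differently.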

A few things to note:

Completion criteria:

dlebauer commented 6 years ago

Need to decide how to populate dataset owner, creator, terms of use, description

[screenshot: 2017-09-07 at 11:17:14 AM]

dlebauer commented 6 years ago

Should mimetype be mac-binary?

[screenshot: 2017-09-07 at 11:22:03 AM]

dlebauer commented 6 years ago

For the bounding box, we should calculate the rectangle in x,y coordinates and then transform each corner to lat/lon. The deviation from rectangularity can be assumed to be 0 in x,y and very small in lat/lon.

For now, we can assume that all cameras are mounted so that their FOV is aligned with the x,y rectangle.
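A rough sketch of that procedure, under several assumptions that are not stated in this thread: the sensor position is taken to be the centre of the field of view, +x points north and +y points east, distances are in metres from a known origin, and a flat-earth approximation is accurate enough over the field. All function names, and the origin passed in, are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def fov_corners_xy(center_x, center_y, fov_x, fov_y):
    """Corners of an axis-aligned FOV rectangle in x,y (metres)."""
    hx, hy = fov_x / 2.0, fov_y / 2.0
    return [
        (center_x - hx, center_y - hy),
        (center_x - hx, center_y + hy),
        (center_x + hx, center_y + hy),
        (center_x + hx, center_y - hy),
    ]


def xy_to_latlon(x, y, origin_lat, origin_lon):
    """Flat-earth conversion; ASSUMES +x is north and +y is east."""
    lat = origin_lat + math.degrees(x / EARTH_RADIUS_M)
    lon = origin_lon + math.degrees(
        y / (EARTH_RADIUS_M * math.cos(math.radians(origin_lat)))
    )
    return lat, lon


def bounding_box_latlon(center_x, center_y, fov_x, fov_y, origin_lat, origin_lon):
    """Transform each x,y corner independently to (lat, lon)."""
    return [
        xy_to_latlon(x, y, origin_lat, origin_lon)
        for x, y in fov_corners_xy(center_x, center_y, fov_x, fov_y)
    ]
```

Because each corner is transformed on its own, the result is a four-point polygon rather than a min/max box, so any small departure from rectangularity in lat/lon is preserved instead of hidden. The real gantry axis orientation and origin would need to replace the assumptions above.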

dlebauer commented 6 years ago

Review of FLIR metadata https://terraref.ncsa.illinois.edu/clowder/datasets/59b062d94f0ca12ea0c32d04

max-zilla commented 6 years ago

@dlebauer

dlebauer commented 6 years ago

Is there a dictionary that defines metadata fields? For example, it is not clear what 'position_m' means - is it the position of a corner? The position of the center?

dlebauer commented 6 years ago

OK, now looking at the point cloud sample data (https://terraref.ncsa.illinois.edu/clowder/datasets/59bac9c44f0c0b27bc3d3132): there is a position_m under 'gantry fixed metadata' and a speed. Then under sensor_variable_metadata there are a point_cloud_origin_m, a scan_speed_microMeter/s and a scan_distance_mm.

craig-willis commented 6 years ago

@dlebauer Was there ever a dictionary from LemnaTec for the source fields? I expect the answer is no. Remember, most of these fields are simply an effort to standardize the information already provided by LemnaTec. If they didn't provide a dictionary, we haven't created one.

I can easily update the units on these fields (again, the units are the same as the source data provided by LemnaTec). We'll have to assess whether this impacts any downstream extractors.
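One way such a unit update could look, as a sketch only: it assumes the unit is always the suffix after the last underscore in the field name (as in scan_distance_mm and scan_speed_microMeter/s) and that the target is metres. The `FACTORS` table and the `to_si` function are hypothetical, not part of terrautils.

```python
# Hypothetical table: unit suffix -> (factor to SI, SI unit suffix).
FACTORS = {
    "m": (1.0, "m"),
    "mm": (1e-3, "m"),
    "microMeter/s": (1e-6, "m/s"),
}


def to_si(field, value):
    """Return (renamed_field, converted_value) for a unit-suffixed field.

    Raises KeyError if the suffix is not a known unit.
    """
    # Split on the LAST underscore: "scan_distance_mm" -> "scan_distance", "mm"
    name, _, unit = field.rpartition("_")
    factor, si_unit = FACTORS[unit]
    return f"{name}_{si_unit}", value * factor
```

Note that this renames the key as well as rescaling the value, which is exactly the kind of change that could affect downstream extractors reading the old field names.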

The point_cloud_origin_m is something we've added to the metadata (similar to the spatial information) based on https://github.com/terraref/reference-data/issues/44. There seems to be some ongoing confusion over what this is. We should probably discuss in the call this week.

dlebauer commented 6 years ago