terraref / computing-pipeline

Pipeline to Extract Plant Phenotypes from Reference Data
BSD 3-Clause "New" or "Revised" License

Implement terrautil interfaces for BETY, Clowder, Extractors, and Logging #309

Closed craig-willis closed 7 years ago

craig-willis commented 7 years ago

As outlined in the Terra Utils library discussion, add support for the following to the terrautils library:

Completion criteria

max-zilla commented 7 years ago

started on this: https://github.com/terraref/terrautils/blob/implement-extractor-methods/terrautils/extractors.py

max-zilla commented 7 years ago

@ZongyangLi today on the call I will discuss effort to generalize the stereo-rgb geometry code - I've been comparing it to the code we have for FLIR -> GeoTIFF and have a few questions.

Here's my branch: https://github.com/terraref/terrautils/blob/implement-extractor-methods/terrautils/extractors.py

...but I'll plan to share my screen.

dlebauer commented 7 years ago
ZongyangLi commented 7 years ago

I am going to review the FLIR extractor code and will push updates to https://github.com/terraref/extractors-multispectral/blob/master/flir2tif/Get_FLIR.py

ZongyangLi commented 7 years ago

@dlebauer @max-zilla The existing FLIR extractor creates a 'color coded png' based on the pixel range in the target bin file, and a geotiff that stores true temperature values.

There is something we need to discuss here:

  1. The png file is just for visualization; it contains no real temperature values. The existing normalization method uses the range of a single source file only, and is not a full-screen normalization. So the 'create_png' function is an image processing function rather than an 'image creation' function, and we may not want to treat it as the same kind of function as 'process_image' in bin_to_geotiff.py. I will change the function name, and I suggest we keep using matplotlib for it.

  2. In the geotiff file, all values are real temperatures obtained by calibration. You will not see any color image in these geotiffs. I can make some visualizations on my side to confirm that the full-field FLIR data is correct. Is that exactly what we are looking for?

Any ideas?

dlebauer commented 7 years ago

@ZongyangLi I believe that the geotiff with temperature values is the key data product that can be used for analysis. Users will be able to bring it into their favorite image processing or GIS software for analysis as well as to create whatever color scale they prefer for interpretation.

The png is useful as a thumbnail. Preferably it would include a color scale, and the scale would be consistent within each day or scan. I think it would make sense to allow the scale to vary across a season.

The air temperature at time of collection would be useful to capture in the metadata for the FLIR camera; it should be possible to query this from the geostreams database.

ZongyangLi commented 7 years ago

Great!

So the only remaining thing is to determine the scale for each day. I will make it an input to the function. Then it will be easy to add the air temperature when it is available.

dlebauer commented 7 years ago

@ZongyangLi the air temperature data is available and you should be able to query it with the pyClowder package, though I don't know how ... (@max-zilla?)

dlebauer commented 7 years ago

... and for the scale, would it make sense to stitch the entire field first and then get the range for the scale? Perhaps rounded to the nearest 10s (may take some experimentation)

max-zilla commented 7 years ago

@dlebauer @ZongyangLi if I understand correctly, perhaps we blend these ideas:

  1. convert all flirIrCamera raw bin files to geoTiff, with temp as value (I propose no PNG thumbnails for each dataset, BUT if we generate them it's just a dataset-scaled reference and it won't be used for the full field stitching)
  2. stitch a day of geoTiffs into full field (px values = temp)
     2a. query the weather geostreams for the given day to get air temp data - since we are doing the full field stitch, we should have sufficient data from that day for weather as well
  3. get min/max values of full field and use that to determine scaling for PNG
  4. convert full field to PNG w/ scale as thumbnail = nice consistent image

how does that sound?
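A minimal sketch of step 3, assuming the full-field minimum and maximum are already known. The function name `scale_bounds` and the `step` parameter are hypothetical, not part of terrautils. It rounds the range outward to multiples of 10 (one reading of "rounded to the nearest 10s") so that every pixel stays inside the color scale:

```python
import math


def scale_bounds(field_min, field_max, step=10.0):
    """Round the field-wide value range outward to multiples of `step`.

    Hypothetical helper: rounding outward (floor the minimum, ceil the
    maximum) guarantees that no pixel falls outside the color scale.
    """
    if field_max < field_min:
        raise ValueError("field_max must be >= field_min")
    lower = math.floor(field_min / step) * step
    upper = math.ceil(field_max / step) * step
    if upper == lower:
        # Avoid a zero-width scale when all values share one multiple.
        upper = lower + step
    return lower, upper


# Made-up full-field extremes, for illustration only.
bounds = scale_bounds(293.4, 318.7)
print(bounds)  # (290.0, 320.0)
```

Days with similar conditions would then tend to share the same bounds, which keeps thumbnails comparable without fixing a single scale for the whole season.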

ZongyangLi commented 7 years ago

@max-zilla That sounds reasonable to me. I will focus on the GIS in flir geotiff creation now.

dlebauer commented 7 years ago

@max-zilla I agree. I would add that at the full field level it will be important to know the temperature (and perhaps time and other environmental variables) for each pixel. These would be one value within each image but variable over the field (though should be highly compressible). I am not sure where it would fit into the workflow, or if it is something 'nice to have' that we should stage for the 2018 season. But at least carrying a time stamp for each pixel would be nice (so the temp could be queried).

Three options:

Do nothing

How would a scientist know what air_temperature was at time of collection?

Perhaps if we are focusing on plot level images, it would be sufficient if each plot-level image had a timestamp (this could be the average time of the sample, assuming there was only one pass for the day)

add timestamp

add a timestamp layer to the initial image so that it is retained when it is stitched (?) I have no idea if this is the best way - I suspect other solutions exist. These will be uniform within one image but will vary across the full field stitch.

Really this is most important so users can query any environmental variable at time of image capture.

add timestamp and air_temperature layer

1a. query air temperature from the met data stream at time of photograph and add a layer 'air_temperature'. To keep with CF conventions, the primary layer would be named surface_temperature instead of temperature
1c. stitch these three layers together. The air_temperature and time layers should be highly compressible since they are mosaicked at the resolution of an image

other ways of storing this metadata in the full field stitched image?
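To make the third option concrete, here is a rough sketch of a per-image, three-layer array. It assumes the capture time is stored as seconds since the epoch and uses made-up values. The function name `build_tile_layers` and the layer order are illustrative assumptions, not existing pipeline code:

```python
import numpy as np


def build_tile_layers(surface_temperature, capture_time_s, air_temperature):
    """Stack a temperature image with constant time and air-temperature layers.

    Returns an array of shape (3, rows, cols):
      layer 0: surface_temperature (varies per pixel)
      layer 1: capture time, identical for every pixel of this image
      layer 2: air_temperature, identical for every pixel of this image
    """
    surface = np.asarray(surface_temperature, dtype=np.float64)
    time_layer = np.full(surface.shape, capture_time_s, dtype=np.float64)
    air_layer = np.full(surface.shape, air_temperature, dtype=np.float64)
    return np.stack([surface, time_layer, air_layer])


# Made-up 2x2 image and metadata values, for illustration only.
image = np.array([[300.1, 300.4], [299.8, 301.0]])
tile = build_tile_layers(image, capture_time_s=1483232461.0, air_temperature=295.2)
print(tile.shape)  # (3, 2, 2)
```

Because layers 1 and 2 are constant within each image, they should compress well, as noted above.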

Also

I don't want this to block deployment, and it is likely something we should solve for other data products. (In the hyperspectral workflow I understand that time is stored as a dimension. But in that case each row of pixels has a different time of capture, and the high resolution solar radiance is an important covariate for interpretation of the data. Surface temperature has slower rates of change because of thermal inertia.)

max-zilla commented 7 years ago

@dlebauer my gut feeling is that we'd want 3rd option: query the air temp and time as we convert from bin to geotiff and add to each tile. then the VRT algorithm stitch will select the appropriate temp/time from the source that gets propagated into the field stitching - this would also support down-the-road enhancements like pixel thresholding (so if portions of a tile are removed due to sun reflection, the temp is correctly reported as the visible pixel of the tile beneath).

I'd expect to see a gradient of temp as we progress down the field, perhaps going up and then down if the scan began midday and ended late afternoon.
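A simplified sketch of that propagation idea, not the actual VRT stitch. It assumes every tile has already been resampled onto the same grid, that layer 0 holds the temperature, and that removed (thresholded) pixels are marked NaN in layer 0. The name `stitch_topmost` is hypothetical:

```python
import numpy as np


def stitch_topmost(tiles):
    """Mosaic same-shaped (layers, rows, cols) tiles; later tiles sit on top.

    A pixel is taken from a tile only where its layer 0 is not NaN, and all
    layers for that pixel come from the same tile, so the time and
    air-temperature layers always match the visible temperature pixel.
    """
    mosaic = np.full(tiles[0].shape, np.nan)
    for tile in tiles:
        visible = ~np.isnan(tile[0])
        mosaic[:, visible] = tile[:, visible]
    return mosaic


# Two made-up 1x2 tiles with layers (temperature, time).
lower = np.stack([np.full((1, 2), 300.0), np.full((1, 2), 100.0)])
upper = np.stack([np.array([[np.nan, 305.0]]), np.full((1, 2), 200.0)])
mosaic = stitch_topmost([lower, upper])
# Left pixel falls back to the lower tile (temp 300, time 100);
# right pixel comes from the upper tile (temp 305, time 200).
```

Real stitching would also need georeferencing and overlap handling, which this sketch leaves out.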

dlebauer commented 7 years ago

@ZongyangLi does the FLIR camera or the pipeline use emissivity as a parameter? Is this parameter in the sensor metadata (If anything it would go in the variable metadata)?

What value are you using? Are there other parameters / assumptions? We can add this parameter either as a covariate or as part of the method definition. (would be better in the method if we are using a fixed value for everything)

ZongyangLi commented 7 years ago

@dlebauer For now I don't use any fixed value for the scale. The scale is determined by rescaling from [min, max] in the bin file to [0, 1] and then mapping the result onto a matplotlib color map.
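The rescaling step can be sketched as follows. This is an illustration, not the code in Get_FLIR.py; the function name `rescale_to_unit` and the optional `vmin`/`vmax` arguments are assumptions added to show how a shared, day-wide range could be passed in later:

```python
import numpy as np


def rescale_to_unit(raw, vmin=None, vmax=None):
    """Linearly map values from [vmin, vmax] to [0, 1].

    With vmin/vmax left as None, the range is taken from this array alone
    (per-file scaling). Passing shared bounds gives a consistent scale
    across files; values outside the bounds are clipped.
    """
    raw = np.asarray(raw, dtype=float)
    lo = raw.min() if vmin is None else float(vmin)
    hi = raw.max() if vmax is None else float(vmax)
    if hi == lo:
        # A flat array has no range to stretch; return zeros.
        return np.zeros_like(raw)
    return np.clip((raw - lo) / (hi - lo), 0.0, 1.0)


# Made-up pixel values, for illustration only.
frame = np.array([[290.0, 295.0], [300.0, 310.0]])
per_file = rescale_to_unit(frame)                        # range from this file
shared = rescale_to_unit(frame, vmin=280.0, vmax=320.0)  # externally supplied range
```

The resulting [0, 1] array can be passed directly to a matplotlib colormap object, which is callable on such arrays and returns RGBA values.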

I don't find any such emissivity parameter in the metadata.

Another way of storing air temp in geotiff is to add a Band and put the data you want into it.

ZongyangLi commented 7 years ago

@max-zilla flir extractor core code updated here: https://github.com/terraref/extractors-multispectral/blob/master/flir2tif/Get_FLIR.py

max-zilla commented 7 years ago

@nickheyek is going to implement his bety code into terrautils, then we can close this issue as a v1.0 of terrautils.

ghost commented 7 years ago

@nickheyek - please update

nheyek commented 7 years ago

I implemented the BETYdb code in terrautils; it was merged, if I'm not mistaken.

jterstriep commented 7 years ago

I still have an open pull request with enhanced support for betydb as well as clipping of full field data sets.

max-zilla commented 7 years ago

I'm cleaning up the PR a bit this morning and will merge shortly.

max-zilla commented 7 years ago

Merged with a fair amount of cleanup. Revised the STATIONS json object quite a bit to support file paths for various things:

>>> from terrautils import sensors
>>> sensors.get_file_paths('ua-mac', 'stereoTop', '2017-01-01__01-01-01-000', stitched=False)
['/home/extractor/sites/ua-mac/Level_1/stereoTop_geotiff/2017-01-01/2017-01-01__01-01-01-000/stereoTop_lv1_2017-01-01__01-01-01-000_uamac_left.jpg', '/home/extractor/sites/ua-mac/Level_1/stereoTop_geotiff/2017-01-01/2017-01-01__01-01-01-000/stereoTop_lv1_2017-01-01__01-01-01-000_uamac_right.jpg', '/home/extractor/sites/ua-mac/Level_1/stereoTop_geotiff/2017-01-01/2017-01-01__01-01-01-000/stereoTop_lv1_2017-01-01__01-01-01-000_uamac_left.tif', '/home/extractor/sites/ua-mac/Level_1/stereoTop_geotiff/2017-01-01/2017-01-01__01-01-01-000/stereoTop_lv1_2017-01-01__01-01-01-000_uamac_right.tif']

>>> sensors.get_file_paths('ua-mac', 'stereoTop', '2017-01-01__01-01-01-000', stitched=True)
['/home/extractor/sites/ua-mac/Level_1/fullfield/2017-01-01/stereoTop_fullfield.tif', '/home/extractor/sites/ua-mac/Level_1/fullfield/2017-01-01/stereoTop_fullfield_10pct.tif']
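For readers who want to see the naming pattern spelled out, the sketch below rebuilds the unstitched paths from the output shown above. It is inferred only from that output and is not the terrautils implementation. The helper name `example_level1_paths`, the `root` default, and the suffix list are assumptions:

```python
import posixpath


def example_level1_paths(site, sensor, timestamp, root="/home/extractor/sites"):
    """Rebuild the Level_1 per-dataset paths seen in the example output.

    Hypothetical helper: the layout is inferred from one example and may
    not hold for other sensors or sites.
    """
    date = timestamp.split("__")[0]   # '2017-01-01__01-01-01-000' -> '2017-01-01'
    site_tag = site.replace("-", "")  # 'ua-mac' -> 'uamac'
    folder = posixpath.join(root, site, "Level_1", sensor + "_geotiff", date, timestamp)
    stem = "{}_lv1_{}_{}".format(sensor, timestamp, site_tag)
    suffixes = ("_left.jpg", "_right.jpg", "_left.tif", "_right.tif")
    return [posixpath.join(folder, stem + suffix) for suffix in suffixes]


paths = example_level1_paths("ua-mac", "stereoTop", "2017-01-01__01-01-01-000")
```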