Write extractor(s) for NDVI and PRI data

dlebauer commented 8 years ago

Description

Extract NDVI and PRI values and % reflectance from the XML files in ndviSensor and priSensor, and insert them into the geostreams API.

TODO:

[x] Determine any dependencies
[x] Extract values from .bin files (these are actually xml, see below)
[x] ~~Check if time and location are in sync for two sensors (if so, records can be combined)~~
[x] write extractor to convert files with geospatial, time, reflectances and NDVI / PRI values to geostreams API
[x] document
[x] ~~write test.sh to work on sample data and compare to sample output~~ covered elsewhere. #76
[ ] insert plot level summaries into BETYdb

Suggestions

These data, where each record has a time, a point location, and a few data values, are similar to the data stream from the pheno tractor. If we can take both and insert them into the geostreams database, then we will have combined five sensors into one format. From that point it should be easy to write extractors that summarize plot (or any polygon) data from these sensors, and the database already has mapping and plotting features within Clowder.

This is what I am thinking - does this make sense from the Clowder and data stream standardization perspective @robkooper @max-zilla ? Does this make sense in general, and with respect to these datasets @Mamatemenrs, @remotesensinglab ?

Appendix

This is a subtask of #64

NDVI format

<LemnaTecData>
    <Entry name="NDVI" value="0.803190499146552" unit="-" />
    <Entry name="Channel1TOP633" value="233.078997755" unit="umol/m^2/s" />
    <Entry name="Channel4TOP800" value="221.7064438158" unit="umol/m^2/s" />
    <Entry name="Channel1DOWN633" value="3.6708662746" unit="umol/SR/m^2/s" />
    <Entry name="Channel4DOWN800" value="31.9918451094" unit="umol/SR/m^2/s" />
</LemnaTecData>

PRI example

<LemnaTecData>
    <Entry name="NDVI" value="0.047458169870772" unit="-" />
    <Entry name="Channel1TOP531" value="68.817235498" unit="umol/m^2/s" />
    <Entry name="Channel4TOP569" value="65.0497152665" unit="umol/m^2/s" />
    <Entry name="Channel1DOWN531" value="1.1867427872" unit="umol/SR /m^2/s" />
    <Entry name="Channel4DOWN569" value="1.2335518543" unit="umol/SR /m^2/s" />
</LemnaTecData>
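
A minimal sketch of reading one of these files in Python, assuming the layout shown above (a `LemnaTecData` root with `Entry` children that carry `name`, `value`, and `unit` attributes). The function name `parse_lemnatec_entries` is hypothetical and not part of any project code.

```python
import xml.etree.ElementTree as ET

# Sample record copied from the "NDVI format" example in this issue.
SAMPLE = """<LemnaTecData>
    <Entry name="NDVI" value="0.803190499146552" unit="-" />
    <Entry name="Channel1TOP633" value="233.078997755" unit="umol/m^2/s" />
    <Entry name="Channel4TOP800" value="221.7064438158" unit="umol/m^2/s" />
    <Entry name="Channel1DOWN633" value="3.6708662746" unit="umol/SR/m^2/s" />
    <Entry name="Channel4DOWN800" value="31.9918451094" unit="umol/SR/m^2/s" />
</LemnaTecData>"""


def parse_lemnatec_entries(xml_text):
    """Return {entry name: (float value, unit string)} for one sensor record."""
    root = ET.fromstring(xml_text)
    entries = {}
    for entry in root.iter("Entry"):
        entries[entry.get("name")] = (float(entry.get("value")), entry.get("unit"))
    return entries


record = parse_lemnatec_entries(SAMPLE)
print(record["NDVI"])
```

The same function should work on the PRI files, since they use the same element and attribute names with different channel labels.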

dlebauer commented 8 years ago

According to specs, there is one NDVI sensor pointing up, one pointing down: @terraref/lemnatec could you please confirm?

ghost commented 8 years ago

@dlebauer - is this a priority for the V0 release?

dlebauer commented 8 years ago

Yes

ZongyangLi commented 8 years ago

I looked into the .bin files in /raw_data/ndviSensor/ several weeks ago. The content of a .bin file looks like this:

<LemnaTecData>
    <Entry name="NDVI" value="0.486768492611564" unit="-" />
    <Entry name="Channel1TOP633" value="259.2523123703" unit="umol/m^2/s" />
    <Entry name="Channel4TOP800" value="230.8212727595" unit="umol/m^2/s" />
    <Entry name="Channel1DOWN633" value="8.0706465413" unit="umol/SR/m^2/s" />
    <Entry name="Channel4DOWN800" value="20.8157258431" unit="umol/SR/m^2/s" />
</LemnaTecData>

The MATLAB code GetNDVI.m reads the source file and outputs the NDVI value given in that file.

Could someone give me some ideas on:

How to create a PNG from this metadata?
Where can I find the Vis and NIR information?
Is the NDVI value in the source file the correct NDVI value to be added somewhere else?
Should I treat each metadata file as a pixel when creating a geospatial product?

dlebauer commented 8 years ago

@ZongyangLi I just realized that the .bin files are actually XML text and that this is not an imaging sensor. So it may not be appropriate to convert to an image.

The most important value is NDVI. First step might be to extract x, y, NDVI for each observation and look at the points. Perhaps this should be inserted into the Clowder geostreams API as a first step.
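
One way to picture that first step: each observation becomes a point record with a time, a location, and a few properties. The sketch below builds a GeoJSON-style dictionary. The field names (`start_time`, `end_time`, `properties`), the coordinates, and the timestamp are illustrative assumptions rather than the confirmed geostreams schema, and the x, y position is assumed to come from accompanying metadata that is not shown in this thread.

```python
import json


def make_point_record(timestamp, x, y, ndvi):
    """Bundle one observation as a GeoJSON-style point feature (hypothetical layout)."""
    return {
        "type": "Feature",
        "start_time": timestamp,
        "end_time": timestamp,
        "geometry": {"type": "Point", "coordinates": [x, y]},
        "properties": {"ndvi": ndvi},
    }


# Placeholder time and position; the NDVI value is from the sample file in this issue.
record = make_point_record("2016-01-01T00:00:00Z", 10.0, 2.5, 0.803190499146552)
print(json.dumps(record, indent=2))
```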

@anfrench, @remotesensinglab do you have any suggestions?

Should we retain the four values (2 up and 2 down for 633 and 800 nm) in addition to NDVI or convert these to %reflectances for each channel?

remotesensinglab commented 8 years ago

I do not think it is necessary! Just keep the % reflectance (two values), and NDVI.

anfrench commented 8 years ago

I don't know the instrument. Is this a ratioing algorithm: upwelling divided by downwelling? The sensor labels TOP and DOWN are an unfortunate choice, since I'd think the TOP sensor captures the downwelling radiation and the bottom sensor captures the reflected signal. It would be good to relabel, but I suspect that is not possible. In that case some note ought to be added to the metadata to clarify which sensor is which. Otherwise, two reflectances + NDVI is OK as long as the upwelling and downwelling calibrations are checked periodically, which they need to be.
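
To make the ratio concrete, here is a small sketch. It assumes each band's reflectance is the downward-looking (DOWN) reading divided by the upward-looking (TOP) reading, and that NDVI is the normalized difference of the 800 nm and 633 nm ratios. Both assumptions are inferred from the sample values in this issue, not from sensor documentation. The two readings have different units (DOWN is per steradian), so the ratio is only a relative reflectance; any constant scale factor common to both bands cancels in NDVI.

```python
def band_ratio(down, top):
    """Reflected (DOWN) signal divided by incoming (TOP) signal for one band."""
    return down / top


def normalized_difference(r_a, r_b):
    """Generic normalized difference index: (a - b) / (a + b)."""
    return (r_a - r_b) / (r_a + r_b)


# Channel values copied from the "NDVI format" sample in this issue.
top_633, top_800 = 233.078997755, 221.7064438158
down_633, down_800 = 3.6708662746, 31.9918451094

r_633 = band_ratio(down_633, top_633)
r_800 = band_ratio(down_800, top_800)
ndvi = normalized_difference(r_800, r_633)
print(round(ndvi, 4))
```

With these inputs the result agrees with the NDVI stored in the same file (about 0.8032) to roughly four decimal places, which suggests, but does not prove, that the sensor computes its value this way.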

ghost commented 7 years ago

@Mamatemenrs and @remotesensinglab - please update this issue

max-zilla commented 7 years ago

If you have a Python script, please upload it to GitHub so we can review it and prep the extractor.

Here is the preferred repository for the script: https://github.com/terraref/extractors-multispectral. Please let me know if you don't have access. The easiest way would be to create your own branch, add the file, and then create a pull request.

Paheding commented 7 years ago

For extracting NDVI or PRI values from .bin files, a Python script has been uploaded in a branch named "Create binFile_NDVI_Extractor" with a pull request.

max-zilla commented 7 years ago

@Paheding thanks for uploading, I will take a look and follow up.

Paheding commented 7 years ago

@max-zilla The Python script for extracting NDVI from .bin files can be found here. The code reads all .bin files from the current directory, extracts the NDVI values, and saves the extracted values to a .csv file automatically.
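
The behavior described here can be sketched as follows. This is not the uploaded script; it is a minimal illustration that assumes each .bin file is XML containing an `Entry` element named `NDVI`. The function name `bin_dir_to_csv` and the output file name are hypothetical.

```python
import csv
import glob
import os
import xml.etree.ElementTree as ET


def bin_dir_to_csv(directory, out_name="ndvi_values.csv"):
    """Collect the NDVI value of every .bin file in `directory` into one CSV."""
    rows = []
    for path in sorted(glob.glob(os.path.join(directory, "*.bin"))):
        root = ET.parse(path).getroot()
        for entry in root.iter("Entry"):
            if entry.get("name") == "NDVI":
                rows.append((os.path.basename(path), float(entry.get("value"))))

    out_path = os.path.join(directory, out_name)
    # newline="" lets the csv module control line endings itself.
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["file", "ndvi"])
        writer.writerows(rows)
    return out_path
```

Calling `bin_dir_to_csv(".")` would process the current directory and return the path of the CSV it wrote.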

max-zilla commented 7 years ago

https://github.com/terraref/extractors-multispectral/pull/6/files

@Paheding I merged your PR and implemented the code in the extractor framework. I have not tested it yet, but you can see how this was done here.

- Renamed your script to DirectoryExtractor.py to preserve it.
- terra_bin2csv.py is the extractor script, with 3 components. It is configured to be notified whenever any .bin file is uploaded into Clowder.
  - check_message checks metadata to see if the .bin file belongs to the NDVI or PRI sensor
  - process_message reads the bytes into a CSV and uploads it to Clowder
  - init just adds a couple of command line args
- batch_launcher and extractor_info are used in the Clowder environment.

I will try to make time to test before end of the week.
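
As a rough illustration of the check_message / process_message split described above, the sketch below uses plain Python rather than the pyClowder classes. The metadata key `sensor` and both function names are hypothetical; the real extractor's metadata layout is not shown in this thread. The sensor names are taken from the directory names mentioned in the issue description.

```python
import xml.etree.ElementTree as ET

SUPPORTED_SENSORS = {"ndviSensor", "priSensor"}


def should_process(filename, metadata):
    """Check step: accept only .bin files tagged with a supported sensor."""
    return filename.endswith(".bin") and metadata.get("sensor") in SUPPORTED_SENSORS


def bin_to_csv_text(xml_bytes):
    """Process step: turn the XML payload of one .bin file into CSV text."""
    root = ET.fromstring(xml_bytes)
    lines = ["name,value,unit"]
    for entry in root.iter("Entry"):
        lines.append(",".join([entry.get("name"), entry.get("value"), entry.get("unit")]))
    return "\n".join(lines) + "\n"
```

Keeping the check separate from the conversion means files from other sensors can be rejected before any bytes are read.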

Paheding commented 7 years ago

Sounds good. Thanks.

Sidike Paheding (Patrick), Ph.D. Postdoctoral Research Associate Center for Sustainability Saint Louis University

max-zilla commented 7 years ago

Quick update that this extractor is deployed to the VM and dependencies installed - as soon as @jdmaloney has the output directory mounted (so we can write outputs to Level_1 directory properly) I will start running.

The one thing missing mentioned by @dlebauer , which will be difficult to achieve given the separation of sensors in our extractor pipeline, is a merging of NDVI and PRI data if timestamp is the same.

This code will trigger on a dataset and convert the ndvi/pri BIN files it finds, but that will require an additional step of checking if there's a dataset for the OTHER sensor with the same timestamp, checking if the CSV's been created already and merge if so, then put a copy of the CSV in both datasets (?). I kept them separate for now.
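
The merge itself could look roughly like this once both sensors' values are keyed by timestamp. The record layout (dictionaries keyed by a timestamp string), the function name, and the sample values are assumptions for illustration only. The sketch does not address the harder part described above, which is locating the other sensor's dataset in Clowder.

```python
def merge_by_timestamp(ndvi_records, pri_records):
    """Combine two {timestamp: value} mappings into one record per timestamp.

    A timestamp seen by only one sensor keeps None for the missing value.
    """
    merged = {}
    for ts in sorted(set(ndvi_records) | set(pri_records)):
        merged[ts] = {"ndvi": ndvi_records.get(ts), "pri": pri_records.get(ts)}
    return merged


# Made-up timestamps and values, just to show the shape of the result.
ndvi = {"2017-01-19T14:00:00": 0.80, "2017-01-19T14:00:05": 0.49}
pri = {"2017-01-19T14:00:00": 0.05}
combined = merge_by_timestamp(ndvi, pri)
print(combined)
```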

Paheding commented 7 years ago

@max-zilla I installed pyClowder2 for extractor development and ran the example extractor named "wordcount.py". It gives an error that seems to be a connection problem. Any advice on this? Thanks.

robkooper commented 7 years ago

This looks like RabbitMQ is not up and running.

Did you install the Docker images for RabbitMQ and/or Clowder?

Paheding commented 7 years ago

@robkooper Yes, I installed them. Now when I run the extractor named wordcount.py, it reports "Starting to listen for messages", so it seems to be working now. Thanks.

robkooper commented 7 years ago

If you upload a text image, does it work? On a Mac you need to upload to http://youripaddresss:9000 and not to http://localhost:9000/

dlebauer commented 6 years ago

Is this currently inserting data into geostreams and BETYdb?

terraref / computing-pipeline