Closed ZongyangLi closed 7 years ago
@ZongyangLi @rmgarnett @pless
For this extractor, I would suggest that we write the summary stats (histogram) into the metadata and insert a few statistics into BETYdb. For example, we have inserted a trait called '95th quantile height'.
But the key trait from the point cloud is the height estimate calibrated to field measurements. This trait will have the same name as the trait that Maria measured, i.e. 'canopy_height'. I think it would make sense for this extractor to use the calibrated model that Roman developed in #175.
@rmgarnett what are the (slope, intercept) parameters from the model in #175?
When estimating height at the plot level, can we also estimate uncertainty?
[hand height] = 28.2cm + 0.661 * [89th height percentile]
The RMSE/MAE gives a rough estimate of L2/L1 uncertainty. I will do a more thorough analysis in January now that all height distributions are extracted.
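A minimal sketch of how the linear model quoted above could be applied, and how RMSE and MAE could be computed against hand measurements. The function names are hypothetical, and the coefficients are only the ones quoted in this thread (the final model may differ).

```python
import math

# Coefficients quoted in this thread; treat them as provisional.
INTERCEPT_CM = 28.2
SLOPE = 0.661


def calibrated_height_cm(p89_cm, intercept=INTERCEPT_CM, slope=SLOPE):
    """Predict hand-measured height (cm) from the 89th height percentile (cm)."""
    return intercept + slope * p89_cm


def rmse_and_mae(predicted, observed):
    """Return (RMSE, MAE) of predictions against paired observations."""
    errors = [p - o for p, o in zip(predicted, observed)]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    mae = sum(abs(e) for e in errors) / len(errors)
    return rmse, mae
```

RMSE penalizes large residuals more heavily (L2), while MAE weights all residuals linearly (L1), which is why the two together give a rough picture of the error.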
@rmgarnett I suspect RMSE scales with height?
From your plot it is hard to tell how the data are distributed because of overlapping points, but I gather they are strongly right-skewed. I wonder if log-transforming x and y would be appropriate, if it would weight the smaller values more evenly. The small plants are important too!
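One way the log-transform suggestion could be tried is an ordinary least-squares fit on log(x) and log(y). This is only a sketch with hypothetical function names; it is not the model used in #175, and it assumes all heights are strictly positive.

```python
import math


def fit_loglog(x, y):
    """Fit log(y) = a + b * log(x) by least squares and return (a, b)."""
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_loglog(x, a, b):
    """Back-transform the fit to the original scale: y = exp(a) * x ** b."""
    return math.exp(a) * x ** b
```

Fitting on the log scale minimizes relative rather than absolute error, so small plants are not dominated by tall ones. Note that the back-transformed prediction is closer to a median than a mean on the original scale.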
@dlebauer I now have all height distribution data for season 2, from 8/8 to 11/25, and I created CSV files of the 90th and 95th height percentiles, following @rmgarnett's research: 90th percentile, 95th percentile. Scanner3DTop data in Season 2 are much better than in Season 1, but the data from 10/13 to 11/04 are still unexpected: the PLY files from those days contain only a few points.
I am wondering whether the point cloud files from those days can be fixed. If not, what is your opinion on putting them into BETYdb?
@solmazhajmohammadi could you please check into whether we can recover useful data from 10/13 to 11/04?
@ZongyangLi we need to discuss with @rmgarnett how to implement this extractor.
@rmgarnett have you made any progress on adding uncertainty?
I will pick this up again this week.
@ZongyangLi you can go ahead and insert the data that you have. We can create another issue for adding uncertainty to the height calculations (moving forward this should be done by default ... )
@dlebauer @ZongyangLi, for the data from 10/13 to 11/04, the png files were not collected correctly, but we can get the height information from the scans that were done at ~5m.
@smarshall-bmr can you please scan the checker boards to find the pointcloud origin?
@solmazhajmohammadi, are you suggesting that we estimate the plot-level height based on the highest points in the remaining 3D data? That might differ from what we have done before, because we use all of the point cloud data to create a height histogram and calculate quantiles to make predictions.
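For reference, a quantile can be approximated from a height histogram roughly as sketched below. The function name and the histogram layout (counts plus bin edges) are assumptions for illustration, not the layout used by the existing scripts.

```python
def percentile_from_histogram(counts, bin_edges, q):
    """Approximate the q-th percentile (0-100) from histogram counts.

    counts[i] is the number of points whose height falls between
    bin_edges[i] and bin_edges[i + 1]. Linear interpolation is used
    inside the bin where the cumulative count reaches the target.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")
    target = total * q / 100.0
    cumulative = 0
    for i, count in enumerate(counts):
        if count > 0 and cumulative + count >= target:
            fraction = (target - cumulative) / count
            width = bin_edges[i + 1] - bin_edges[i]
            return bin_edges[i] + fraction * width
        cumulative += count
    return bin_edges[-1]
```

Because every point contributes to the cumulative count, a high percentile computed this way is less sensitive to a few stray high points than taking the maximum height would be.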
@ZongyangLi This could be an option, otherwise the data has been collected with a wrong setting, so we are not able to recover it.
I have been reinvestigating the hand measurements using @ZongyangLi's most-recent data. The final model may differ from what's written above, but it will be the same form. I presume the extractor will be easy to modify if we wish to change the model slightly?
Yes, we could store parameters as metadata and have the extractor pick them up (e.g. if they change by crop, year, or location).
Perfect.
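A minimal sketch of how an extractor might pick up model parameters from metadata, falling back to the coefficients quoted earlier in this thread. The `height_model` key and the parameter names are hypothetical; nothing here reflects an agreed metadata schema.

```python
# Provisional defaults taken from the model quoted in this thread.
DEFAULT_MODEL = {"intercept_cm": 28.2, "slope": 0.661, "percentile": 89}


def select_model(metadata, default=DEFAULT_MODEL):
    """Return calibration parameters, letting metadata override the defaults.

    The "height_model" key is an assumed name for illustration only.
    """
    model = dict(default)
    model.update(metadata.get("height_model", {}))
    return model


def apply_model(percentile_height_cm, model):
    """Apply the linear calibration to a percentile height in cm."""
    return model["intercept_cm"] + model["slope"] * percentile_height_cm
```

With this shape, a change in the model by crop, year, or location would only require different metadata, not a code change.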
@robkooper and @max-zilla - should this go into geostreams? clowder too? @ZongyangLi -what will be visualized?
@rmgarnett @ZongyangLi PointCloud data from 2016/08/07 to 2016/09/05 was collected with a wrong setting. There is no way to fix this dataset. Maybe we can delete them or mark them to exclude from the pipeline. @dlebauer @max-zilla any idea?
Please don't remove them unless it is clear that they contain no useful information - i.e. all points have been randomly redistributed. We can keep them but exclude them from our workflow (even if the information is not useful within the current scope of the project, others may find it useful).
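Keeping the files but excluding them from the workflow could be as simple as a date filter, sketched below. The function name is hypothetical, and the only range listed is the one reported above as collected with a wrong setting.

```python
from datetime import date

# Date range reported in this thread as collected with a wrong setting.
EXCLUDED_RANGES = [(date(2016, 8, 7), date(2016, 9, 5))]


def is_excluded(scan_date, ranges=EXCLUDED_RANGES):
    """Return True if scan_date falls inside any excluded range (inclusive)."""
    return any(start <= scan_date <= end for start, end in ranges)
```

The data stay in place for anyone who finds them useful; the pipeline simply skips them.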
One possibility:
I think that's reasonable. Unfortunately, I am not sure these points can be used to reliably estimate height, but they could be useful for some other purpose.
I could try to learn a separate model for this range and for the days afterwards, but I hesitate to do so.
@ZongyangLi @rmgarnett any updates on this issue?
I was worried about putting incorrect data into Clowder geostreams or BETYdb, so I asked for a transformation matrix to the gantry coordinate system in the metadata in order to get a plot-level height histogram. If these uncertain results can be inserted into those databases, @max-zilla could you send me instructions for Clowder geostreams?
@ZongyangLi we can write data to geostreams that we can regenerate and replace later once we have corrections to the code, absolutely.
Here is a comment I left in another issue about Geostreams: https://github.com/terraref/computing-pipeline/issues/252#issuecomment-286189327
If you look at the links there, you can see an example of how I use it. The basic approach is:
1) determine which "plot" /sensor to use. I've already created a geostreams sensor entry for each plot, and you can query for the nearest one by lat/long with this: https://github.com/terraref/extractors-metadata/blob/use-pyclowder-geostreams/sensorposition/terra_sensorposition.py#L101
sensor_data = pyclowder.geostreams.get_sensors_by_circle(connector, host, secret_key, sensor_latlon[1], sensor_latlon[0], 0.01)
2) determine which stream to use. You'll want to create a new stream for your data within the plot, e.g. "Height Histogram - Range X Pass Y" (where Range X Pass Y is the name returned in sensor_data above). Here is a code snippet where we look for an existing stream with that name and create it if it doesn't exist: https://github.com/terraref/extractors-metadata/blob/use-pyclowder-geostreams/sensorposition/terra_sensorposition.py#L129
stream_data = pyclowder.geostreams.get_stream_by_name(connector, host, secret_key, stream_name)
if not stream_data:
    stream_id = pyclowder.geostreams.create_stream(connector, host, secret_key, stream_name, sensor_id, {
        "type": "Point",
        "coordinates": [sensor_latlon[1], sensor_latlon[0], 0]
    })
else:
    stream_id = stream_data['id']
3) Add datapoints to that stream ID. https://github.com/terraref/extractors-metadata/blob/use-pyclowder-geostreams/sensorposition/terra_sensorposition.py#L161 The "metadata" JSON properties object can have whatever you want - at least the height histogram in this case.
Take a look at the code and let me know if it kind of makes sense. If it's helpful, here's the pyclowder 2 geostreams source code: https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/pyclowder2/browse/pyclowder/geostreams.py
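For step 3, the pieces a datapoint needs could be assembled roughly as below. This is a sketch only: the helper name and the `height_histogram` property key are hypothetical, and it builds plain Python values rather than calling pyclowder.

```python
from datetime import datetime, timedelta, timezone


def build_datapoint_args(lat, lon, scan_time, histogram):
    """Return (geom, starttime, endtime, properties) for one datapoint.

    scan_time must be a timezone-aware datetime. The same timestamp is
    used for start and end, which is an assumption for a single scan.
    """
    # GeoJSON points are ordered longitude, latitude, altitude.
    geom = {"type": "Point", "coordinates": [lon, lat, 0]}
    timestamp = scan_time.isoformat(timespec="seconds")
    # "height_histogram" is an illustrative key, not an agreed schema.
    properties = {"height_histogram": list(histogram)}
    return geom, timestamp, timestamp, properties
```

The coordinate order matches the `[sensor_latlon[1], sensor_latlon[0], 0]` pattern in the stream snippet above, where the first element is used as the longitude.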
@max-zilla Thanks a lot! I will take a look into it and start working on this.
@max-zilla It seems I have to use pyclowder 2. I updated pyclowder on my laptop, but when I test the example, I get the following message:
python wordcount.py
2017-03-15 15:01:13,756 [MainThread ] INFO : pika.adapters.base_connection - Connecting to 127.0.0.1:5672
2017-03-15 15:01:13,760 [MainThread ] INFO : pika.adapters.blocking_connection - Created channel=1
2017-03-15 15:01:13,877 [MainThread ] INFO : pyclowder.extractors - Waiting for messages. To exit press CTRL+C
2017-03-15 15:01:13,878 [Connector-0 ] INFO : pyclowder.connectors - Starting to listen for messages.sgsd
2017-03-15 15:02:50,139 [Thread-1 ] ERROR : pyclowder.connectors - Error in registering extractor: 400 Client Error: Bad Request for url: http://localhost:9000/api/extractors?key=r1ek3rs
Is there any step I missed?
@ZongyangLi the "error registering extractors" is not a big problem - registering just makes Clowder allow that extractor to be selected in manual extractor lists in the GUI. If you got "Waiting for messages" I think it's working properly.
you do want pyclowder 2, yes
@max-zilla
Could you give me the definitions of all the input arguments to the geostreams functions? It seems I need to build all of the arguments myself, such as sensor_data, stream_name, geom, and so on.
@ZongyangLi that is included in the geostreams source code: https://opensource.ncsa.illinois.edu/bitbucket/projects/CATS/repos/pyclowder2/browse/pyclowder/geostreams.py
def create_sensor(connector, host, key, sensorname, geom, type, region):
    """Create a new sensor in Geostreams.

    Keyword arguments:
    connector -- connector information, used to get missing parameters and send status updates
    host -- the clowder host, including http and port, should end with a /
    key -- the secret key to login to clowder
    sensorname -- name of new sensor to create
    geom -- GeoJSON object of sensor geometry
    type -- JSON object with {"id", "title", and "sensorType"}
    region -- region of sensor
    """

def create_stream(connector, host, key, streamname, sensorid, geom, properties={}):
    """Create a new stream in Geostreams.

    Keyword arguments:
    connector -- connector information, used to get missing parameters and send status updates
    host -- the clowder host, including http and port, should end with a /
    key -- the secret key to login to clowder
    streamname -- name of new stream to create
    sensorid -- id of sensor to attach stream to
    geom -- GeoJSON object of sensor geometry
    properties -- JSON object with any desired properties
    """

def create_datapoint(connector, host, key, streamid, geom, starttime, endtime, properties={}):
    """Create a new datapoint in Geostreams.

    Keyword arguments:
    connector -- connector information, used to get missing parameters and send status updates
    host -- the clowder host, including http and port, should end with a /
    key -- the secret key to login to clowder
    streamid -- id of stream to attach datapoint to
    geom -- GeoJSON object of sensor geometry
    starttime -- start time, in format 2017-01-25T09:33:02-06:00
    endtime -- end time, in format 2017-01-25T09:33:02-06:00
    properties -- JSON object with any desired properties
    """

def get_sensor_by_name(connector, host, key, sensorname):
    """Get sensor by name from Geostreams, or return None.

    Keyword arguments:
    connector -- connector information, used to get missing parameters and send status updates
    host -- the clowder host, including http and port, should end with a /
    key -- the secret key to login to clowder
    sensorname -- name of sensor to search for
    """

def get_sensors_by_circle(connector, host, key, lon, lat, radius=0):
    """Get sensor by coordinate from Geostreams, or return None.

    Keyword arguments:
    connector -- connector information, used to get missing parameters and send status updates
    host -- the clowder host, including http and port, should end with a /
    key -- the secret key to login to clowder
    lon -- longitude of point
    lat -- latitude of point
    radius -- distance in meters around point to search
    """
As an aside, not sure how you're chopping these to plots right now but this task becomes a lot easier after we have a way to stitch + clip images to plots: https://github.com/terraref/computing-pipeline/issues/265
@ZongyangLi is this extractor ready to deploy?
@dlebauer Inserting the existing 'height' trait into BETYdb is ready; the only thing I need you to confirm is whether to use 864 plots or 1728 plots. To make it an extractor in Clowder, I still need some support mentioned here: https://github.com/terraref/computing-pipeline/issues/193#issuecomment-290507753
@ZongyangLi sorry if I missed it, where is code for this extractor? I know the canopycover extractor is in https://github.com/terraref/extractors-stereo-rgb
If you can share what you have I can contribute to the Clowder part.
@max-zilla Inserting the existing 'height' trait into BETYdb is not an extractor. The 'height' data for season 2 are on my desktop, and I have a local script that can upload them to BETYdb.
To make it an extractor in Clowder, there are some things I am not sure about:
I made a comment here https://github.com/terraref/computing-pipeline/issues/193#issuecomment-270767586 to describe the support I needed and to discuss it with @dlebauer, but I haven't found the answer to this issue yet. I am sorry too if I missed any response to this.
@ZongyangLi for #2, unless @dlebauer says different I say we do your second option:
Another way of solving this problem is create several traits for a plot for one day if it is acceptable, and that will be much easier.
...when the field stitching is done, we can modify this extractor to trigger on the full merged PLY file and avoid the need to generate multiple traits, but we can do that for now just to have something running.
I pinged Solmaz and Stuart about #1.
@max-zilla This comment is from @dlebauer: 'While it is possible to add replicate measurements for a single plot (we are already doing this for field data), we should do this when it is scientifically useful and not because of the way that the data are written.'
PLY chunks... 1) need to consider sampling rates across PLY files so we dont mix and match. 2) better to stitch & subset according to a design. 3) we can deploy what we have now & replace with final version by end of May.
@max-zilla I am going to upload my code associated with plant height. Is there somewhere on GitHub where I can upload and update my code?
@ZongyangLi if using PLY data, please upload here: https://github.com/terraref/extractors-3dscanner
@max-zilla Okay! Could you please create a new directory there?
@ZongyangLi created 'plantheight' directory
I don't see the plantheight directory, but could you change it to plant_height to match the naming in BETYdb?
@dlebauer @ZongyangLi renamed to plant_height: https://github.com/terraref/extractors-3dscanner/tree/master/plant_height
created this issue to finish deployment: https://github.com/terraref/computing-pipeline/issues/303
There was an error scanning the PLY, but code is 90% there. Last missing piece is how to extract the actual values to geostreams - I can create the datapoints if you can show how to get an array of values from the histogram.npy file
@ZongyangLi will mention on phone, but remaining piece is to extract values from height file for BETYdb and histogram file for geostreams. Right now these outputs are just uploaded to Clowder as .npy files.
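Getting an array of values out of a `.npy` file is a single NumPy call, as in the sketch below. The function name is hypothetical, and the shape and meaning of the array inside histogram.npy are not specified in this thread, so the result is returned as-is.

```python
import numpy as np


def histogram_to_list(npy_path):
    """Load a saved NumPy array and return it as plain (nested) Python lists.

    Plain lists of Python numbers are JSON-serializable, so the result
    could be placed in a datapoint's properties object.
    """
    histogram = np.load(npy_path)
    return histogram.tolist()
```

`tolist()` converts NumPy scalars to native Python numbers, which avoids serialization errors when the values are later written out as JSON.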
@max-zilla I think there are still some other issues with this extractor. Maybe we could ignore them for now, but it will not be a true pipeline without solving these problems.
@solmazhajmohammadi is going to share the point cloud offset to gantry for #2.
For #1, calibration was in August and validation in November. We saw a misalignment of ~4cm between two different point clouds, and suspect that the alignment is shifting with temperature. @smarshall-bmr will do that scan. The maximum we have recorded so far is ~5cm between hot summer and mid-winter.
@ZongyangLi will look at 2017 data for the merge and ignore season 1 issues for the moment.
I will add the geostreams function and then we can deploy initial version.
@smarshall-bmr it seems that the misalignment in point clouds is due to temperature change. It would be great if you could do multiple scans at different times of day to see how it varies with temperature. Thanks
@dlebauer I'm going to insert height data into BETYdb. Here is an example CSV file. Could you please review it? Thanks!
https://drive.google.com/open?id=0B5QCp_Onc6nOUGR0WVpsREl5NWc
@zongyangli that looks good. Go ahead and upload. Thanks!
Description
We have scripts to generate plot level height histogram on Roger. The next step is to create a pipeline for this extractor.
Completion Criteria