Closed dlebauer closed 8 years ago
Is there a way to get the geo location of the weather station or sensor, and link station id with the observations?
@markus-radermacher-lemnatec
It appears that the json files are invalid -
[2016-04-11]$ jsonlint 2016-04-11_01-17-18_enviromentlogger.json
[Error: Parse error on line 4151:
"environment_
--------------------^
Expecting 'EOF', got ',']
One way to fix this: enclose the contents in [] (make the file begin with [ and end with ]), and change }{ to },{
echo "[`cat 2016-04-11_01-17-18_enviromentlogger.json` ]" | sed 's/}{/},{/g' > test.json
jsonlint test.json
## OK
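For anyone who prefers to do this from Python, here is a minimal sketch of the same two steps as the echo/sed one-liner. The function name is made up for illustration, and it assumes that the sequence }{ never occurs inside a string value in the logger output.

```python
import json


def wrap_concatenated_objects(text):
    """Turn '{...}{...}' into '[{...},{...}]', mirroring the echo/sed one-liner."""
    # Assumption: '}{' only ever marks the boundary between two top-level objects.
    return "[" + text.strip().replace("}{", "},{") + "]"


# Example use on an in-memory string; json.loads raises ValueError if the
# result is still not valid JSON.
sample = '{"environment_sensor_set_reading": {"a": 1}}{"environment_sensor_set_reading": {"a": 2}}'
readings = json.loads(wrap_concatenated_objects(sample))
```

This reads the whole file into memory, which should be acceptable for a one-off repair but is worth keeping in mind for large files.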
However, we should discuss the more general changes in the logger above before implementing this simple fix.
The CO2 sensor data have .bil extensions and each observation is written into a separate file; these observations should have timestamp + concentration and should be saved as a time series at daily or hourly time steps.
irradiance/spectrum as a .bil file might be helpful for other purposes, but a 1-D json array would be easier for the hyperspectral workflow.
regarding 1) It's just a setting within the software that can easily be changed to 1h or 1d. Right now the time between two files should be long enough not to create too many files and short enough to ensure that not too many data points are lost in case of a problem. The environment logger software in its current state is very simple, but it works well and I prefer to keep it as it is for now. I will set the timespan of each file to 1h, ok?
regarding 2) Could you provide an example of your preferred environment logger output file? Then we will change the output according to your template. Right now the output file format is just a first suggestion.
regarding 3) If you prefer to have the spectrum written into a separate file, please provide an example file.
CO2 sensor: right now it is located inside the camera box, thus by definition it is not an ambient sensor. Its distance to the canopy is roughly 2m during measurements (except fluorescence: then it is closer to 1m). The other ambient sensors are mounted on top of the gantry, facing the open sky, roughly 5m above even the fully grown plants.
It is possible to get its values from both software parts, the environment logger and the moving sensor used in the gantry script. What do you need?
regarding 4)
Using the same extension bin for most of the sensors is just for consistency. The idea behind that is that the file extension should not be used directly, instead there is a description of the file in the meta data, "Output data format": "text/xml" in this case.
I guess you have to set up a data workflow for each sensor individually, so this should not be a problem. If this .bin extension is a problem let me know and we work out a solution for that.
regarding 5)
What is the difference from 3)? Could you give a clear definition?
regarding 6)
Could you send me the original file? Because of the ongoing repair of the gantry, remote access is shut off. Most probably the software was terminated during file writing, but I will check.
@markus-radermacher-lemnatec
Enclosing the file in [] and separating }{ by a comma is detailed in the comment above. This is simple enough to implement immediately. However, if we change the output it may be unnecessary.
@markus-radermacher-lemnatec @dlebauer @FlyingWithJerome Jerome is working to parse the environmental sensor data for combination with the hyperspectral imager data. He finds the ES data is not properly formatted (below). Can LemnaTec please address this so we do not need to write workarounds? Thanks! Jerome says: Right now I'm working on parsing the environmental logger, and I noticed that there's one problem.
Each JSON file includes multiple JSON objects, but they are simply concatenated instead of being placed in a JSON array. If I read the file directly, Python will ignore everything but the last one, since this is illegal under the RFC standard.
As a workaround, I wrote a formatting function to reformat the JSON file into a JSON array. It works, but I am concerned about runtime efficiency, since the reformatting costs .3ms for each file according to the timing in bash.
So could you ask the environmental logger personnel to export a JSON array for each file instead of multiple JSON objects? It would clearly improve runtime efficiency, especially when we have lots of JSON files and each file has over 161,800 lines.
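Until the logger writes an array, one possible alternative to rewriting each file is to decode the concatenated objects one after another with the standard library's json.JSONDecoder.raw_decode. This is only a sketch, not the reformatting function mentioned above: the helper name is hypothetical and it assumes the whole file fits in memory as one string.

```python
import json


def iter_json_objects(text):
    """Yield each top-level JSON value from a string of concatenated JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so step over it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # raw_decode returns the decoded value and the index where it stopped.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj
```

Calling list(iter_json_objects(text)) gives the same list that the bracketed and comma-separated file would produce, without modifying the input file.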
@czender is more than sed 's/}{/},{/g' required to correct the error?
@markus-radermacher-lemnatec what is the timeline for fixing this?
@FlyingWithJerome will answer your question @dlebauer i'm just a messenger :)
@dlebauer @czender Sorry, I just noticed this discussion thread. I reformatted them like this: [ {"environment_sensor_set_reading": {...}} , { "environment_sensor_set_reading": {...}}, {"environment_sensor_set_reading" : {...}} , ...... ] so yes, as you mentioned above, adding square brackets and commas would fix this problem.
@dlebauer , I cannot do anything about it. I need to push Markus.
@dlebauer the output format for the environment logger will be changed as suggested to a json list; the fix will be available from the 3rd of May on.
@dlebauer Hi, this is André from LemnaTec. Since Markus is ill this week, I will jump in to fulfill our promise. Did I understand correctly that, in the first place, you are happy if the environment logging json files are valid and formatted as json arrays?
@Ndrey yes, it is okay if the environment logging is provided as valid json files. It isn't clear what you mean by as arrays (could you paste an example?) but even key-value pairs are okay.
That is okay for format, but for content and frequency please see additional comments above.
@dlebauer With json array I only meant to put the elements in square brackets and separate them with commas as suggested.
Markus already changed it to a lower frequency, so a new file is created each hour instead of every 2 minutes, but the file size is then around 170 MB each...
@Ndrey @dlebauer to be a bit more general, please ensure that all sensor files indicated (by .json suffix) as storing JSON are actually valid JSON. Once we convert them to netCDF, their size will be significantly reduced.
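A quick way to check this across a directory tree could look like the sketch below. The function name and the idea of scanning by the .json suffix are illustrative assumptions, not an existing tool in this project.

```python
import json
from pathlib import Path


def find_invalid_json(root):
    """Return (path, error message) for every *.json file under root that fails to parse."""
    failures = []
    for path in sorted(Path(root).rglob("*.json")):
        try:
            with open(path, "r", encoding="utf-8") as handle:
                json.load(handle)
        except ValueError as err:  # JSON and Unicode decoding errors both derive from ValueError
            failures.append((path, str(err)))
    return failures
```

Running find_invalid_json on the EnvironmentLogger directory and printing the returned pairs would list every file that still needs to be fixed.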
See below the updated output of the environment logger, which now uses a more structured format. Could you confirm that the layout of the new format fits your needs and meets the json standard? After your confirmation I will update the software on the gantry system.
{ "environment_sensor_fixed_infos": [ { "par_sensor": { "fixed_info_0": "...", "fixed_info_1": "..." }, "weather_station": { "fixed_info_0": "...", "fixed_info_1": "..." }, "spectrometer": { "fixed_info_0": "...", "fixed_info_1": "..." } } ], "environment_sensor_readings": [ { "timestamp": "2016.05.04-15:00:04", "weather_station": { "sunDirection": { "value": "error", "unit": "error", "rawValue": "error" }, "airPressure": { "value": "error", "unit": "error", "rawValue": "error" }, "brightness": { "value": "error", "unit": "error", "rawValue": "error" }, "relHumidity": { "value": "error", "unit": "error", "rawValue": "error" }, "temperature": { "value": "error", "unit": "error", "rawValue": "error" }, "windDirection": { "value": "error", "unit": "error", "rawValue": "error" }, "precipitation": { "value": "error", "unit": "error", "rawValue": "error" }, "windVelocity": { "value": "error", "unit": "error", "rawValue": "error" } }, "sensor par": { "value": "error", "unit": "error", "rawValue": "error" }, "spectrometer": { "maxFixedIntensity": "123", "integration time in µs": "123", "wavelength": [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 ], "spectrum": [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 ] } }, { "timestamp": "2016.05.04-15:00:05", "weather_station": { "sunDirection": { "value": "error", "unit": "error", "rawValue": "error" }, "airPressure": { "value": "error", "unit": "error", "rawValue": "error" }, "brightness": { "value": "error", "unit": "error", "rawValue": "error" }, "relHumidity": { "value": "error", "unit": "error", "rawValue": "error" }, "temperature": { "value": "error", "unit": "error", "rawValue": "error" }, "windDirection": { "value": "error", "unit": "error", "rawValue": "error" }, "precipitation": { "value": "error", "unit": "error", "rawValue": "error" }, "windVelocity": { "value": "error", "unit": "error", "rawValue": "error" } }, 
"sensor par": { "value": "error", "unit": "error", "rawValue": "error" }, "spectrometer": { "maxFixedIntensity": "123", "integration time in µs": "123", "wavelength": [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 ], "spectrum": [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 ] } }, { "timestamp": "2016.05.04-15:00:05", "weather_station": { "sunDirection": { "value": "error", "unit": "error", "rawValue": "error" }, "airPressure": { "value": "error", "unit": "error", "rawValue": "error" }, "brightness": { "value": "error", "unit": "error", "rawValue": "error" }, "relHumidity": { "value": "error", "unit": "error", "rawValue": "error" }, "temperature": { "value": "error", "unit": "error", "rawValue": "error" }, "windDirection": { "value": "error", "unit": "error", "rawValue": "error" }, "precipitation": { "value": "error", "unit": "error", "rawValue": "error" }, "windVelocity": { "value": "error", "unit": "error", "rawValue": "error" } }, "sensor par": { "value": "error", "unit": "error", "rawValue": "error" }, "spectrometer": { "maxFixedIntensity": "123", "integration time in µs": "123", "wavelength": [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 ], "spectrum": [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0 ] } } ] }
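To get a feel for how the proposed layout would be consumed, here is a small sketch that pulls a temperature time series out of a parsed document. The key names and the timestamp pattern are taken from the sample above; the function names are hypothetical, and treating the string "error" as a missing value is an assumption.

```python
import json
from datetime import datetime

TIMESTAMP_FORMAT = "%Y.%m.%d-%H:%M:%S"  # matches "2016.05.04-15:00:04" in the sample


def to_float_or_none(field):
    """Convert a {"value": ..., "unit": ..., "rawValue": ...} entry; "error" becomes None."""
    try:
        return float(field["value"])
    except (KeyError, TypeError, ValueError):
        return None


def temperature_series(document):
    """Return [(datetime, temperature or None), ...] from a parsed logger document."""
    series = []
    for reading in document["environment_sensor_readings"]:
        when = datetime.strptime(reading["timestamp"], TIMESTAMP_FORMAT)
        value = to_float_or_none(reading["weather_station"]["temperature"])
        series.append((when, value))
    return series
```

With the sample above, every temperature would come back as None because all values are the placeholder string "error".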
Please at the same time fix the spelling of the filenames from "...enviromentlogger.json" to "...environmentlogger.json".
The output of the spectrometer is 'raw' counts.
You need to use the attached calibration files to convert it to units of µW m-2 s-1. Careful: you need to take the bandwidth of the chip (0.4nm) into account if you want to convert to µmol m-2 s-1.
I added the calibration files to the gantry ftp:
/gantry_data/LemnaTec/EnvironmentLogger/CalibrationData
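As a rough illustration of the bandwidth step, the sketch below converts calibrated per-pixel values into a photon flux using the photon energy E = h c / wavelength. It is not based on the calibration files themselves: it assumes the calibrated values are spectral irradiance per nm (µW m-2 nm-1) for each pixel, and the function name and argument layout are made up for this example.

```python
# Physical constants (exact SI values).
PLANCK = 6.62607015e-34       # J s
LIGHT_SPEED = 299792458.0     # m / s
AVOGADRO = 6.02214076e23      # 1 / mol


def photon_flux_umol(wavelengths_nm, irradiance_uw, bandwidth_nm=0.4):
    """Sum per-pixel photon flux in umol m-2 s-1.

    Assumption: irradiance_uw holds calibrated spectral irradiance in
    uW m-2 nm-1 for each pixel, and every pixel covers bandwidth_nm.
    """
    total = 0.0
    for wavelength_nm, e_uw in zip(wavelengths_nm, irradiance_uw):
        # Energy carried by one mole of photons at this wavelength, in J / mol.
        energy_per_mol = PLANCK * LIGHT_SPEED * AVOGADRO / (wavelength_nm * 1e-9)
        # uW m-2 nm-1 * nm = uJ s-1 m-2; dividing by J / mol gives umol m-2 s-1.
        total += e_uw * bandwidth_nm / energy_per_mol
    return total
```

As a sanity check, 1 W m-2 at 550 nm corresponds to about 4.6 µmol m-2 s-1, which is the usual rule-of-thumb conversion factor for green light.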
@dlebauer The file Markus pasted here passed the json validator from Newtonsoft as well as the http://jsonlint.com/ validator
@dlebauer @czender Here are the two Calibration files that Tino mentioned: Calibrations.zip
..and here is the latest EnvironmentalLogger json file, from this morning (5/5): 2016-05-05_07-20-52_enviromentlogger.json.zip
e: @czender doesn't look like the "enviROMent" typo was fixed yet, FYI.
@markus-radermacher-lemnatec @TinoDornbusch could you make sure to correct the spelling of enviROMent in the environmentlogger file?
@dlebauer I have no access to the source code to make these changes. I will try to reach Markus on holiday.
@markus-radermacher-lemnatec @TinoDornbusch What is the status of the CO2 sensor - are you going to fix it on the gantry and combine the data stream with the environmental logger?
Could someone help out with a batch rename script for ftp to rename the environment jsons?
@TinoDornbusch I've made a separate issue (#29) for the fixing and renaming of the logger files.
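In case it helps with #29, here is a sketch of a batch rename for a local copy of the files. The function names are hypothetical, and it works on a local directory tree rather than over ftp; for a remote server, Python's ftplib offers FTP.rename(fromname, toname), which could be used with the same name mapping.

```python
from pathlib import Path


def corrected_name(name):
    """Return the file name with the 'enviromentlogger' typo fixed."""
    return name.replace("enviromentlogger", "environmentlogger")


def rename_logger_files(root, dry_run=True):
    """Rename misspelled logger files under root; returns the (old, new) path pairs."""
    renamed = []
    # sorted() collects all matches first, so renaming does not disturb the scan.
    for old in sorted(Path(root).rglob("*enviromentlogger.json")):
        new = old.with_name(corrected_name(old.name))
        if not dry_run:
            old.rename(new)
        renamed.append((old, new))
    return renamed
```

The default dry_run=True only reports what would be renamed, which seems like a sensible first step before touching the raw data.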
@FlyingWithJerome Please alter EnvironmentalLoggerAnalyser.py to work with 2016-05-05_07-20-52_enviromentlogger.json as the new default filetype. It should not have any JSON issues, and so should be opened read-only (not r+, which causes failure on the Roger computer since we don't have write permission on the input directory). Print a warning and exit if there is a JSON issue. Assume from now on that all logger files have valid JSON, meaning that old logger files will have been previously run through the batch script that @dlebauer mentions above to fix the JSON (maybe you can help him with that?)
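The requested behaviour could be sketched roughly as follows. This is not the actual code in EnvironmentalLoggerAnalyser.py; the function name is a placeholder and the exit status of 1 is an assumption.

```python
import json
import sys


def load_logger_file(path):
    """Open the logger file read-only and parse it; warn and exit on invalid JSON."""
    with open(path, "r") as handle:  # "r" rather than "r+": no write permission needed
        try:
            return json.load(handle)
        except ValueError as err:
            # json raises ValueError (or a subclass of it) on malformed input.
            sys.stderr.write("Warning: %s is not valid JSON (%s)\n" % (path, err))
            sys.exit(1)
```

Because nothing is written back, this works on an input directory without write permission.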
@czender I just tested it and EnvironmentalLoggerAnalyser.py can deal with it without any changes. The total running time is 26.894s and the output netCDF is 7.7MB. However, we should not remove the reformatting function or change the reading mode. 2016-05-05_07-20-52_enviromentlogger.json is not valid JSON. We still need to reformat the "environmental_sensor_set_reading" entries into an array. If we do not reformat it, we will lose 1757 of the 1758 readings, keeping only the last one.
@FlyingWithJerome we are going to fix all of the invalid files (#29) so you should not have to deal with them ...
@dlebauer I'm sorry, I only just noticed that. Thank you!
@FlyingWithJerome all of the environmental data through 2015-04-13 have been corrected. I've checked that these files are valid json. Could you please check that your script works with these?
Spelling error is fixed from 6.5.2016 on.
@dlebauer Yes, I reran the script after removing the reformatting function. It works well, thank you! There's a wrinkle for me that seems to be OS X only: the "sed" command needs one more option there, so I added an empty string to solve it: sed -i '' 's/}{/},{/g' $file
Either the logger data on Roger have not been updated, or there is a problem with the updated files, or there is a problem with EnvironmentalLoggerAnalyser.py. Same problem with 2016-04-07 files:
ender@cg-gpu01:~$ python ${HOME}/terraref/computing-pipeline/scripts/hyperspectral/EnvironmentalLoggerAnalyser.py /projects/arpae/terraref/raw_data/ua-mac/EnvironmentLogger/2016-05-02/2016-05-02_12-10-52_enviromentlogger.json ~/rgr
Processing /projects/arpae/terraref/raw_data/ua-mac/EnvironmentLogger/2016-05-02/2016-05-02_12-10-52_enviromentlogger.json....
Traceback (most recent call last):
  File "/home/zender/terraref/computing-pipeline/scripts/hyperspectral/EnvironmentalLoggerAnalyser.py", line 212, in <module>
    fileInputLocation)
  File "/home/zender/terraref/computing-pipeline/scripts/hyperspectral/EnvironmentalLoggerAnalyser.py", line 101, in JSONHandler
    return json.loads(fileHandler.read()), wavelength, spectrum
  File "/sw/python-2.7.10/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/sw/python-2.7.10/lib/python2.7/json/decoder.py", line 369, in decode
    raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 4151 column 2 - line 614201 column 2 (char 101524 - 15025186)
@czender per https://github.com/terraref/reference-data/issues/26#issuecomment-217488670 above "all of the environmental data through 2015-04-13 have been corrected."
From the command you executed, it looks like you were trying to process data from 2016-05-02. These data are still transferring, and I will fix them after we get the last of the old malformed enviromentlogger files -
@dlebauer ... The CO2 sensor moves and is hence a moving sensor. Relocating it to the top of the gantry and changing the datastream require time and work resources. I cannot do that.
I find measuring the CO2 concentration close to the canopy a valuable measurement
@dlebauer Of course, if you want us to implement this we can, but it will take some time.
Moreover, we are in a measurement campaign and gantry downtimes should be minimal.
My suggestion would be to do that after the experiment, along with other upgrades.
I find measuring the CO2 concentration close to the canopy a valuable measurement
That is what I am confused about - it is not clear what information this will provide? I don't think it will be possible to resolve plot-scale (~2x4 m plots) effects on atmospheric [CO2] at > 2m above the canopy. LiCOR provides an excellent book explaining the theory of eddy covariance, but we are not set up to use that technique. I am afraid that the confounding effects of moving the sensor around will make it more difficult to estimate the ambient [CO2] above the canopy layer that would otherwise be useful as a boundary condition / driver for crop modeling in the same way light, rain, temperature etc can be used.
It isn't the most essential sensor so if it is too much trouble to move this year that is okay. But it would be nice if the files were written out as a time series and saved hourly with time, concentration, and position in x,y,z space rather than a new folder + metadata every 5 seconds.
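To make the request concrete, one possible layout is a single CSV file per hour with one row per observation. The sketch below is only an illustration: the column names, the ppm unit, the metre units for position, and the file naming pattern are all assumptions, not an agreed format.

```python
import csv
from datetime import datetime

FIELDS = ["time", "co2_ppm", "x_m", "y_m", "z_m"]  # hypothetical column names and units


def hourly_file_name(when):
    """One file per hour, e.g. 'co2_2016-05-04_15.csv' (naming is illustrative)."""
    return when.strftime("co2_%Y-%m-%d_%H.csv")


def write_rows(stream, rows):
    """Write (datetime, concentration, x, y, z) tuples as CSV rows with a header."""
    writer = csv.writer(stream)
    writer.writerow(FIELDS)
    for when, co2, x, y, z in rows:
        writer.writerow([when.isoformat(), co2, x, y, z])
```

A row in such a file would carry the timestamp, the concentration, and the gantry position together, so no per-observation folder or metadata file would be needed.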
@TinoDornbusch
the environmental sensor data still writes out "?" which I suspect should be "micro"?
"sensor par": {
"value": "258.7112684654",
"unit": "?mol/(m^2*s)",
"rawValue": "5.65840556708583"
},
"spectrometer": {
"maxFixedIntensity": "16383",
"integration time in ?s": "5000",
@dlebauer ...yes it is µ... I will have our IT guys fix this.
you should get units of umol and us
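Until the logger itself writes ASCII units, a reader could normalise the strings on its side. This is a small sketch with a hypothetical helper name; it assumes a "?" only stands for a lost micro sign when it directly precedes "mol" or "s", as in the two examples above.

```python
import re

# A '?' immediately followed by 'mol' or 's' as a whole word is treated as a lost micro sign.
_LOST_MICRO = re.compile(r"\?(?=(?:mol|s)\b)")


def ascii_unit(text):
    """Replace the micro sign, Greek mu, or a '?' left by a failed encoding with 'u'."""
    text = text.replace("\u00b5", "u").replace("\u03bc", "u")
    return _LOST_MICRO.sub("u", text)
```

Applied to the strings above, this gives "umol/(m^2*s)" for the unit and "integration time in us" for the key.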
@dlebauer CO2 sensor is now in the environmentlogger.json. I still have it implemented in the moving sensor data acquisition. Will remove if you do not wish to have positional information.
Sensor will be moved on top of the gantry during winter upgrades.
{ "environment_sensor_fixed_infos": { "par_sensor": { "manufacturer": "www.apogeeinstruments.com", "model": "SQ214", "location in gantry system": "on top" }, "co2_sensor": { "sensor manufacturer": "Vaisala", "model": "Carbocap CO2 Probe GMP343 A1C1B0N0N0B", "sensor serial number": "L3420008", "additional info": "SO 5530060878", "calbration date": "2015.08.18", "location in gantry system": "camera box", "analog digital interface": "WAGO 750-478"
The first environmental data samples (e.g. 2016-02-15_21-20-08_enviromentlogger.json.txt) are in a json key:value format.
I propose the following changes:
spectrum and wavelength but nothing measuring irradiance
@markus-radermacher-lemnatec