terraref / computing-pipeline

Pipeline to Extract Plant Phenotypes from Reference Data
BSD 3-Clause "New" or "Revised" License

hdf/netcdf extractor for clowder #145

Closed robkooper closed 7 years ago

robkooper commented 8 years ago

Given the netcdf files we receive, we should have an extractor that takes the properties of the netcdf/hdf file and inserts them as metadata in clowder.

Another user in the CyberGIS group has also brought this up as an item of interest.

Also see https://opensource.ncsa.illinois.edu/jira/browse/CATS-628

yanliu-chn commented 8 years ago

do you want the generic metadata for netcdf? here is the output of gdalinfo from @czender Charles' hyperspectral netcdf output, see if this is what you want:

Driver: HDF5Image/HDF5 Dataset
Files: output/0596c17f-2e4c-4d43-9d77-cde8ffbde663.nc
Size is 1600, 468
Coordinate System is `'
Metadata:
  Conventions=CF-1.5
  created_by=ubuntu
  gantry_system_fixed_metadata_gantry_fixed_data_1=Todo
  gantry_system_fixed_metadata_gantry_fixed_data_2=Todo
  gantry_system_fixed_metadata_System_manufacturer=LemnaTec Corp.
  gantry_system_variable_metadata_Camnera_box_light_1_is_on=True
  gantry_system_variable_metadata_Camnera_box_light_2_is_on=True
  gantry_system_variable_metadata_Camnera_box_light_3_is_on=True
  gantry_system_variable_metadata_Camnera_box_light_4_is_on=True
  gantry_system_variable_metadata_Gantry_Speed_in_]_Direction=0
  gantry_system_variable_metadata_Position_in_]_Direction=0.97
  gantry_system_variable_metadata_Time=04/07/2016 16:15:45
  header_info_AOI_height=960
  header_info_AOI_left=480
  header_info_AOI_top=600
  header_info_AOI_width=1600
  header_info_Array_Pixel_Pitch=6.5
  header_info_AverageDispersion=0.63986398
  header_info_bands=955
  header_info_byte_order=0
  header_info_Col_binning=1
  header_info_data_type=12
  header_info_default_bands={140,234,500}
  header_info_description={[HEADWALL Hyperspec III]}
  header_info_file_type=ENVI Standard
  header_info_FrameIndex=frameIndex.txt
  header_info_header_offset=0
  header_info_HSIII_VERSION=E51215 vs64
  header_info_interleave=bil
  header_info_Lens_EFL=17
  header_info_Lens_folder=
  header_info_lines=468
  header_info_Nuc_folder=
  header_info_Pixel0=3.100546185
  header_info_POST_AOI_height=955
  header_info_POST_AOI_left=0
  header_info_POST_AOI_top=5
  header_info_POST_AOI_width=1600
  header_info_POST_Col_binning=1
  header_info_POST_Row_binning=1
  header_info_Row_binning=1
  header_info_samples=1600
  header_info_sensor_type=Unknown
  header_info_Serial_Number=SN-G4-384
  history=Wed Aug 17 23:14:41 2016: ncks -A /tmp/terraref_tmp_jsn.nc.pid3692.fl00.tmp /tmp/terraref_tmp_att.nc.pid3692.fl00.tmp
Wed Aug 17 23:14:01 2016: python /home/ubuntu/terraref-hyperspectral-input-sample/0596c17f-2e4c-4d43-9d77-cde8ffbde663_raw /tmp/terraref_tmp_jsn.nc.pid3692.fl00.tmp
  history_of_appended_files=Wed Aug 17 23:14:41 2016: Appended file /tmp/terraref_tmp_jsn.nc.pid3692.fl00.tmp had following "history" attribute:
Wed Aug 17 23:14:01 2016: python /home/ubuntu/terraref-hyperspectral-input-sample/0596c17f-2e4c-4d43-9d77-cde8ffbde663_raw /tmp/terraref_tmp_jsn.nc.pid3692.fl00.tmp
  NCO="4.6.1"
  Project=TERRAREF
  sensor_fixed_metadata_sensor_description=Todo
  sensor_fixed_metadata_sensor_manufacturer=Headwall Scientific
  sensor_fixed_metadata_sensor_product_name=VNIR
  sensor_fixed_metadata_sensor_purpose=Todo
  sensor_fixed_metadata_sensor_serial_number=Todo
  sensor_variable_metadata_constmirrorpos=0
  sensor_variable_metadata_createdatacube=0
  sensor_variable_metadata_exposure=45
  sensor_variable_metadata_frameperiod=50
  sensor_variable_metadata_speed=100
  sensor_variable_metadata_startpos=-70
  sensor_variable_metadata_stoppos=70
  sensor_variable_metadata_useexternaltrigger=0
  sensor_variable_metadata_userotatingmirror=0
  terraref_hostname=hyperspectral-ex-vm
  terraref_script=terraref.sh
  terraref_version=4.6.1
  title=None given (supply with --trr ttl="Title")
  user_given_metadata_and_so_on_and_so_on...=...
  user_given_metadata_experiment_info_1=...
  user_given_metadata_first_wheat_test_by_Markus_Radermacher=
  _NCProperties=version=1|netcdflibversion=4.4.1|hdf5libversion=1.8.17
Corner Coordinates:
Upper Left  (    0.0,    0.0)
Lower Left  (    0.0,  468.0)
Upper Right ( 1600.0,    0.0)
Lower Right ( 1600.0,  468.0)
Center      (  800.0,  234.0)
Band 1 Block=1600x1 Type=UInt16, ColorInterp=Undefined
  Metadata:
    xps_img_long_name=Exposure counts
    xps_img_meaning=Counts on scale from 0 to 2^16-1 = 65535
    xps_img_units=1
    xps_img__Netcdf4Dimid=0
Band 2 Block=1600x1 Type=UInt16, ColorInterp=Undefined
  Metadata:
    xps_img_long_name=Exposure counts
    xps_img_meaning=Counts on scale from 0 to 2^16-1 = 65535
    xps_img_units=1
    xps_img__Netcdf4Dimid=0
Band 3 Block=1600x1 Type=UInt16, ColorInterp=Undefined
  Metadata:
    xps_img_long_name=Exposure counts
    xps_img_meaning=Counts on scale from 0 to 2^16-1 = 65535
    xps_img_units=1
    xps_img__Netcdf4Dimid=0
Band 4 Block=1600x1 Type=UInt16, ColorInterp=Undefined
  Metadata:
    xps_img_long_name=Exposure counts
    xps_img_meaning=Counts on scale from 0 to 2^16-1 = 65535
    xps_img_units=1
    xps_img__Netcdf4Dimid=0
Band 5 Block=1600x1 Type=UInt16, ColorInterp=Undefined
  Metadata:
    xps_img_long_name=Exposure counts
    xps_img_meaning=Counts on scale from 0 to 2^16-1 = 65535
    xps_img_units=1
    xps_img__Netcdf4Dimid=0
Band 6 Block=1600x1 Type=UInt16, ColorInterp=Undefined
  Metadata:
    xps_img_long_name=Exposure counts
    xps_img_meaning=Counts on scale from 0 to 2^16-1 = 65535
    xps_img_units=1
    xps_img__Netcdf4Dimid=0
...
czender commented 8 years ago

FYI gdalinfo above collapses (loses) the group structure of the netCDF4 metadata, and represents it as a flat namespace by joining path names together with underscores. For example, gantry_system_fixed_metadata is a group.
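
To make the loss concrete, here is a minimal sketch of underscore flattening. The `flatten` helper and the two nested dictionaries are hypothetical stand-ins for a netCDF4 group tree, not anything gdalinfo exposes; only the joined key name is taken from the listing above.

```python
def flatten(tree, prefix=""):
    """Join nested group/attribute names with underscores, as in the flat listing."""
    flat = {}
    for name, value in tree.items():
        key = f"{prefix}_{name}" if prefix else name
        if isinstance(value, dict):
            # A nested dict stands in for a group: recurse with the longer prefix.
            flat.update(flatten(value, key))
        else:
            flat[key] = value
    return flat


# Two different hypothetical layouts: a group holding an attribute, and a
# root-level attribute whose own name happens to contain underscores.
grouped = {"gantry_system_fixed_metadata": {"System_manufacturer": "LemnaTec Corp."}}
ungrouped = {"gantry_system_fixed_metadata_System_manufacturer": "LemnaTec Corp."}

# Both flatten to the same key, so the group boundary cannot be recovered.
print(flatten(grouped) == flatten(ungrouped))
```

Because underscores also appear inside ordinary names, splitting the flat key again cannot tell where the group name ends.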

yanliu-chn commented 8 years ago

i see. @czender , is there a tool to extract structured metadata for netcdf?

czender commented 8 years ago

This extracts all metadata:

ncks --cdl -m -M /home/zender/a33641c2-8a1e-4a63-9d33-ab66717d6b8a.nc

This extracts only metadata pertinent to variable "y":

ncks --cdl -v y -m /home/zender/a33641c2-8a1e-4a63-9d33-ab66717d6b8a.nc

yanliu-chn commented 8 years ago

cool. thanks! @czender does ncks support json output format?

czender commented 8 years ago

No, ncks does not output json. It can output CDL and NcML.

yanliu-chn commented 8 years ago

i see. Thanks!

dlebauer commented 8 years ago

FYI ncdump-json on github.

robkooper commented 8 years ago

Another option is to use https://github.com/hay/xml2json and do ncks -xml file | xml2json

gsrohde commented 8 years ago

If you do use that library, you should be aware that it throws away some of the XML information. For example, it converts the ordered sequence of elements <a>text</a><b>text</b> into the JavaScript object (hash) { "a": "text", "b": "text" }. If the order of elements in your XML documents is not significant, then of course this doesn't really matter. But there may be other cases where information is thrown away, cases not so readily discernible from the documentation. I'm not that familiar with the netcdf/hdf formats, so I can't really evaluate ncdump-json in this regard.
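
A small Python sketch of the ordering problem. `naive_to_dict` is a hand-rolled, hypothetical converter in the same spirit as tag-keyed mappings; it is not xml2json's actual code.

```python
import xml.etree.ElementTree as ET


def naive_to_dict(element):
    """Map child tags to text values, grouping repeated tags into lists (lossy)."""
    result = {}
    for child in element:
        if child.tag in result:
            # Repeated tag: collect the values in a list under the single key.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(child.text)
        else:
            result[child.tag] = child.text
    return result


first = ET.fromstring("<e><a>1</a><b>2</b><a>3</a></e>")
second = ET.fromstring("<e><a>1</a><a>3</a><b>2</b></e>")

# The two documents order their children differently...
print([c.tag for c in first], [c.tag for c in second])
# ...but the tag-keyed mapping produces the same dictionary for both.
print(naive_to_dict(first) == naive_to_dict(second))
```

Once the children are keyed by tag, the interleaving of `a` and `b` is gone, so the two inputs cannot be told apart afterwards.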

ghost commented 8 years ago

convert xml to JSON

czender commented 8 years ago

Typo above: the ncks commands to produce XML output are

ncks --xml in.nc # entire file
ncks --xml -m in.nc # variable and group metadata
ncks --xml -m -M in.nc # variable and group and global metadata

Hopefully you can pipe these to xml2json as Rob suggests...

Zodiase commented 8 years ago

Do we want the entire output file or just some specific variables or groups?

Also, is there a metadata field name that should be used? The entire output JSON would be stuffed in there.

czender commented 8 years ago

The people requesting the feature should answer this :)

yanliu-chn commented 8 years ago

@dlebauer ncks -m output:

netcdf 0596c17f-2e4c-4d43-9d77-cde8ffbde663 {
  dimensions:
    wavelength = 955 ;
    wvl_nvr = 1024 ;
    x = 1600 ;
    y = 468 ;
  variables:
    float flx_dwn(wavelength) ;
      flx_dwn:long_name = "Downwelling spectral irradiance" ;
      flx_dwn:standard_name = "surface_downwelling_radiative_flux_per_unit_wavelength_in_air" ;
      flx_dwn:units = "watt meter-2 meter-1" ;
    float flx_sns(wvl_nvr) ;
      flx_sns:long_name = "Flux sensitivity of each band (irradiance per count)" ;
      flx_sns:provenance = "EnvironmentalLogger calibration information from file S05673_08062015.IrradCal provided by TinoDornbusch and discussed here: https://github.com/terraref/reference-data/issues/30#issuecomment-217518434" ;
      flx_sns:units = "joule count-1" ;
    double frametime(y) ;
      frametime:units = "days since 1970-01-01 00:00:00" ;
      frametime:calender = "gregorian" ;
    float rfl_img(wavelength,y,x) ;
      rfl_img:long_name = "Reflectance of image" ;
      rfl_img:meaning = "Counts on scale from 0 to 2^16-1 = 65535" ;
      rfl_img:standard_name = "surface_albedo" ;
      rfl_img:units = "1" ;
    float rfl_wht(wavelength) ;
      rfl_wht:long_name = "Reflectance of white reference" ;
      rfl_wht:units = "1" ;
    double wavelength(wavelength) ;
      wavelength:long_name = "Hyperspectral Wavelength" ;
      wavelength:units = "meter" ;
      wavelength:standard_name = "radiation_wavelength" ;
    float wvl_dlt(wvl_nvr) ;
      wvl_dlt:long_name = "Bandwidth of environmental sensor" ;
      wvl_dlt:notes = "Bandwidth, also called dispersion, is between 0.455-0.495 nm across all channels. Values computed as differences between midpoints of adjacent band-centers." ;
      wvl_dlt:standard_name = "bandwidth" ;
      wvl_dlt:units = "meter" ;
    float wvl_nvr(wvl_nvr) ;
      wvl_nvr:long_name = "Wavelength of environmental sensor" ;
      wvl_nvr:provenance = "EnvironmentalLogger calibration information from file S05673_08062015.IrradCal provided by TinoDornbusch and discussed here: https://github.com/terraref/reference-data/issues/30#issuecomment-217518434" ;
      wvl_nvr:standard_name = "sensor_band_central_radiation_wavelength" ;
      wvl_nvr:units = "meter" ;
    double x(x) ;
      x:algorithm = "CSZ implemented these fake data to be replaced by real formula once available." ;
      x:long_name = "North-south offset from start position" ;
      x:units = "meter" ;
    ushort xps_drk(wavelength,x) ;
      xps_drk:long_name = "Exposure from dark reference sheet/panel" ;
      xps_drk:units = "Counts on scale from 0 to 2^16-1 = 65535" ;
    ushort xps_img(wavelength,y,x) ;
      xps_img:long_name = "Exposure counts" ;
      xps_img:meaning = "Counts on scale from 0 to 2^16-1 = 65535" ;
      xps_img:units = "1" ;
    ushort xps_wht(wavelength,x) ;
      xps_wht:long_name = "Exposure from white reference sheet/panel" ;
      xps_wht:units = "Counts on scale from 0 to 2^16-1 = 65535" ;
    double y(y) ;
      y:algorithm = "Based on https://github.com/terraref/computing-pipeline/issues/144. y is defined as 0.9853 mm per pixel. Exact number is 0.98526434004512529576754637665 mm." ;
      y:long_name = "East-west offset from start position" ;
      y:units = "meter" ;
  group: gantry_system_fixed_metadata {
  } // group /gantry_system_fixed_metadata
  group: gantry_system_variable_metadata {
    variables:
      double u ;
        u:long_name = "Gantry_Speed_in_X_Direction" ;
        u:units = "meter second-1" ;
      double v ;
        v:long_name = "Gantry_Speed_in_Y_Direction" ;
        v:units = "meter second-1" ;
      double w ;
        w:long_name = "Gantry_Speed_in_Z_Direction" ;
        w:units = "meter second-1" ;
      double x ;
        x:long_name = "Position_in_X_Direction" ;
        x:units = "meter" ;
      double y ;
        y:long_name = "Position_in_Y_Direction" ;
        y:units = "meter" ;
      double z ;
        z:long_name = "Position_in_Z_Direction" ;
        z:units = "meter" ;
  } // group /gantry_system_variable_metadata
  group: header_info {
    variables:
      double blue_band_index ;
      double green_band_index ;
      double red_band_index ;
  } // group /header_info
  group: sensor_fixed_metadata {
  } // group /sensor_fixed_metadata
  group: sensor_variable_metadata {
    variables:
      double constmirrorpos ;
        constmirrorpos:long_name = "constmirrorpos" ;
      double createdatacube ;
        createdatacube:long_name = "createdatacube" ;
      double exposure ;
        exposure:long_name = "exposure" ;
        exposure:red_band_index = 140l ;
        exposure:green_band_index = 234l ;
        exposure:blue_band_index = 500l ;
      double frameperiod ;
        frameperiod:long_name = "frameperiod" ;
      double speed ;
        speed:long_name = "speed" ;
      double startpos ;
        startpos:long_name = "startpos" ;
      double stoppos ;
        stoppos:long_name = "stoppos" ;
      double useexternaltrigger ;
        useexternaltrigger:long_name = "useexternaltrigger" ;
      double userotatingmirror ;
        userotatingmirror:long_name = "userotatingmirror" ;
  } // group /sensor_variable_metadata
  group: user_given_metadata {
  } // group /user_given_metadata
} // group /
yanliu-chn commented 8 years ago

ncks -m -M output:

netcdf 0596c17f-2e4c-4d43-9d77-cde8ffbde663 {
  dimensions:
    wavelength = 955 ;
    wvl_nvr = 1024 ;
    x = 1600 ;
    y = 468 ;
  variables:
    float flx_dwn(wavelength) ;
      flx_dwn:long_name = "Downwelling spectral irradiance" ;
      flx_dwn:standard_name = "surface_downwelling_radiative_flux_per_unit_wavelength_in_air" ;
      flx_dwn:units = "watt meter-2 meter-1" ;
    float flx_sns(wvl_nvr) ;
      flx_sns:long_name = "Flux sensitivity of each band (irradiance per count)" ;
      flx_sns:provenance = "EnvironmentalLogger calibration information from file S05673_08062015.IrradCal provided by TinoDornbusch and discussed here: https://github.com/terraref/reference-data/issues/30#issuecomment-217518434" ;
      flx_sns:units = "joule count-1" ;
    double frametime(y) ;
      frametime:units = "days since 1970-01-01 00:00:00" ;
      frametime:calender = "gregorian" ;
    float rfl_img(wavelength,y,x) ;
      rfl_img:long_name = "Reflectance of image" ;
      rfl_img:meaning = "Counts on scale from 0 to 2^16-1 = 65535" ;
      rfl_img:standard_name = "surface_albedo" ;
      rfl_img:units = "1" ;
    float rfl_wht(wavelength) ;
      rfl_wht:long_name = "Reflectance of white reference" ;
      rfl_wht:units = "1" ;
    double wavelength(wavelength) ;
      wavelength:long_name = "Hyperspectral Wavelength" ;
      wavelength:units = "meter" ;
      wavelength:standard_name = "radiation_wavelength" ;
    float wvl_dlt(wvl_nvr) ;
      wvl_dlt:long_name = "Bandwidth of environmental sensor" ;
      wvl_dlt:notes = "Bandwidth, also called dispersion, is between 0.455-0.495 nm across all channels. Values computed as differences between midpoints of adjacent band-centers." ;
      wvl_dlt:standard_name = "bandwidth" ;
      wvl_dlt:units = "meter" ;
    float wvl_nvr(wvl_nvr) ;
      wvl_nvr:long_name = "Wavelength of environmental sensor" ;
      wvl_nvr:provenance = "EnvironmentalLogger calibration information from file S05673_08062015.IrradCal provided by TinoDornbusch and discussed here: https://github.com/terraref/reference-data/issues/30#issuecomment-217518434" ;
      wvl_nvr:standard_name = "sensor_band_central_radiation_wavelength" ;
      wvl_nvr:units = "meter" ;
    double x(x) ;
      x:algorithm = "CSZ implemented these fake data to be replaced by real formula once available." ;
      x:long_name = "North-south offset from start position" ;
      x:units = "meter" ;
    ushort xps_drk(wavelength,x) ;
      xps_drk:long_name = "Exposure from dark reference sheet/panel" ;
      xps_drk:units = "Counts on scale from 0 to 2^16-1 = 65535" ;
    ushort xps_img(wavelength,y,x) ;
      xps_img:long_name = "Exposure counts" ;
      xps_img:meaning = "Counts on scale from 0 to 2^16-1 = 65535" ;
      xps_img:units = "1" ;
    ushort xps_wht(wavelength,x) ;
      xps_wht:long_name = "Exposure from white reference sheet/panel" ;
      xps_wht:units = "Counts on scale from 0 to 2^16-1 = 65535" ;
    double y(y) ;
      y:algorithm = "Based on https://github.com/terraref/computing-pipeline/issues/144. y is defined as 0.9853 mm per pixel. Exact number is 0.98526434004512529576754637665 mm." ;
      y:long_name = "East-west offset from start position" ;
      y:units = "meter" ;
  // global attributes:
    :title = "None given (supply with --trr ttl=\"Title\")" ;
    :created_by = "yanliu" ;
    :Conventions = "CF-1.5" ;
    :Project = "TERRAREF" ;
    :terraref_script = "terraref.sh" ;
    :terraref_hostname = "cg-gpu01" ;
    :terraref_version = "4.6.0" ;
    :history = "Thu Sep  1 11:09:33 2016: ncap2 -A -S /gpfs/largeblockFS/projects/arpae/sw/computing-pipeline/scripts/hyperspectral/terraref.nco /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_att.nc.pid44592.fl00.tmp /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_att.nc.pid44592.fl00.tmp\n",
      "Thu Sep  1 11:09:31 2016: ncks -A /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_jsn.nc.pid44592.fl00.tmp /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_att.nc.pid44592.fl00.tmp\n",
      "Thu Sep 01 11:09:30 2016: python input/0596c17f-2e4c-4d43-9d77-cde8ffbde663_raw /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_jsn.nc.pid44592.fl00.tmp" ;
    :NCO = "\"4.6.0\"" ;
    :history_of_appended_files = "Thu Sep  1 11:09:33 2016: Appended file /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_att.nc.pid44592.fl00.tmp had following \"history\" attribute:\n",
      "Thu Sep  1 11:09:31 2016: ncks -A /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_jsn.nc.pid44592.fl00.tmp /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_att.nc.pid44592.fl00.tmp\n",
      "Thu Sep 01 11:09:30 2016: python input/0596c17f-2e4c-4d43-9d77-cde8ffbde663_raw /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_jsn.nc.pid44592.fl00.tmp\n",
      "Thu Sep  1 11:09:31 2016: Appended file /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_jsn.nc.pid44592.fl00.tmp had following \"history\" attribute:\n",
      "Thu Sep 01 11:09:30 2016: python input/0596c17f-2e4c-4d43-9d77-cde8ffbde663_raw /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_jsn.nc.pid44592.fl00.tmp\n" ;
    :nco_openmp_thread_number = 1 ;
  group: gantry_system_fixed_metadata {
    // group attributes:
      :gantry_fixed_data_2 = "Todo" ;
      :gantry_fixed_data_1 = "Todo" ;
      :System_manufacturer = "LemnaTec Corp." ;
  } // group /gantry_system_fixed_metadata
  group: gantry_system_variable_metadata {
    variables:
      double u ;
        u:long_name = "Gantry_Speed_in_X_Direction" ;
        u:units = "meter second-1" ;
      double v ;
        v:long_name = "Gantry_Speed_in_Y_Direction" ;
        v:units = "meter second-1" ;
      double w ;
        w:long_name = "Gantry_Speed_in_Z_Direction" ;
        w:units = "meter second-1" ;
      double x ;
        x:long_name = "Position_in_X_Direction" ;
        x:units = "meter" ;
      double y ;
        y:long_name = "Position_in_Y_Direction" ;
        y:units = "meter" ;
      double z ;
        z:long_name = "Position_in_Z_Direction" ;
        z:units = "meter" ;
    // group attributes:
      :Camnera_box_light_4_is_on = "True" ;
      :Position_in_\]_Direction = "0.97" ;
      :Camnera_box_light_2_is_on = "True" ;
      :Camnera_box_light_1_is_on = "True" ;
      :Gantry_Speed_in_\]_Direction = "0" ;
      :Time = "04/07/2016 16:15:45" ;
      :Camnera_box_light_3_is_on = "True" ;
  } // group /gantry_system_variable_metadata
  group: header_info {
    variables:
      double blue_band_index ;
      double green_band_index ;
      double red_band_index ;
    // group attributes:
      :HSIII_VERSION = "E51215 vs64" ;
      :POST_AOI_left = "0" ;
      :Col_binning = "1" ;
      :AOI_width = "1600" ;
      :Row_binning = "1" ;
      :FrameIndex = "frameIndex.txt" ;
      :AOI_height = "960" ;
      :header_offset = "0" ;
      :Lens_EFL = "17" ;
      :Serial_Number = "SN-G4-384" ;
      :samples = "1600" ;
      :byte_order = "0" ;
      :Lens_folder = "" ;
      :description = "{[HEADWALL Hyperspec III]}" ;
      :default_bands = "{140,234,500}" ;
      :bands = "955" ;
      :POST_Row_binning = "1" ;
      :POST_AOI_width = "1600" ;
      :file_type = "ENVI Standard" ;
      :Nuc_folder = "" ;
      :data_type = "12" ;
      :AverageDispersion = "0.63986398" ;
      :POST_Col_binning = "1" ;
      :Array_Pixel_Pitch = "6.5" ;
      :sensor_type = "Unknown" ;
      :POST_AOI_height = "955" ;
      :lines = "468" ;
      :interleave = "bil" ;
      :AOI_top = "600" ;
      :Pixel0 = "3.100546185" ;
      :AOI_left = "480" ;
      :POST_AOI_top = "5" ;
  } // group /header_info
  group: sensor_fixed_metadata {
    // group attributes:
      :sensor_serial_number = "Todo" ;
      :sensor_purpose = "Todo" ;
      :sensor_product_name = "VNIR" ;
      :sensor_description = "Todo" ;
      :sensor_manufacturer = "Headwall Scientific" ;
  } // group /sensor_fixed_metadata
  group: sensor_variable_metadata {
    variables:
      double constmirrorpos ;
        constmirrorpos:long_name = "constmirrorpos" ;
      double createdatacube ;
        createdatacube:long_name = "createdatacube" ;
      double exposure ;
        exposure:long_name = "exposure" ;
        exposure:red_band_index = 140l ;
        exposure:green_band_index = 234l ;
        exposure:blue_band_index = 500l ;
      double frameperiod ;
        frameperiod:long_name = "frameperiod" ;
      double speed ;
        speed:long_name = "speed" ;
      double startpos ;
        startpos:long_name = "startpos" ;
      double stoppos ;
        stoppos:long_name = "stoppos" ;
      double useexternaltrigger ;
        useexternaltrigger:long_name = "useexternaltrigger" ;
      double userotatingmirror ;
        userotatingmirror:long_name = "userotatingmirror" ;
    // group attributes:
      :exposure = "45" ;
      :startpos = "-70" ;
      :frameperiod = "50" ;
      :userotatingmirror = "0" ;
      :speed = "100" ;
      :useexternaltrigger = "0" ;
      :constmirrorpos = "0" ;
      :createdatacube = "0" ;
      :stoppos = "70" ;
  } // group /sensor_variable_metadata
  group: user_given_metadata {
    // group attributes:
      :first_wheat_test_by_Markus_Radermacher = "" ;
      :experiment_info_1 = "..." ;
      :and_so_on_and_so_on... = "..." ;
  } // group /user_given_metadata
} // group /
dlebauer commented 8 years ago

Some of this content is a direct dump from the metadata file provided with the raw data, and that does not need to be duplicated. I think the key new parts are dimensions, variables, and global attributes from ncks -m -M

Zodiase commented 8 years ago

@rachelshekar Could you explain the reason for changing the title while this issue is not specific to anything "hyperspectral"?

ghost commented 8 years ago

I didn't realize that it's a general extractor. I'll change it back.

yanliu-chn commented 8 years ago

Tried xml2json, it doesn't work since it converts what is supposed to be {"key1": "value1"} to {"@name": "key1", "@value": "value1"}
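
The name/value shape presumably comes from the XML carrying each attribute as an element with `name` and `value` XML attributes, which a generic converter copies key by key. A purpose-built walk over the tree can collapse those pairs instead. The sketch below uses only the Python standard library; `SAMPLE` is a hand-written NcML-style snippet, and real `ncks --xml` output may differ in details such as namespace or extra attributes. All function names are illustrative.

```python
import json
import xml.etree.ElementTree as ET

# Hand-written sample in the style of NcML (an assumption, not captured output).
SAMPLE = """
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <dimension name="x" length="1600" />
  <attribute name="Project" type="String" value="TERRAREF" />
  <variable name="x" shape="x" type="double">
    <attribute name="units" type="String" value="meter" />
  </variable>
  <group name="sensor_fixed_metadata">
    <attribute name="sensor_product_name" type="String" value="VNIR" />
  </group>
</netcdf>
""".strip()


def local(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def children(node, name):
    return [c for c in node if local(c.tag) == name]


def attrs(node):
    """Collapse <attribute name=K value=V/> children into {K: V}."""
    return {a.get("name"): a.get("value") for a in children(node, "attribute")}


def group_to_dict(node):
    """Convert a <netcdf> or <group> element, recursing into nested groups."""
    return {
        "dimensions": {d.get("name"): int(d.get("length")) for d in children(node, "dimension")},
        "attributes": attrs(node),
        "variables": {
            v.get("name"): {
                "type": v.get("type"),
                "shape": v.get("shape", "").split(),
                "attributes": attrs(v),
            }
            for v in children(node, "variable")
        },
        "groups": {g.get("name"): group_to_dict(g) for g in children(node, "group")},
    }


print(json.dumps(group_to_dict(ET.fromstring(SAMPLE)), indent=2))
```

This keeps the group nesting, but every attribute value stays a string and the `type` of each attribute is dropped, so it is a starting point rather than a faithful dump.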

yanliu-chn commented 8 years ago

Tried ncdump-json. It's not trivial to set up. The experience is:

Rename USE_DAP in the source:
sed -i -e "s/USE_DAP/DO_NOT_USE_DAP/" src/ncdump.c
CMake changes (find_package used for netCDF, pkg-config lookup commented out):
set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} "/home/yanliu/scratch/arpae/tmp/hyperspectral/output/ncdump-json-r13/cmake/")
find_package( NetCDF REQUIRED )
include_directories(${NETCDF_INCLUDES})
link_libraries(${NETCDF_LIBRARIES})
#find_package( PkgConfig REQUIRED )
#pkg_check_modules( Netcdf REQUIRED netcdf )
target_link_libraries(ncdump-json ${NETCDF_LIBRARIES})
Configure the build:
cmake -D NETCDF_INCLUDES=$NETCDF4_HOME/include -D NETCDF_LIBRARIES="-L$NETCDF4_HOME/lib -lnetcdf" ..
Run it on the file:
./ncdump-json-r13/build/ncdump-json -h -j 0596c17f-2e4c-4d43-9d77-cde8ffbde663.nc

{"dimensions":{"wavelength":955,"x":1600,"y":468,"wvl_nvr":1024},"variables":{"xps_img":{"type":"ushort","dimensions":["wavelength","y","x"],"attributes":{"long_name":"Exposure counts","meaning":"Counts on scale from 0 to 2^16-1 = 65535","units":"1"}},"frametime":{"type":"double","dimensions":["y"],"attributes":{"units":"days since 1970-01-01 00:00:00","calender":"gregorian"}},"wavelength":{"type":"double","dimensions":["wavelength"],"attributes":{"long_name":"Hyperspectral Wavelength","units":"meter","standard_name":"radiation_wavelength"}},"flx_dwn":{"type":"float","dimensions":["wavelength"],"attributes":{"long_name":"Downwelling spectral irradiance","standard_name":"surface_downwelling_radiative_flux_per_unit_wavelength_in_air","units":"watt meter-2 meter-1"}},"flx_sns":{"type":"float","dimensions":["wvl_nvr"],"attributes":{"long_name":"Flux sensitivity of each band (irradiance per count)","provenance":"EnvironmentalLogger calibration information from file S05673_08062015.IrradCal provided by TinoDornbusch and discussed here: https://github.com/terraref/reference-data/issues/30#issuecomment-217518434","units":"joule count-1"}},"rfl_img":{"type":"float","dimensions":["wavelength","y","x"],"attributes":{"long_name":"Reflectance of image","meaning":"Counts on scale from 0 to 2^16-1 = 65535","standard_name":"surface_albedo","units":"1"}},"rfl_wht":{"type":"float","dimensions":["wavelength"],"attributes":{"long_name":"Reflectance of white reference","units":"1"}},"wvl_dlt":{"type":"float","dimensions":["wvl_nvr"],"attributes":{"long_name":"Bandwidth of environmental sensor","notes":"Bandwidth, also called dispersion, is between 0.455-0.495 nm across all channels. 
Values computed as differences between midpoints of adjacent band-centers.","standard_name":"bandwidth","units":"meter"}},"wvl_nvr":{"type":"float","dimensions":["wvl_nvr"],"attributes":{"long_name":"Wavelength of environmental sensor","provenance":"EnvironmentalLogger calibration information from file S05673_08062015.IrradCal provided by TinoDornbusch and discussed here: https://github.com/terraref/reference-data/issues/30#issuecomment-217518434","standard_name":"sensor_band_central_radiation_wavelength","units":"meter"}},"x":{"type":"double","dimensions":["x"],"attributes":{"algorithm":"CSZ implemented these fake data to be replaced by real formula once available.","long_name":"North-south offset from start position","units":"meter"}},"xps_drk":{"type":"ushort","dimensions":["wavelength","x"],"attributes":{"long_name":"Exposure from dark reference sheet/panel","units":"Counts on scale from 0 to 2^16-1 = 65535"}},"xps_wht":{"type":"ushort","dimensions":["wavelength","x"],"attributes":{"long_name":"Exposure from white reference sheet/panel","units":"Counts on scale from 0 to 2^16-1 = 65535"}},"y":{"type":"double","dimensions":["y"],"attributes":{"algorithm":"Based on https://github.com/terraref/computing-pipeline/issues/144. y is defined as 0.9853 mm per pixel. 
Exact number is 0.98526434004512529576754637665 mm.","long_name":"East-west offset from start position","units":"meter"}}},"global_attributes":{"title":"None given (supply with --trr ttl=\"Title\")","created_by":"yanliu","Conventions":"CF-1.5","Project":"TERRAREF","terraref_script":"terraref.sh","terraref_hostname":"cg-gpu01","terraref_version":"4.6.0","history":"Thu Sep  1 11:09:33 2016: ncap2 -A -S /gpfs/largeblockFS/projects/arpae/sw/computing-pipeline/scripts/hyperspectral/terraref.nco /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_att.nc.pid44592.fl00.tmp /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_att.nc.pid44592.fl00.tmpThu Sep  1 11:09:31 2016: ncks -A /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_jsn.nc.pid44592.fl00.tmp /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_att.nc.pid44592.fl00.tmpThu Sep 01 11:09:30 2016: python input/0596c17f-2e4c-4d43-9d77-cde8ffbde663_raw /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_jsn.nc.pid44592.fl00.tmp","NCO":"\"4.6.0\"","history_of_appended_files":"Thu Sep  1 11:09:33 2016: Appended file /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_att.nc.pid44592.fl00.tmp had following \"history\" attribute:Thu Sep  1 11:09:31 2016: ncks -A /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_jsn.nc.pid44592.fl00.tmp /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_att.nc.pid44592.fl00.tmpThu Sep 01 11:09:30 2016: python input/0596c17f-2e4c-4d43-9d77-cde8ffbde663_raw /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_jsn.nc.pid44592.fl00.tmpThu Sep  1 11:09:31 2016: Appended file /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_jsn.nc.pid44592.fl00.tmp had following \"history\" attribute:Thu Sep 01 11:09:30 2016: python input/0596c17f-2e4c-4d43-9d77-cde8ffbde663_raw /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_jsn.nc.pid44592.fl00.tmp","nco_openmp_thread_number":1}
group: gantry_system_fixed_metadata {
,"group_attributes":{"gantry_fixed_data_2":"Todo","gantry_fixed_data_1":"Todo","System_manufacturer":"LemnaTec Corp."}} // group gantry_system_fixed_metadata

group: gantry_system_variable_metadata {
"variables":{"u":{"type":"double","dimensions":[],"attributes":{"long_name":"Gantry_Speed_in_X_Direction","units":"meter second-1"}},"v":{"type":"double","dimensions":[],"attributes":{"long_name":"Gantry_Speed_in_Y_Direction","units":"meter second-1"}},"w":{"type":"double","dimensions":[],"attributes":{"long_name":"Gantry_Speed_in_Z_Direction","units":"meter second-1"}},"x":{"type":"double","dimensions":[],"attributes":{"long_name":"Position_in_X_Direction","units":"meter"}},"y":{"type":"double","dimensions":[],"attributes":{"long_name":"Position_in_Y_Direction","units":"meter"}},"z":{"type":"double","dimensions":[],"attributes":{"long_name":"Position_in_Z_Direction","units":"meter"}}},"group_attributes":{"Camnera_box_light_4_is_on":"True","Position_in_\]_Direction":"0.97","Camnera_box_light_2_is_on":"True","Camnera_box_light_1_is_on":"True","Gantry_Speed_in_\]_Direction":"0","Time":"04/07/2016 16:15:45","Camnera_box_light_3_is_on":"True"}} // group gantry_system_variable_metadata

group: header_info {
"variables":{"blue_band_index":{"type":"double","dimensions":[],"attributes":{}},"green_band_index":{"type":"double","dimensions":[],"attributes":{}},"red_band_index":{"type":"double","dimensions":[],"attributes":{}}},"group_attributes":{"HSIII_VERSION":"E51215 vs64","POST_AOI_left":"0","Col_binning":"1","AOI_width":"1600","Row_binning":"1","FrameIndex":"frameIndex.txt","AOI_height":"960","header_offset":"0","Lens_EFL":"17","Serial_Number":"SN-G4-384","samples":"1600","byte_order":"0","Lens_folder":"","description":"{[HEADWALL Hyperspec III]}","default_bands":"{140,234,500}","bands":"955","POST_Row_binning":"1","POST_AOI_width":"1600","file_type":"ENVI Standard","Nuc_folder":"","data_type":"12","AverageDispersion":"0.63986398","POST_Col_binning":"1","Array_Pixel_Pitch":"6.5","sensor_type":"Unknown","POST_AOI_height":"955","lines":"468","interleave":"bil","AOI_top":"600","Pixel0":"3.100546185","AOI_left":"480","POST_AOI_top":"5"}} // group header_info

group: sensor_fixed_metadata {
,"group_attributes":{"sensor_serial_number":"Todo","sensor_purpose":"Todo","sensor_product_name":"VNIR","sensor_description":"Todo","sensor_manufacturer":"Headwall Scientific"}} // group sensor_fixed_metadata

group: sensor_variable_metadata {
"variables":{"constmirrorpos":{"type":"double","dimensions":[],"attributes":{"long_name":"constmirrorpos"}},"createdatacube":{"type":"double","dimensions":[],"attributes":{"long_name":"createdatacube"}},"exposure":{"type":"double","dimensions":[],"attributes":{"long_name":"exposure","red_band_index":140L,"green_band_index":234L,"blue_band_index":500L}},"frameperiod":{"type":"double","dimensions":[],"attributes":{"long_name":"frameperiod"}},"speed":{"type":"double","dimensions":[],"attributes":{"long_name":"speed"}},"startpos":{"type":"double","dimensions":[],"attributes":{"long_name":"startpos"}},"stoppos":{"type":"double","dimensions":[],"attributes":{"long_name":"stoppos"}},"useexternaltrigger":{"type":"double","dimensions":[],"attributes":{"long_name":"useexternaltrigger"}},"userotatingmirror":{"type":"double","dimensions":[],"attributes":{"long_name":"userotatingmirror"}}},"group_attributes":{"exposure":"45","startpos":"-70","frameperiod":"50","userotatingmirror":"0","speed":"100","useexternaltrigger":"0","constmirrorpos":"0","createdatacube":"0","stoppos":"70"}} // group sensor_variable_metadata

group: user_given_metadata {
,"group_attributes":{"first_wheat_test_by_Markus_Radermacher":"","experiment_info_1":"...","and_so_on_and_so_on...":"..."}} // group user_given_metadata
}
yanliu-chn commented 8 years ago

@robkooper @dlebauer @czender please review my comments above and suggest a better solution. And, before we get a good way to convert, we should output CDL.

yanliu-chn commented 8 years ago

@czender would it be a good feature to have in ncks --json?

czender commented 8 years ago

@yanliu-chn I don't know of an alternative to get JSON from netCDF. I do know that ncks emits correct CDL (with --cdl) and XML (with --xml). I tried to add a JSON backend to ncks, then gave up about 3 years ago. It's certainly doable, though it cannot leverage much of the CDL/XML backend code structure because JSON is quite different. If there is truly no alternative to writing our own, then we (@hmb1 and I) can look into that if it's important to Terraref. Hoping someone knows of an existing alternative to dump JSON from netCDF.

robkooper commented 8 years ago

since we will have python, maybe: http://stackoverflow.com/a/10201405

import xmltodict, json

o = xmltodict.parse('<e> <a>text</a> <a>text</a> </e>')
json.dumps(o) # '{"e": {"a": ["text", "text"]}}'
yanliu-chn commented 8 years ago

I don't think xmltodict does things differently than xml2json. The issue here is customizing the parser so that nodes, attributes, values, and namespaces are each mapped sensibly.

gsrohde commented 8 years ago

For what it's worth, in BETYdb I'm using a JSON—XML mapping where the JSON version keeps most of the information of the XML so that theoretically, you could do a XML—>JSON—>XML round trip. I distinguish elements and attributes, but I don't think I take namespaces into consideration. Examples of corresponding documents are https://github.com/PecanProject/bety/blob/master/app/lib/api/test/TEST_JSON_DATA and https://github.com/PecanProject/bety/blob/master/app/lib/api/test/TEST_XML_DATA. (These may have gotten slightly out of sync, but they correspond closely enough that the mapping should be clear.) Note that the value for the "children" key is an array (because in XML element order is significant) but the value for the "attributes" key is a hash (because attribute order is not significant). Unfortunately, I've only written a json_2_xml function, not an xml_2_json function. And it's in Ruby.
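
A minimal sketch of the missing XML-to-JSON direction, in Python rather than Ruby and using only the standard library. It follows the convention described above ("children" as an array, "attributes" as a hash). The function names `element_to_dict` and `xml_to_json` and the "name" and "text" keys are illustrative assumptions, not BETYdb's actual schema, and namespaces are not handled.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Map one XML element to a dict that keeps elements and attributes apart."""
    node = {
        "name": elem.tag,
        # Attribute order is not significant in XML, so a hash is enough.
        "attributes": dict(elem.attrib),
        # Element order is significant, so children stay in an array.
        "children": [element_to_dict(child) for child in elem],
    }
    text = (elem.text or "").strip()
    if text:
        node["text"] = text
    return node


def xml_to_json(xml_string):
    """Parse an XML string and return the JSON text for its root element."""
    return json.dumps(element_to_dict(ET.fromstring(xml_string)), sort_keys=True)


print(xml_to_json('<e id="1"><a>text</a><a>text</a></e>'))
```

Because every element becomes the same three-key structure, repeated child names such as the two `<a>` elements are kept as separate array entries instead of being merged.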

yanliu-chn commented 8 years ago

@gsrohde @dlebauer so BETYdb is the receiver of this metadata?

Could you sketch the pipeline for me? Should I use a BETYdb API in the extractor to send the metadata directly to BETYdb, or should I attach it as Clowder metadata on the netcdf file, so that you can then extract it from the Clowder dataset? If the latter, what protocol do you use to decide which part of the metadata to extract, or is that something I don't need to care about?

Another question is: can you take XML directly since you need to store XML anyway?

gsrohde commented 8 years ago

No, I wasn't saying BETYdb would be part of this pipeline. I was only giving an example XML<—>JSON mapping (which just happens to be part of BETYdb).

For more perspective on various mappings of XML to JSON, you might look at https://github.com/bramstein/xsltjson, even if you aren't interested in doing the conversion with XSLT (though I've found xsltproc, part of the libxml2 library, to be very fast; but it only handles XSLT 1.0 I think).

yanliu-chn commented 8 years ago

I see. Thanks for the information!

Most of the xml2json libraries do the conversion based on http://www.xml.com/pub/a/2006/05/31/converting-between-xml-and-json.html , which is not sufficient to convert the ncks output. At least the xml2json python one didn't do its job. Some mapping to consolidate attribute/element names in XML to correct json key-value is needed.
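
As a rough illustration of such a mapping, the sketch below folds the name/value attribute pairs used by NcML-style XML into plain JSON key-value pairs. The sample document is hand-written in the general shape of NcML (dimension, variable, attribute, and group elements) and is an assumption, not actual ncks output. `local_name` and `ncml_to_dict` are hypothetical helpers.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the general shape of NcML; not real ncks output.
SAMPLE = """
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <dimension name="x" length="1600"/>
  <variable name="x" shape="x" type="double">
    <attribute name="units" value="meter"/>
  </variable>
  <attribute name="Conventions" value="CF-1.5"/>
  <group name="sensor_fixed_metadata">
    <attribute name="sensor_product_name" value="VNIR"/>
  </group>
</netcdf>
""".strip()


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def ncml_to_dict(elem):
    """Turn a <netcdf> or <group> element into nested key-value dicts."""
    out = {"dimensions": {}, "variables": {}, "attrs": {}, "groups": {}}
    for child in elem:
        kind, name = local_name(child.tag), child.get("name")
        if kind == "dimension":
            out["dimensions"][name] = int(child.get("length"))
        elif kind == "attribute":
            # <attribute name="k" value="v"/> becomes "k": "v".
            out["attrs"][name] = child.get("value")
        elif kind == "variable":
            out["variables"][name] = {
                "type": child.get("type"),
                "dims": child.get("shape", "").split(),
                "attrs": {a.get("name"): a.get("value") for a in child
                          if local_name(a.tag) == "attribute"},
            }
        elif kind == "group":
            out["groups"][name] = ncml_to_dict(child)
    return out


metadata = ncml_to_dict(ET.fromstring(SAMPLE))
```

A generic converter would emit each `<attribute>` as an object with separate "name" and "value" fields; the mapping here uses the element's own name as the JSON key instead.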

czender commented 8 years ago

if continued investigation shows no robust method to convert netCDF to JSON, and JSON is important to Terraref, then @hmb1 and I can discuss making this a medium-term goal.

yanliu-chn commented 8 years ago

I have finished the metadata extraction part. It now outputs .xml and .cdl instead of JSON, until we find a good solution for generating JSON.

Pull request: https://github.com/terraref/computing-pipeline/pull/182

@max-zilla please review the request. This request is based on Xingchen's previous pull request on the hyperspectral_extractor branch.

dlebauer commented 8 years ago

@czender and @yanliu-chn here is an hdf5-json example that doesn't seem to have the problem of xml2json mentioned in https://github.com/terraref/computing-pipeline/issues/145#issuecomment-253075024

yanliu-chn commented 8 years ago

@dlebauer @czender I tried hdf5-json on both the output of TerraRef.sh and the example netcdf4 files at http://www.unidata.ucar.edu/software/netcdf/examples/files.html .

Here is what I got:

Any suggestions?

Output from the examples is appended here:

extractor output nc:

(hdf5-json)[yanliu@cg-gpu01 output]$ python ./hdf5-json/bin/h5tojson.py 0596c17f-2e4c-4d43-9d77-cde8ffbde663.nc
Traceback (most recent call last):
  File "./hdf5-json/bin/h5tojson.py", line 247, in <module>
    main()
  File "./hdf5-json/bin/h5tojson.py", line 244, in main
    dumper.dumpFile()
  File "./hdf5-json/bin/h5tojson.py", line 198, in dumpFile
    self.dumpGroups()
  File "./hdf5-json/bin/h5tojson.py", line 101, in dumpGroups
    item = self.dumpGroup(self.root_uuid)
  File "./hdf5-json/bin/h5tojson.py", line 90, in dumpGroup
    attributes = self.dumpAttributes('groups', uuid)
  File "./hdf5-json/bin/h5tojson.py", line 59, in dumpAttributes
    item = self.dumpAttribute(col_name, uuid, attr['name'])
  File "./hdf5-json/bin/h5tojson.py", line 41, in dumpAttribute
    item = self.db.getAttributeItem(col_name, uuid, attr_name)
  File "/gpfs/largeblockFS/scratch/arpae/tmp/hyperspectral/output/hdf5-json/lib/python2.7/site-packages/h5json/hdf5db.py", line 1236, in getAttributeItem
    item = self.getAttributeItemByObj(obj, name)
  File "/gpfs/largeblockFS/scratch/arpae/tmp/hyperspectral/output/hdf5-json/lib/python2.7/site-packages/h5json/hdf5db.py", line 1170, in getAttributeItemByObj
    attr = obj.attrs[name]  # returns a numpy array
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip-3TLvcq-build/h5py/_objects.c:2684)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip-3TLvcq-build/h5py/_objects.c:2642)
  File "/gpfs/largeblockFS/scratch/arpae/tmp/hyperspectral/output/hdf5-json/lib/python2.7/site-packages/h5py/_hl/attrs.py", line 79, in __getitem__
    attr.read(arr, mtype=htype)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip-3TLvcq-build/h5py/_objects.c:2684)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip-3TLvcq-build/h5py/_objects.c:2642)
  File "h5py/h5a.pyx", line 355, in h5py.h5a.AttrID.read (/tmp/pip-3TLvcq-build/h5py/h5a.c:5152)
  File "h5py/_proxy.pyx", line 36, in h5py._proxy.attr_rw (/tmp/pip-3TLvcq-build/h5py/_proxy.c:1003)
IOError: Unable to read attribute (No appropriate function for conversion path)

test_hgroups.nc on netcdf website:

(hdf5-json)[yanliu@cg-gpu01 output]$ python ./hdf5-json/bin/h5tojson.py test_hgroups.nc
Segmentation fault (core dumped)

OMI-Aura_L2-example.nc on netcdf website: created json output. Attached here.

OMI-Aura_L2-example.json.txt

czender commented 8 years ago

@yanliu-chn @hmb1 @dlebauer Not surprised anymore that nc->json converters break. I tried a few, and Henry tried a promising solution mentioned on netCDF group, where the consensus was also that no robust tool exists. Those that are robust will only work on flat files or single groups. None of them will do the job. Terraref netCDF files are not too complex, yet not completely simple either. Your experience is one more piece of evidence that there is not yet any good solution. Henry is looking into what it would take to implement a netCDF metadata -> JSON backend in NCO.

dlebauer commented 8 years ago

@tedhabermann / @jreadey any thoughts on issues in https://github.com/terraref/computing-pipeline/issues/145#issuecomment-254581578 re: hdf5-json or json dumping utilities in general? Would it help to post an issue there?

jreadey commented 8 years ago

@dlebauer, I could take a look at some of the files on the netcdf site. If you could post issues to https://github.com/HDFGroup/hdf5-json, that would be helpful.

There are a few issues already cataloged that might be relevant:

  1. Support for opaque types: https://github.com/HDFGroup/hdf5-json/issues/55
  2. Failures related to VLEN strings: https://github.com/HDFGroup/hdf5-json/issues/19
  3. Support for large files (may not be relevant if just metadata is needed): https://github.com/HDFGroup/hdf5-json/issues/31
  4. Dimension scale issues: https://github.com/HDFGroup/hdf5-json/issues/37
max-zilla commented 8 years ago

@yanliu-chn just merged your pull request.

If I understand correctly, we basically want this portion of the hyperspectral code:

print('extracting metadata in cdl format')
metaFilePath = outFilePath + '.cdl'
with open(metaFilePath, 'w') as fmeta:
    # -m and -M restrict ncks to variable and global metadata (no data)
    subprocess.call(['ncks', '--cdl', '-m', '-M', outFilePath], stdout=fmeta)
if os.path.exists(metaFilePath):
    extractors.upload_file_to_dataset(filepath=metaFilePath, parameters=parameters)

print('extracting metadata in xml format')
metaFilePath = outFilePath + '.xml'
with open(metaFilePath, 'w') as fmeta:
    subprocess.call(['ncks', '--xml', '-m', '-M', outFilePath], stdout=fmeta)
if os.path.exists(metaFilePath):
    extractors.upload_file_to_dataset(filepath=metaFilePath, parameters=parameters)

...to go into a separate extractor, since the hyperspectral extractor is not the only one generating .nc files. I think we want to write an extractor that triggers on any .nc file in Clowder and calls

ncks --xml ...
ncks --cdl ...

Have we solved the XML -> JSON or CDL -> JSON piece, or not yet?

yanliu-chn commented 8 years ago

@max-zilla If this is the case, I think we should wrap this as a function for other extractors to call.

ToJSON function is being considered by Charlie and his team. We don't have a good solution so far.
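
A minimal sketch of what such a shared function could look like, assuming ncks is on the PATH. The names `metadata_commands` and `write_metadata` and the `runner` parameter are hypothetical and exist only so the function can be exercised without NCO installed. The Clowder upload step is left out.

```python
import os
import subprocess

# Output extension mapped to the ncks flag that produces that format.
FORMATS = {"cdl": "--cdl", "xml": "--xml"}


def metadata_commands(nc_path):
    """Return (output_path, argv) pairs, one per metadata format."""
    return [(nc_path + "." + ext, ["ncks", flag, "-m", "-M", nc_path])
            for ext, flag in sorted(FORMATS.items())]


def write_metadata(nc_path, runner=subprocess.call):
    """Run ncks once per format and return the metadata files it produced.

    `runner` must accept (argv, stdout=file) and return an exit status,
    which is the calling convention of subprocess.call.
    """
    written = []
    for out_path, argv in metadata_commands(nc_path):
        with open(out_path, "w") as fmeta:
            status = runner(argv, stdout=fmeta)
        # Only report files from runs that exited cleanly.
        if status == 0 and os.path.exists(out_path):
            written.append(out_path)
    return written
```

Any extractor that produces a .nc file could then call `write_metadata(outFilePath)` and upload each returned path.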

max-zilla commented 8 years ago

@yanliu-chn I don't think we need the extractors to call this function - the function just creates XML and CDL metadata files from any netCDF file. we set this up to trigger on netCDF files being added to clowder, and that way it can process both .nc files from extractors and .nc files that people upload manually.

max-zilla commented 8 years ago

@yanliu-chn https://github.com/terraref/extractors-metadata/tree/master/netcdf FYI.

yanliu-chn commented 8 years ago

@max-zilla cool! you already got it done.

czender commented 8 years ago

@yanliu-chn @max-zilla The netCDF->JSON converter is almost working. It is invoked with ncks --jsn, analogous to the NCO CDL and XML converters. @hmb1 will post when it is in-tree, and then NCO will need to be re-built from latest source to give the extractor access to this new functionality.

yanliu-chn commented 8 years ago

great! let me know when it is available and i will install it on ROGER.

max-zilla commented 8 years ago

@jdmaloney I have this extractor built as it currently is, and a VM prepared on Roger. Can you create these exports:

IP - 141.142.170.192

READS

/sites/ua-mac/

I think this needs read access to both raw_data and Level_1, because it will extract metadata from both raw netCDF files (if there even are any?) as well as extractor outputs like the environmentlogger extractor.

WRITES

/sites/ua-mac/Level_1/netcdf

This will create subdirectories under Level_1/netcdf with sensor names like stereoTop and EnvironmentLogger.

max-zilla commented 8 years ago

@yanliu-chn the new NCO is ready to install on roger.

CZ: JSON was added to new release NCO 4.6.2-beta01 10 minutes ago:
zender@firn:~$ ncks --json -v one ~/nco/data/in.nc
{
    "one": {
      "type": "float",
      "long_name": "one",
      "data": 1.0
    }
}
yanliu-chn commented 8 years ago

I have deployed nco-4.6.2-beta01 on ROGER. The test works!!! Thank you, @czender !

Here is how to use it. Please see if the json output looks good.

module purge
module load gdal-stack-2.7.10 nco # i changed default nco version to 4.6.2-beta01
ncks --jsn -m -M 0596c17f-2e4c-4d43-9d77-cde8ffbde663.nc
{
  "dimensions": {
    "wavelength": 955,
    "wvl_nvr": 1024,
    "x": 1600,
    "y": 468
    },
    "flx_dwn": {
      "dims": ["wavelength"],
      "type": "float",
      "long_name": "Downwelling spectral irradiance",
      "standard_name": "surface_downwelling_radiative_flux_per_unit_wavelength_in_air",
      "units": "watt meter-2 meter-1"
    },
    "flx_sns": {
      "dims": ["wvl_nvr"],
      "type": "float",
      "long_name": "Flux sensitivity of each band (irradiance per count)",
      "provenance": "EnvironmentalLogger calibration information from file S05673_08062015.IrradCal provided by TinoDornbusch and discussed here: https://github.com/terraref/reference-data/issues/30#issuecomment-217518434",
      "units": "joule count-1"
    },
    "frametime": {
      "dims": ["y"],
      "type": "double",
      "units": "days since 1970-01-01 00:00:00",
      "calender": "gregorian"
    },
    "rfl_img": {
      "dims": ["wavelength","y","x"],
      "type": "float",
      "long_name": "Reflectance of image",
      "meaning": "Counts on scale from 0 to 2^16-1 = 65535",
      "standard_name": "surface_albedo",
      "units": "1"
    },
    "rfl_wht": {
      "dims": ["wavelength"],
      "type": "float",
      "long_name": "Reflectance of white reference",
      "units": "1"
    },
    "wavelength": {
      "dims": ["wavelength"],
      "type": "double",
      "long_name": "Hyperspectral Wavelength",
      "units": "meter",
      "standard_name": "radiation_wavelength"
    },
    "wvl_dlt": {
      "dims": ["wvl_nvr"],
      "type": "float",
      "long_name": "Bandwidth of environmental sensor",
      "notes": "Bandwidth, also called dispersion, is between 0.455-0.495 nm across all channels. Values computed as differences between midpoints of adjacent band-centers.",
      "standard_name": "bandwidth",
      "units": "meter"
    },
    "wvl_nvr": {
      "dims": ["wvl_nvr"],
      "type": "float",
      "long_name": "Wavelength of environmental sensor",
      "provenance": "EnvironmentalLogger calibration information from file S05673_08062015.IrradCal provided by TinoDornbusch and discussed here: https://github.com/terraref/reference-data/issues/30#issuecomment-217518434",
      "standard_name": "sensor_band_central_radiation_wavelength",
      "units": "meter"
    },
    "x": {
      "dims": ["x"],
      "type": "double",
      "algorithm": "CSZ implemented these fake data to be replaced by real formula once available.",
      "long_name": "North-south offset from start position",
      "units": "meter"
    },
    "xps_drk": {
      "dims": ["wavelength","x"],
      "type": "short",
      "long_name": "Exposure from dark reference sheet/panel",
      "units": "Counts on scale from 0 to 2^16-1 = 65535"
    },
    "xps_img": {
      "dims": ["wavelength","y","x"],
      "type": "short",
      "long_name": "Exposure counts",
      "meaning": "Counts on scale from 0 to 2^16-1 = 65535",
      "units": "1"
    },
    "xps_wht": {
      "dims": ["wavelength","x"],
      "type": "short",
      "long_name": "Exposure from white reference sheet/panel",
      "units": "Counts on scale from 0 to 2^16-1 = 65535"
    },
    "y": {
      "dims": ["y"],
      "type": "double",
      "algorithm": "Based on https://github.com/terraref/computing-pipeline/issues/144. y is defined as 0.9853 mm per pixel. Exact number is 0.98526434004512529576754637665 mm.",
      "long_name": "East-west offset from start position",
      "units": "meter"
    },
    "attrs": {
      "title": "None given (supply with --trr ttl=\"Title\")",
      "created_by": "yanliu",
      "Conventions": "CF-1.5",
      "Project": "TERRAREF",
      "terraref_script": "terraref.sh",
      "terraref_hostname": "cg-gpu01",
      "terraref_version": "4.6.0",
      "history": "Thu Sep  1 11:09:33 2016: ncap2 -A -S /gpfs/largeblockFS/projects/arpae/sw/computing-pipeline/scripts/hyperspectral/terraref.nco /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_att.nc.pid44592.fl00.tmp /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_att.nc.pid44592.fl00.tmp\nThu Sep  1 11:09:31 2016: ncks -A /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_jsn.nc.pid44592.fl00.tmp /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_att.nc.pid44592.fl00.tmp\nThu Sep 01 11:09:30 2016: python input/0596c17f-2e4c-4d43-9d77-cde8ffbde663_raw /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_jsn.nc.pid44592.fl00.tmp",
      "NCO": "\"4.6.0\"",
      "history_of_appended_files": "Thu Sep  1 11:09:33 2016: Appended file /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_att.nc.pid44592.fl00.tmp had following \"history\" attribute:\nThu Sep  1 11:09:31 2016: ncks -A /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_jsn.nc.pid44592.fl00.tmp /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_att.nc.pid44592.fl00.tmp\nThu Sep 01 11:09:30 2016: python input/0596c17f-2e4c-4d43-9d77-cde8ffbde663_raw /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_jsn.nc.pid44592.fl00.tmp\nThu Sep  1 11:09:31 2016: Appended file /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_jsn.nc.pid44592.fl00.tmp had following \"history\" attribute:\nThu Sep 01 11:09:30 2016: python input/0596c17f-2e4c-4d43-9d77-cde8ffbde663_raw /gpfs_scratch/arpae/imaging_spectrometer/terraref_tmp_jsn.nc.pid44592.fl00.tmp\n",
      "nco_openmp_thread_number": 1
    },
    "groups": {
    "gantry_system_fixed_metadata": {
      "attrs": {
        "gantry_fixed_data_2": "Todo",
        "gantry_fixed_data_1": "Todo",
        "System_manufacturer": "LemnaTec Corp."
      }
      },
    "gantry_system_variable_metadata": {
      "u": {
        "type": "double",
        "long_name": "Gantry_Speed_in_X_Direction",
        "units": "meter second-1"
      },
      "v": {
        "type": "double",
        "long_name": "Gantry_Speed_in_Y_Direction",
        "units": "meter second-1"
      },
      "w": {
        "type": "double",
        "long_name": "Gantry_Speed_in_Z_Direction",
        "units": "meter second-1"
      },
      "x": {
        "type": "double",
        "long_name": "Position_in_X_Direction",
        "units": "meter"
      },
      "y": {
        "type": "double",
        "long_name": "Position_in_Y_Direction",
        "units": "meter"
      },
      "z": {
        "type": "double",
        "long_name": "Position_in_Z_Direction",
        "units": "meter"
      },
      "attrs": {
        "Camnera_box_light_4_is_on": "True",
        "Position_in_]_Direction": "0.97",
        "Camnera_box_light_2_is_on": "True",
        "Camnera_box_light_1_is_on": "True",
        "Gantry_Speed_in_]_Direction": "0",
        "Time": "04/07/2016 16:15:45",
        "Camnera_box_light_3_is_on": "True"
      }
      },
    "header_info": {
      "blue_band_index": {
        "type": "double"
      },
      "green_band_index": {
        "type": "double"
      },
      "red_band_index": {
        "type": "double"
      },
      "attrs": {
        "HSIII_VERSION": "E51215 vs64",
        "POST_AOI_left": "0",
        "Col_binning": "1",
        "AOI_width": "1600",
        "Row_binning": "1",
        "FrameIndex": "frameIndex.txt",
        "AOI_height": "960",
        "header_offset": "0",
        "Lens_EFL": "17",
        "Serial_Number": "SN-G4-384",
        "samples": "1600",
        "byte_order": "0",
        "Lens_folder": "",
        "description": "{[HEADWALL Hyperspec III]}",
        "default_bands": "{140,234,500}",
        "bands": "955",
        "POST_Row_binning": "1",
        "POST_AOI_width": "1600",
        "file_type": "ENVI Standard",
        "Nuc_folder": "",
        "data_type": "12",
        "AverageDispersion": "0.63986398",
        "POST_Col_binning": "1",
        "Array_Pixel_Pitch": "6.5",
        "sensor_type": "Unknown",
        "POST_AOI_height": "955",
        "lines": "468",
        "interleave": "bil",
        "AOI_top": "600",
        "Pixel0": "3.100546185",
        "AOI_left": "480",
        "POST_AOI_top": "5"
      }
      },
    "sensor_fixed_metadata": {
      "attrs": {
        "sensor_serial_number": "Todo",
        "sensor_purpose": "Todo",
        "sensor_product_name": "VNIR",
        "sensor_description": "Todo",
        "sensor_manufacturer": "Headwall Scientific"
      }
      },
    "sensor_variable_metadata": {
      "constmirrorpos": {
        "type": "double",
        "long_name": "constmirrorpos"
      },
      "createdatacube": {
        "type": "double",
        "long_name": "createdatacube"
      },
      "exposure": {
        "type": "double",
        "long_name": "exposure",
        "red_band_index": 140,
        "green_band_index": 234,
        "blue_band_index": 500
      },
      "frameperiod": {
        "type": "double",
        "long_name": "frameperiod"
      },
      "speed": {
        "type": "double",
        "long_name": "speed"
      },
      "startpos": {
        "type": "double",
        "long_name": "startpos"
      },
      "stoppos": {
        "type": "double",
        "long_name": "stoppos"
      },
      "useexternaltrigger": {
        "type": "double",
        "long_name": "useexternaltrigger"
      },
      "userotatingmirror": {
        "type": "double",
        "long_name": "userotatingmirror"
      },
      "attrs": {
        "exposure": "45",
        "startpos": "-70",
        "frameperiod": "50",
        "userotatingmirror": "0",
        "speed": "100",
        "useexternaltrigger": "0",
        "constmirrorpos": "0",
        "createdatacube": "0",
        "stoppos": "70"
      }
      },
    "user_given_metadata": {
      "attrs": {
        "first_wheat_test_by_Markus_Radermacher": "",
        "experiment_info_1": "...",
        "and_so_on_and_so_on...": "..."
      }
      }
    }

}
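
To check how this output could be turned into flat metadata keys, here is a minimal sketch that walks the "attrs" and "groups" keys seen above. The embedded sample is a small excerpt of that output, `flatten_attrs` is a hypothetical helper, and the "/" separator is an arbitrary choice. Variable-level attributes and the Clowder upload itself are not covered.

```python
import json

# Small excerpt with the same "attrs"/"groups" layout as the output above.
SAMPLE = json.loads("""
{
  "attrs": {"Conventions": "CF-1.5", "Project": "TERRAREF"},
  "groups": {
    "sensor_fixed_metadata": {
      "attrs": {"sensor_product_name": "VNIR",
                "sensor_manufacturer": "Headwall Scientific"}
    },
    "user_given_metadata": {"attrs": {"experiment_info_1": "..."}}
  }
}
""")


def flatten_attrs(node, prefix=""):
    """Collect global and group attributes into one flat dict.

    Group attributes get a 'group_name/' prefix so that equal attribute
    names in different groups do not overwrite each other.
    """
    flat = {}
    for key, value in node.get("attrs", {}).items():
        flat[prefix + key] = value
    for name, group in node.get("groups", {}).items():
        flat.update(flatten_attrs(group, prefix + name + "/"))
    return flat


flat = flatten_attrs(SAMPLE)
for key in sorted(flat):
    print(key, "=", flat[key])
```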
yanliu-chn commented 8 years ago

@max-zilla To update the Dockerfile, you can replace the current nco build part with the following:

RUN cd /srv/downloads && \
    wget -q https://github.com/nco/nco/archive/4.6.2-beta01.tar.gz -O nco-4.6.2-beta01.tar.gz && \
    tar xfz nco-4.6.2-beta01.tar.gz && \
    cd nco-4.6.2-beta01 && \
    ./configure NETCDF_ROOT=/srv/sw/netcdf-4.4.1 --prefix=/srv/sw/nco-4.6.2-beta01 --enable-ncap2 --enable-udunits2 && \
    make && make install
ENV PATH="/srv/sw/nco-4.6.2-beta01/bin:${PATH}"
ENV LD_LIBRARY_PATH="/srv/sw/nco-4.6.2-beta01/lib:${LD_LIBRARY_PATH}"
robkooper commented 8 years ago

maybe make a NCO_VERSION variable so we can easily update it.

yanliu-chn commented 8 years ago

good idea.