IEA-Task-43 / digital_wra_data_standard

IEA Task 43: pre-construction energy estimate data standard repository
BSD 3-Clause "New" or "Revised" License

Adapt schema for remote sensing devices #36

Closed stephenholleran closed 3 years ago

stephenholleran commented 3 years ago

From @kersting extracted from Slack

All, I looked at some of the properties that we're capturing for remote sensing. One could use some of the properties as if they were loggers, but I believe they would be clearer if we had them as their own entities. We can either start our discussion here or wait until our next meeting.

    • Remote sensing technology (sodar, lidar, flidar)
    • Remote sensing vendor
    • Remote sensing model
    • Remote sensor serial number
    • Ownership (rental or owned units)
    • Special technology (such as Flow Complexity Recognition)
    • Power pack (solar panel, fuel cell)

Also

I would like to share a few properties that we have in our database and that may be interesting to include in the current data schema.

    • Project status (whether a project is under development or in operation).
    • Peer review status (a flag that can be used to track whether a new entry in the database has been approved by someone. This is important for ensuring that there has been some sort of data validation during the data entry process).

stephenholleran commented 3 years ago

@sdsmdp comment:

Thank you for sharing your thoughts, @kersting. I wasn't able to join our call y'day, but Leon was and we talked about this yesterday afternoon. We're glad to add support for remote sensing as well. (I also just realized we never invited Leon to join this Slack channel, so I've done so.) Regarding the project and review status properties: the schema includes a few fields flagged as being relevant to a database, but probably not passed around in JSON files. I see the status fields you have suggested as being like those. That is, I can see the value of those attributes in your database. However, I don't think the values would be passed around in the JSON with time series data. Do you agree, or do you see it differently?

stephenholleran commented 3 years ago

Notes from today's (12th Nov) call, attended by: @stephenholleran, @kersting, @joejoeyjoseph, @sdsmdp, @abohara

  1. Remote sensing technology (sodar, lidar, flidar) is the same as measurement_station_type.

    • [x] Action: Add flidar to list.  
  2. Remote sensor serial number, model and oem can be mapped in logger_main_config.

    • [x] Action: Edit description to account for these.  
  3. Project status, Power pack and Ownership are outside the scope of the metadata needed for an analyst to process wind resource data.

    Action: None.

  4. Peer review is a much bigger question and would depend on individual organizations' processes. A good UI would help with validation, so this will be revisited in the future.

    Action: None.

  5. Special technology (such as Flow Complexity Recognition): this is specific processing of data on the remote sensing device itself that some lidars perform. As this affects the actual time series data, it should be captured. It was decided not to add it to logger_main_config but to create a child of it that would contain lidar_config. Separating it out keeps logger_main_config concise and allows for further expansion of lidar_config in the future.

    • [x] Action: Create lidar_config child from logger_main_config.  
  6. lidar_window_height_m was mentioned as another attribute to consider. Unfortunately we ran out of time to discuss so will continue the discussion over Slack or here.

    Action: Continue discussion.

Edit: formatting

stephenholleran commented 3 years ago

I have added: "Leosphere", "ZX Lidars", "AXYS Technologies", "AQSystem",

to the below list of manufacturers.

"enum": [
    "NRG Systems",
    "Ammonit",
    "Campbell Scientific",
    "Vaisala",
    "SecondWind",
    "Kintech",
    "Wilmers",
    "Unidata",
    "WindLogger",
    "Leosphere",
    "ZX Lidars",
    "AXYS Technologies",
    "AQSystem",
    "Other"
]

Can anyone think of more?
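
As a rough illustration of how a validation tool might check a manufacturer name against the list above, here is a minimal Python sketch. The helper name `normalise_oem`, the case-insensitive matching and the fallback to "Other" are assumptions made for this example; they are not part of the schema.

```python
# Manufacturer names copied from the enum above.
LOGGER_OEMS = [
    "NRG Systems", "Ammonit", "Campbell Scientific", "Vaisala",
    "SecondWind", "Kintech", "Wilmers", "Unidata", "WindLogger",
    "Leosphere", "ZX Lidars", "AXYS Technologies", "AQSystem", "Other",
]


def normalise_oem(name: str) -> str:
    """Return the matching enum value, or "Other" if the name is not listed.

    Hypothetical helper: the case-insensitive match and the "Other"
    fallback are assumptions for illustration only.
    """
    lookup = {oem.lower(): oem for oem in LOGGER_OEMS}
    return lookup.get(name.strip().lower(), "Other")


print(normalise_oem("leosphere"))
print(normalise_oem("Some Unlisted Vendor"))
```

Keeping "Other" in the enum means an unlisted manufacturer can still be recorded without failing validation.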

kersting commented 3 years ago

@stephenholleran given that Vaisala has acquired Leosphere, wouldn't that be redundant?

stephenholleran commented 3 years ago

@stephenholleran given that Vaisala has acquired Leosphere, wouldn't that be redundant?

Yep, you are right, but I put it in there anyway. People still refer to it as Leosphere, and SecondWind is in the same boat. It also might be an old lidar from when it was just Leosphere, so it gives people options.

stephenholleran commented 3 years ago

@kersting @joejoeyjoseph @sdsmdp @abohara

  1. lidar_window_height_m was mentioned as another attribute to consider. Unfortunately we ran out of time to discuss so will continue the discussion over Slack or here.

Continuing this discussion. I don't think this attribute should be included in the new lidar_config table, as that table is more to do with how the logger of the device is configured. lidar_window_height_m is more like a mast property, or how the device is installed on site. Therefore I am unfortunately thinking of creating a new table called lidar_properties. See the spreadsheet in Google Docs or the screenshot below for what I am thinking. This could expand to capture device orientation or other properties?

[image: screenshot of the proposed lidar_properties table]

Edit: Or we don't have to include it at all?

kersting commented 3 years ago

@stephenholleran @joejoeyjoseph @sdsmdp @abohara it makes sense for me. Also, I'd like to add a couple of things. I'm glad you're thinking about device orientation because that is very important. In addition, instead of using the label lidar_properties for this table, we may want to use vertical_lidar_properties because in a nacelle mounted lidar the properties are different.

stephenholleran commented 3 years ago

instead of using the label lidar_properties for this table, we may want to use vertical_lidar_properties because in a nacelle mounted lidar the properties are different.

Good point. This is getting more complicated than we can include for version 0.1. OK, in the same vein as the mast properties, we will have a table called vertical_profiling_lidar_properties which will describe how the device is physically installed on site. Attributes it would contain are:

A consequence of this is that we should relabel the enums in measurement_station_type to be vertical_profiling_lidar instead of just lidar, which is fine.

We are going to save nacelle mounted lidars and scanning lidars for future versions of the Data Model.

Questions:

  1. Can anyone foresee any issues with this approach?
  2. Are there any differences between a vertical profiling lidar and a vertical profiling sodar? Can we use the same table for both?
  3. Any other properties to be captured? I have thought of tilt but I don't think it is important to capture. If the device isn't level you've got other problems.
  4. Can anyone think of more sodar or lidar manufacturers that are used in the wind industry?

abohara commented 3 years ago

@stephenholleran @kersting

1 & 2. For now, a single table for possible fields related to lidar (all types) or sodar seems reasonable. This would be analogous to how we have one mast table regardless of whether the mast is tilt-up tubular or lattice. Since not every field has to be filled, the table can accumulate all the fields, and we can break them down into appropriate tables once we have accumulated them.

  4. Pentalum is one I don't see in the list (I believe recently acquired by NRG).

AndyClifton commented 3 years ago

Hi folks, as commented already in slack - we're trying to generate an ontology / metadata schema within IEA Wind Task 32 on wind lidar: https://github.com/IEA-Wind-Task-32/wind-lidar-glossary. Rather than Task 43 generating their own schema for resource assessment, which is one of many use cases, would you be willing to contribute instead to the Task 32 approach? We have other efforts we can leverage as well, such as e-WindLidar and the Campaign Planning Tool that have already invested a lot of time in this (@niva83)

stephenholleran commented 3 years ago

From today's call: @kersting is to reach out to Vaisala to ask which part of the sodar device the wind speed measurement height is measured from. That is, what is the equivalent of the window on a lidar?

kersting commented 3 years ago

@stephenholleran at least for a Vaisala sodar, the base is where the measurements are referenced from, so when their sodar measures at 80 m it is actually 80 m, since the instrumentation is at the bottom of the device. If one installs the sodar on a 1 m platform then the measurements would be at 81 m.
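
The arithmetic in the sodar example above can be sketched as follows. The function and parameter names (`height_above_ground_m`, `base_height_m`) are hypothetical and chosen only for this illustration; the sketch assumes the device reports heights relative to its base.

```python
def height_above_ground_m(measurement_height_m: float,
                          base_height_m: float = 0.0) -> float:
    """Height above ground of a measurement referenced to the device base.

    measurement_height_m: height reported by the device (relative to its base).
    base_height_m: height of the device base above ground (hypothetical name),
                   e.g. the height of the platform the device sits on.
    """
    return measurement_height_m + base_height_m


# Sodar on the ground: an 80 m measurement is at 80 m.
print(height_above_ground_m(80.0))        # 80.0
# Same sodar on a 1 m platform: the measurement is at 81 m.
print(height_above_ground_m(80.0, 1.0))   # 81.0
```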

stephenholleran commented 3 years ago

Hi @AndyClifton, @kersting, @abohara and anyone else,

Thanks again Andy for joining our call yesterday.

Based on the discussions we are going to pick out the remote sensing specific terms and link them to the https://data.windenergy.dtu.dk/ontologies/view/IEATask32Glossary/en/ ontology where these terms are already defined. This will help with future collaboration when we all have the same understanding of the terms.

First up, some physical installation properties. We have a table (as in a relational database table) called mast_properties which covers some physical aspects of the mast structure itself, like its physical installed height. We have created a table called vertical_profiler_properties in this same vein to capture the terms listed in the table below. We are just focusing on vertical profiler lidars and sodars, as these are more common in pre-construction wind resource assessments. Scanning lidars and nacelle mounted lidars can have their own table(s) in the future.

| Term | Definition |
| --- | --- |
| window_height_m | "The window height of the lidar device. This is usually the height of window above ground level however it may be above sea level or above a platform level." |
| device_base_height_m | "The height of the base of the remote sensing device e.g. above ground level. This is usually the height above ground level at which the remote sensing device is mounted however it may be above sea level or above a platform level." |
| height_reference_id | "The height reference frame that is used to measure the height of the window. E.g. onshore this is ground level i.e. the window is 0.5 m above ground level. Offshore is a bit different as it can be 20 m above mean sea level or 20 m above lowest astronomical tide." |
| device_orientation_deg | "The orientation that the remote sensing device is installed relative to north." |
| orientation_reference_id | "The orientation reference the remote sensing device is measured against. E.g. magnetic north." |
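
To make the terms in the table above more concrete, here is a minimal Python sketch of what one vertical_profiler_properties record might look like when serialised to JSON. Only the field names come from the table; all values, the reference identifiers (`"ground_level"`, `"magnetic_north"`) and the 0 to 360 degree range check are assumptions for illustration.

```python
import json

# Hypothetical record: field names are from the table above, values are made up.
vertical_profiler_properties = {
    "window_height_m": 0.5,
    "device_base_height_m": 0.0,
    "height_reference_id": "ground_level",         # assumed identifier
    "device_orientation_deg": 0.0,
    "orientation_reference_id": "magnetic_north",  # assumed identifier
}


def check_orientation(record: dict) -> None:
    """Raise ValueError if the orientation is outside 0 to 360 degrees.

    The allowed range is an assumption, not something the schema states.
    """
    deg = record["device_orientation_deg"]
    if not 0.0 <= deg <= 360.0:
        raise ValueError(f"device_orientation_deg out of range: {deg}")


check_orientation(vertical_profiler_properties)
print(json.dumps(vertical_profiler_properties, indent=2))
```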

Questions and notes so far:

  1. Is vertical_profiler the appropriate term to refer to these types of lidars and sodars?

  2. As discussed on the call yesterday window_height_m for lidars and device_base_height_m for sodars can be combined and called device_datum_plane_height_m?

  3. Is device_orientation_deg an appropriate term to refer to how the device is orientated to North?

nikokaoja commented 3 years ago

@stephenholleran, be aware that due to a temporary issue with SSL certificates, the ontology is accessible over HTTP: http://data.windenergy.dtu.dk/ontologies/view/IEATask32Glossary/en/

I will ping you all when it becomes accessible on HTTPS

stephenholleran commented 3 years ago

Second is lidar specific configuration (programming) of the device.

We have a table called logger_main_config where "This represents how the logger's main settings are configured. For example, it's sampling rate or averaging period. For remote sensing devices, such as lidar's, the device itself is considered as a logger and so these logger configuration attributes should be used to describe the lidar." It also includes the OEM, model and serial number which we think can cover the lidar equivalent.

We have created a table called lidar_config which is linked from the logger_main_config table. This is to capture any lidar specific configuration or programming. It only contains 1 term.

| Term | Definition |
| --- | --- |
| flow_corrections_applied | "Is there any flow corrections applied to the measured data by the lidar unit, e.g. FCR for WindCubes?" |

Questions and notes so far:

  1. Is flow_corrections an appropriate term? The full term we use indicates that it is a Boolean as it is either on or off.

The poor relational diagram in Google Sheets might help link all of these together.
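
The relationship described above, with lidar_config as a child of logger_main_config, could be sketched in Python as nested data. This is only an illustration: the key names `oem`, `model` and `serial_number`, the example values, and the choice to nest lidar_config as a single object are assumptions; only `logger_main_config`, `lidar_config` and `flow_corrections_applied` come from the discussion.

```python
from typing import Optional

# Hypothetical logger_main_config entry for a lidar treated as a logger.
logger_main_config = {
    "oem": "Leosphere",               # assumed key name
    "model": "WindCube",              # assumed key name and value
    "serial_number": "EXAMPLE-0001",  # placeholder value
    "lidar_config": {
        "flow_corrections_applied": True,
    },
}


def flow_corrections_applied(config: dict) -> Optional[bool]:
    """Return the Boolean flag, or None when no lidar_config is present."""
    lidar_config = config.get("lidar_config")
    if lidar_config is None:
        return None
    return lidar_config.get("flow_corrections_applied")


print(flow_corrections_applied(logger_main_config))
```

Returning None rather than False for a missing lidar_config keeps "not stated" distinct from "no corrections applied", which matters because the flag affects how the time series is interpreted.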

AndyClifton commented 3 years ago

Folks - some definitions that might help:

  1. Measurement height: http://data.windenergy.dtu.dk/controlled-terminology/IEAWindTask32/parameters.measurement_height.
  2. Datum plane, feature, and elevation: http://data.windenergy.dtu.dk/ontologies/view/IEATask32Glossary/en/page/parameters.

AndyClifton commented 3 years ago

Re flow corrections. We avoid using this term as much as possible as it implies that the lidar/sodar is "wrong" in complex flow. This is incorrect as both masts and lidars/sodars have challenges!

It would be helpful to use terminology like transfer methods (§2.3 of https://zenodo.org/record/3862384), although we've also seen recently that e.g. CFARS likes correction techniques and flow correction too (see e.g. https://zenodo.org/record/4302363). We'll probably use transfer methods as the preferred label, with correction techniques and flow correction as the alternates. That way both can be tracked.

Anyway, that would give you flow_corrections_applied as something you could record and it would have some relatable meaning.

stephenholleran commented 3 years ago

Folks - some definitions that might help:

  1. Measurement height: http://data.windenergy.dtu.dk/controlled-terminology/IEAWindTask32/parameters.measurement_height.
  2. Datum plane, feature, and elevation: http://data.windenergy.dtu.dk/ontologies/view/IEATask32Glossary/en/page/parameters.

Hi @AndyClifton this is great, thanks. Not sure how I didn't find these when I was looking previously.

I think it makes sense that we use datum_plane_height_m to cover both window_height_m and device_base_height_m that we currently have? I see that you have datum elevation but we don't use elevation taken from a device or installation report for either lidars or met masts. When we are doing the flow modelling the elevation comes from the terrain data we inputted into the model. We always check that it somewhat matches but we never use elevation directly. Also, measuring elevation isn't that accurate from some GPS devices and can cause confusion. We are more interested in how high the datum plane is above ground level (or sea level for offshore).

The measurement height definition is good. This maps to our height_m within the sensor_config i.e. it is the height that is programmed into the logger. I will add more to our definition and link to this.

Just a note on your own definitions: Your datum feature definition and example combined are great. This explains it very clearly. I would have thought that the datum plane would refer back to this but there is no mention of datum feature? E.g. it is the horizontal plane that passes through the datum feature and from which the measurement height is defined.

nikokaoja commented 3 years ago

@stephenholleran ideally you would aim at decoupling your complex concepts such as datum_plane_height_m (you are mixing the label of the concept with the unit for the concept, etc.) into atomic but semantically self-described components, and interconnecting them as we are doing in the Controlled Vocabulary of Wind Energy Parameters and the IEA Task 32 ontology.

For example, you might consider having the concept datum_plane_height, which has property prefUnit set to m. This is how a similar thing was done for the concept wind_speed in Controlled Vocabulary of Wind Energy Parameters:

http://data.windenergy.dtu.dk/controlled-terminology/wind-energy-parameters/wind_speed

The above two controlled terminologies are provided as FAIR and machine-actionable semantic artifacts (i.e., we serve them to humans and machines). Each concept and each of its properties (for example prefUnit) has a GUPRI (globally unique, persistent and resolvable identifier), which means that if you access it as a human you get a web page, and if you are a computer agent you get a machine-actionable representation of that resource (e.g., TTL, JSON-LD, XML, etc.). Try curl in the terminal to see what an agent can get:

curl -L -v -X GET -H "Accept: text/turtle" "http://data.windenergy.dtu.dk/controlled-terminology/IEAWindTask32/instances"

To achieve the above we have set up OntoStack on a DTU web server. It contains a number of interfaced microservices specially tailored for building, maintaining, and serving machine-actionable controlled terminologies (vocabularies, taxonomies, ontologies, etc.).

Similar to Task 32, you should consider making use of OntoStack. Currently, we are spinning it up at the national level (Denmark). If you choose to go in this direction then we can create so-called semantic rings, where concepts from the IEA Task 32 ontology or other ontologies point to yours, and vice versa.

You can see how we do this in the wind_speed example by finding property skos:exactMatch, which is rendered at the web site as EXACTLY MATCHING CONCEPTS. In this example we are pointing back to CF Standard Name hosted by mmisw.org.
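
For anyone who prefers Python to the terminal, the curl request earlier in this comment can be approximated with the standard library. This is a sketch only: the function name `build_turtle_request` is made up for this example, and nothing is assumed here about what the server returns.

```python
from urllib.request import Request

# URL taken from the curl command earlier in this comment.
INSTANCES_URL = (
    "http://data.windenergy.dtu.dk/controlled-terminology/"
    "IEAWindTask32/instances"
)


def build_turtle_request(url: str = INSTANCES_URL) -> Request:
    """Build a GET request asking for a Turtle (text/turtle) representation."""
    return Request(url, headers={"Accept": "text/turtle"}, method="GET")


req = build_turtle_request()
# Inspect the request without sending it.
print(req.get_method(), req.full_url)
print(req.get_header("Accept"))
```

To actually fetch the resource, pass the request to `urllib.request.urlopen(req)` and read the response body; `urlopen` follows redirects by default, much like curl's `-L` flag.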

stephenholleran commented 3 years ago

@stephenholleran ideally you would aim at decoupling your complex concepts such as datum_plane_height_m (you are mixing the label of the concept with the unit for the concept, etc.) into atomic but semantically self-described components, and interconnecting them as we are doing in the Controlled Vocabulary of Wind Energy Parameters and the IEA Task 32 ontology.

For example, you might consider having the concept datum_plane_height, which has property prefUnit set to m. This is how a similar thing was done for the concept wind_speed in Controlled Vocabulary of Wind Energy Parameters:

Hi @niva83, thanks for your feedback. You are absolutely right about decoupling the concept and the unit. It is the proper thing to do. However, we decided to force users to convert their data to a standard unit. Allowing a user to specify their own units would mean we would need an extra field coupled with every concept. There are already so many fields for a user to fill in that adding more wasn't desirable. We would also have to cover all possible options, e.g. for height we would have to have options for m, mm, cm, km, inches, feet, yards, etc. We felt that this was too much of a headache to deal with, both for users and on the coding side when building validation and automation tools.
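
To illustrate what "convert to a standard unit" means in practice, here is a minimal sketch of a converter a user might run before filling in a field such as height_m. The function name `to_metres` and the unit labels are assumptions for this example; the conversion factors are the standard definitions.

```python
# Conversion factors to metres (exact by definition).
METRES_PER_UNIT = {
    "m": 1.0,
    "mm": 0.001,
    "cm": 0.01,
    "km": 1000.0,
    "inches": 0.0254,
    "feet": 0.3048,
    "yards": 0.9144,
}


def to_metres(value: float, unit: str) -> float:
    """Convert a height to metres; raise ValueError for an unknown unit."""
    try:
        factor = METRES_PER_UNIT[unit]
    except KeyError:
        raise ValueError(f"Unsupported unit: {unit!r}") from None
    return value * factor


# A height recorded in feet is converted once, then stored in metres.
height_m = to_metres(100.0, "feet")
print(round(height_m, 2))
```

With the conversion done up front, the schema needs no unit field per concept and validation tools only ever see metres.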

The above two controlled terminologies are provided as FAIR and machine-actionable semantic artifacts (i.e., we serve them to humans and machines). Each concept and each of its properties (for example prefUnit) has a GUPRI (globally unique, persistent and resolvable identifier), which means that if you access it as a human you get a web page, and if you are a computer agent you get a machine-actionable representation of that resource (e.g., TTL, JSON-LD, XML, etc.). Try curl in the terminal to see what an agent can get:

curl -L -v -X GET -H "Accept: text/turtle" "http://data.windenergy.dtu.dk/controlled-terminology/IEAWindTask32/instances"

To achieve the above we have set up OntoStack on a DTU web server. It contains a number of interfaced microservices specially tailored for building, maintaining, and serving machine-actionable controlled terminologies (vocabularies, taxonomies, ontologies, etc.).

Similar to Task 32, you should consider making use of OntoStack. Currently, we are spinning it up at the national level (Denmark). If you choose to go in this direction then we can create so-called semantic rings, where concepts from the IEA Task 32 ontology or other ontologies point to yours, and vice versa.

You can see how we do this in the wind_speed example by finding property skos:exactMatch, which is rendered at the web site as EXACTLY MATCHING CONCEPTS. In this example we are pointing back to CF Standard Name hosted by mmisw.org.

That is great and looks like quite a lot of work has been put into it. For a version 1.0 of this Data Model that would be something we could do as well as split the concept from the unit as mentioned above. We want to get traction in the industry first with what we have so we will know whether it is worth ploughing more resources into. Thanks.