ioos / system-test

IOOS DMAC System Integration Test project
github.com/ioos/system-test/wiki
The Unlicense

Summarize prior work on Marine Conflict Resolution/Energy siting for possible use as theme 3 #11

Closed dpsnowden closed 10 years ago

dpsnowden commented 10 years ago

Energy/wind farm and other offshore siting issues: we are considering this topic as a third theme. Hannah will summarize some of her earlier work to help us understand the theme and determine whether we can design enough scripts/tests to make this a viable subject area.

She has several documents from her dissertation work to share.

hdean83 commented 10 years ago

As a starting point for thinking about energy siting within CMSP as a potential third theme, here's a short paper emphasizing that the driver behind many state-based CMSP efforts has been the siting of energy.

The InVEST Model for Wind Energy Siting might be a good place to gather required data sets (see the "Required input" section in the link). Looking through it, however, I would argue that it could also be useful to add the data sets required for permitting (e.g. migratory birds, marine mammals, fish abundance/species).

rsignell-usgs commented 10 years ago

I was thinking today that it would be great to see if aggregated met data (CFSR, NAM, GFS) are accessible via geoportal (or some other catalog service). Seems like this would be relevant to this task, facilitating the eventual comparison of lower-resolution CFSR winds that span decades and higher-resolution IOOS (or other) models that span a few years with available data.

hdean83 commented 10 years ago

I talked with a friend at NOAA's Policy Planning and Analysis Division at NOS over the weekend, and she had a few suggestions on resources to look over, including Tethys, which has some documentation on ocean energy pre and post monitoring/information focuses. Oregon state planning is one of the case studies highlighted on this site, and was one of five states I took a look at as part of my graduate work. The paper hasn't been published, but I've presented the work at a few different poster sessions, and the draft can be found here.

As the Tethys site emphasizes, Oregon planning has been driven by a pretty aggressive energy portfolio policy. One thing that is often missing is a decision-making process for turning data layering into a ranking in cases where there are use conflicts. Oregon has also engaged in a STAC Review of Oregon Marine Planning Data which could serve as a potential road map (see the Final Report for more detail). The Offshore Renewable Energy Primer produced by the Sea Grant might be another way to drive a theme framed by the question of what data sets would be needed to site offshore energy - but this might also run into jurisdictional boundary issues, which is actually what my Master's was on.

I could also imagine building on the Five States paper referenced above by surveying Ocean Planning Agencies in the five states with questions along the lines of: "What federal data sets do you use on a regular basis for ocean energy planning purposes?" and "What aspects of each of those data sets are the most important for your work?" Compiling that list, and then developing some metrics based on the second question, might be a good approach.

I also talked to Nick Rome and he suggested looking back at Use Cases developed by DMAC (I hadn't seen this before). See also, Use Cases Wiki.

Apologies if I jump ahead of myself, or if these are discussions/thoughts that you've already had. I'll try to keep absorbing DMAC documentation and work to avoid duplicating ideas/approaches/reinventing the wheel.

emiliom commented 10 years ago

Sarah: I know several of the Oregon CMSP data / STAC Review people pretty well. They're terrific. I'm thinking specifically of Tanya Haddad and Andy Lanier, from the Oregon Ocean-Coastal Management Program. Actually, I may be seeing them later this week at a NANOOS meeting in Portland. Anyway, I'd be happy to connect you with them if you'd like.

rsignell-usgs commented 10 years ago

Who is Sarah?

emiliom commented 10 years ago

Multi-tasking is hard! I meant Hannah. So sorry, Hannah!!

hdean83 commented 10 years ago

Emilio, no trouble at all - I am not particular about names. I'm not sure we've cemented CMSP/energy siting as a third theme - but if others think that it's a good idea, maybe you could put me in touch with the Oregon Program people? Thanks!!

emiliom commented 10 years ago

Hannah: Let me know if/when you'd like me to put you in touch with these colleagues. I was in a meeting in Portland with one of them (Tanya Haddad) earlier today, and already gave her a heads up.

hdean83 commented 10 years ago

Emilio, I think I'm ready to be put in contact and maybe have a short conversation with Tanya if she has time. I've been looking over CMSP planning documents with an eye towards mapping data and services to the policy behind energy siting.

On that same note, I was wondering if the group might be open to using something like Lucidchart to create visuals that would tie policy to the data. There is a pretty small cost for a Team Membership, which would be most in line with the type of workflow that this team uses, and it looks like it would be really useful for standardizing and visualizing the work on this project.

hdean83 commented 10 years ago

Here's an example of what I'm thinking of in terms of mapping data to policy. I put this together this afternoon based on a read-through of the Coastal and Estuarine Land Conservation Program Final Program Guidelines 2003, with a focus on the data needs that I could pull out from the policy. The next step after creating a map would be creating a notebook that would test the CSWs for the layers needed, which might involve first creating and thinking about each of the blue and aqua data boxes/layers separately. I suspect that a notebook would demonstrate that some of these layers exist and others do not - reflecting a disconnect between policy demands and data availability, and also showing the types of data where the real burden is placed on state/local planners.

rsignell-usgs commented 10 years ago

I don't see any IOOS sensor or modeling data in that example.
Did I miss it? Or is that the point? To branch out into other non-traditional IOOS data?

hdean83 commented 10 years ago

So, the map would be a first level - that is the language that's in the policy document regarding data. I'd then break it down into charts mapping both IOOS and non-IOOS data to each data layer (each blue or aqua box). Ultimately that would demonstrate gaps in the IOOS data system with respect to a series of ocean/coastal planning policies. I chose CELC as an example - we could choose this or another coastal/marine planning policy to build out as a third theme.

This approach might create a model for IOOS analysis of policy related to data layers, if that makes sense - with an associated notebook or set of notebooks for the given policy in order to test layer availability within IOOS, and also demonstrate those layers that are still outside the IOOS system or just not available through federal registries (e.g. data on local or state regional watershed plans). Does that make sense? Or is that biting off too much?
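A minimal sketch of the gap report described above, assuming each policy data layer has already been looked up and mapped to the IOOS and non-IOOS sources that serve it. The layer names, source names, and the `classify_layers` helper are all hypothetical placeholders; none of them come from the policy diagram.

```python
from typing import Dict, List

# Hypothetical mapping from policy data layers to the sources found to
# serve them. Layer and source names are illustrative placeholders only.
LAYER_SOURCES: Dict[str, Dict[str, List[str]]] = {
    "sea surface temperature": {"ioos": ["example-ioos-catalog"], "other": []},
    "marine mammal sightings": {"ioos": [], "other": ["example-federal-registry"]},
    "regional watershed plans": {"ioos": [], "other": []},
}


def classify_layers(layer_sources: Dict[str, Dict[str, List[str]]]) -> Dict[str, List[str]]:
    """Sort each policy layer into one of three availability buckets."""
    report: Dict[str, List[str]] = {"in_ioos": [], "outside_ioos": [], "not_found": []}
    for layer, sources in sorted(layer_sources.items()):
        if sources.get("ioos"):
            report["in_ioos"].append(layer)       # served from within IOOS
        elif sources.get("other"):
            report["outside_ioos"].append(layer)  # available, but not via IOOS
        else:
            report["not_found"].append(layer)     # no registry advertises it
    return report


if __name__ == "__main__":
    for bucket, layers in classify_layers(LAYER_SOURCES).items():
        print(f"{bucket}: {', '.join(layers) or '(none)'}")
```

The three buckets correspond to layers available within IOOS, layers available only elsewhere, and layers that no searched registry advertises.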

rsignell-usgs commented 10 years ago

I think I need to have a chat with you and Derrick. I feel out of the loop.

hdean83 commented 10 years ago

I was just throwing this out as an idea as a way to visualize and organize the energy siting idea - it seems like working off of individual policies that would be involved in offshore energy siting might be a way of breaking it down, given the many policies involved in energy siting. But, it's just an idea.

jkupiec commented 10 years ago

Apologies if I have misinterpreted this thread. I think it's a good idea that needs to be discussed and evaluated to ascertain whether it fits within Derrick's understanding of the project scope. It's a good way to treat availability of data and metadata from a macro, policy-oriented perspective. And it seems like a perspective that a user might have. But I have a feeling that major aspects of this approach may place it out of scope. We need to know if the metadata that registries say they advertise is indeed available in the formats they claim, and that the data repositories cataloged by the registry metadata can be successfully accessed. I don't know that identifying metadata and repository data outside of IOOS is germane to this effort. It certainly is germane in the real world. But we first have to answer the question, "Does the damn thing work" before anyone can use it to find out what might and what might not be available.

(Sent by email in reply to hdean83's comment of Tue, Feb 25, 2014 at 4:03 PM, https://github.com/ioos/system-test/issues/11#issuecomment-36057716.)

emiliom commented 10 years ago

Regardless of where this topic goes (if anywhere) ... Hannah, I'm pulling Tanya Haddad @tchaddad here. I know she has a major project deadline coming up in the next 10 days or so, but you can follow up with her directly.

Also, if you do decide to pursue this topic with Oregon, the new West Coast Ocean Data Portal (http://portal.westcoastoceans.org) probably has relevant GIS (static) datasets from OR (again, most of them provided by Tanya's office), including WMS and other geospatial services. This catalog has a Geoportal backend, so it's available via CSW queries; I can provide sample notebooks querying that CSW catalog. See here for connection info: http://portal.westcoastoceans.org/connect/ And here's a sample user query for "oregon" and "energy", to give you an idea of datasets available: http://portal.westcoastoceans.org/discover/#?text=energy%20oregon Note that this portal is an aggregator catalog; it hosts no data directly. Also, we'd love input on how to make the web service advertisement in the CSW response more standard-compliant, or consistent with IOOS and NGDC.
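As a rough illustration of what a CSW query for "oregon" and "energy" could look like, here is a sketch that only builds a CSW 2.0.2 GetRecords request URL using the standard library. The endpoint below is a hypothetical placeholder (the real one is on the portal's connect page), `build_getrecords_url` is an illustrative helper, and whether a given server accepts KVP GET requests with CQL text constraints is an assumption to verify against its capabilities document.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint: the real CSW URL is listed on the portal's
# "connect" page and is not reproduced in this thread.
CSW_ENDPOINT = "http://example.org/csw"


def build_getrecords_url(endpoint: str, terms: list, max_records: int = 10) -> str:
    """Build a CSW 2.0.2 GetRecords KVP request that matches all terms."""
    # AnyText is the CSW full-text queryable; the terms are ANDed together.
    constraint = " and ".join(f"AnyText like '%{term}%'" for term in terms)
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "elementSetName": "summary",
        "resultType": "results",
        "constraintLanguage": "CQL_TEXT",
        "constraint_language_version": "1.1.0",
        "constraint": constraint,
        "maxRecords": str(max_records),
    }
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    url = build_getrecords_url(CSW_ENDPOINT, ["energy", "oregon"])
    print(url)
    # Decode the query string again to show the constraint that was sent.
    print(parse_qs(urlsplit(url).query)["constraint"][0])
```

Building the URL separately from fetching it keeps the query itself easy to inspect and to reuse against other catalogs.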

hdean83 commented 10 years ago

If I end up using Lucidchart, one outcome could be a series of policy/data map products for each Federal Statute involved in Energy Siting (see also Planning for Offshore Energy Development 2013). Alternatively, I could use an Energy Siting EIS as a starting point and create a notebook that pulls down datasets available from the IOOS system and tests them for the metrics that we decide on for the baseline theme notebook (see Section III, Description of the Affected Environment, for data examined in the linked EIS; see also Searchable EIS Statements).

dpsnowden commented 10 years ago

@emiliom I added the WCGA portal to our running list of candidate metadata portals wiki/Service-Registries-and-Data-Catalogs and I also added the noaa.data.gov CSW interface provided by Micah.

@jkupiec Regarding two statements you made, I disagree.

  1. "We need to know if the metadata that registries say they advertise is indeed available in the formats they claim, and that the data repositories cataloged by the registry metadata can be successfully accessed."
    • Not really. We will ascertain through the course of this test the answer to questions like: do metadata registries have records with accurate references to services? We don't know this ahead of time, we figure it out by trying to access the records. If we cannot, then we log the issue and move on and if we can, then we attempt to use the data to answer the question.
  2. "I don't know that identifying metadata and repository data outside of IOOS is germane to this effort."
    • I think this question is within the scope of IOOS. That is very different than saying all data/metadata relevant to Hannah's question is/should be in the IOOS registry. The questions should drive the registry choice not the other way around.
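The "try to access, log the issue, move on" loop described in point 1 can be sketched as below. This is only an illustration under stated assumptions: `survey_records` and `fake_access` are hypothetical names, and the access check is injected as a function so the sketch runs without a network; a real notebook would substitute an actual service request.

```python
from typing import Callable, Iterable, List, Tuple


def survey_records(
    records: Iterable[Tuple[str, str]],
    try_access: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Try each (record_id, service_url); log failures and move on."""
    usable: List[str] = []
    issues: List[str] = []
    for record_id, url in records:
        try:
            ok = try_access(url)
        except Exception as exc:  # any failure is a finding, not a crash
            issues.append(f"{record_id}: {url} raised {type(exc).__name__}: {exc}")
            continue
        if ok:
            usable.append(record_id)
        else:
            issues.append(f"{record_id}: {url} did not respond as advertised")
    return usable, issues


def fake_access(url: str) -> bool:
    """Hypothetical stand-in for a real network check."""
    if "timeout" in url:
        raise TimeoutError("no response")
    return url.endswith("/ok")


if __name__ == "__main__":
    sample = [
        ("a", "http://example.org/ok"),
        ("b", "http://example.org/broken"),
        ("c", "http://example.org/timeout"),
    ]
    usable, issues = survey_records(sample, fake_access)
    print("usable:", usable)
    for issue in issues:
        print("issue:", issue)
```

Records in `usable` go on to the data question; entries in `issues` become candidate bug reports for the registry or service owner.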

@hdean83 I think that using graphics as a way of organizing the questions and presenting your ideas is good. I'm out of my depth in the policy arena so I'm like Rich in struggling to visualize where this is going, though I think you've uncovered something with potential. Can you organize a few simple questions to help us see what the queries might look like and what sort of data product you'd hope to generate? Don't make a master notebook to do it all, just show us one simple example in words, then we can translate to Python. Also, try to look for things that emphasize ocean observations/models. It looks like this particular example of CELC plans is creeping further up onto dry land than we feel comfortable with.

jkupiec commented 10 years ago

Then how do we know whether a test has been executed successfully? Getting something as a result of a script does not mean that what we get is what we expected to get or intended to get. To put it in question form: When we get a result, how do we know whether the result is germane to the questions included in the script? How do we know that what we got is everything that the queried source has that is germane to our request/query?

If we say that any meaningful result constitutes success, then what sort of bug fix requests might be generated, other than "Query returned no data"?

(Sent by email in reply to Derrick Snowden's comment of Thu, Feb 27, 2014 at 10:19 AM, https://github.com/ioos/system-test/issues/11#issuecomment-36252165.)

dpsnowden commented 10 years ago

OK I see your point now. This is why it's important to start getting more questions on paper so that we can begin to find people to help answer what you're getting at. The four of us are still struggling to understand the objectives of this project so we really need to nail this down together. We need to be able to describe this so that others know what we're doing and how they can contribute and I don't think we're there yet.

Here's how I think of your question.

First a statement: We will never know the definitive size of the data universe so we'll never be able to say with 100% confidence that we got all the data.

But that's not to say there aren't questions we can answer that are useful. For well-constructed questions, especially the simple ones, the answer will be self-evident. If the question is, can I plot modeled data on top of observational data, then the answer is a fairly obvious judgement call: yes or no. If your script returned at least one modeled data set and at least one observational data set and you were able to make a plot, then your question was answered. What if the question is much more specific: can I plot the CO-OPS observed water levels for the last ten days on top of the water levels forecast by the ESTOFS model for the same time period? It takes a much more involved script to answer this, but again, I think the answer is a fairly straightforward yes/no. One final variation might be: can I plot all observed water levels for New Jersey coastal stations on top of the ESTOFS model output? We can't answer this definitively. We can try as many registries as we know about and collect as much observational data as we can find. But our final analysis might only be able to say we searched 10 places and found 20 stations and we could plot 14 of those stations. The next step is to figure out why we could only plot 14 instead of 20.
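The "searched 10 places, found 20 stations, plotted 14" style of result could be recorded with something like the sketch below. The `SurveyResult` class and its field names are hypothetical, as are the station IDs and failure reasons in the usage lines; the counts simply mirror the example in the paragraph above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SurveyResult:
    """Outcome of one question asked across several registries."""

    registries_searched: int
    stations_found: List[str]
    # station id -> reason it could not be plotted
    plot_errors: Dict[str, str] = field(default_factory=dict)

    @property
    def plotted(self) -> List[str]:
        """Stations that were found and plotted without error."""
        return [s for s in self.stations_found if s not in self.plot_errors]

    def summary(self) -> str:
        return (
            f"searched {self.registries_searched} registries, "
            f"found {len(self.stations_found)} stations, "
            f"plotted {len(self.plotted)}; "
            f"{len(self.plot_errors)} to investigate"
        )


if __name__ == "__main__":
    # Made-up station ids and failure reasons, for illustration only.
    stations = [f"station-{i:02d}" for i in range(20)]
    errors = {s: "hypothetical failure" for s in stations[:6]}
    print(SurveyResult(10, stations, errors).summary())
```

Keeping the per-station failure reason alongside the counts gives the follow-up step (why only 14 of 20?) a concrete list to work through.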

I think of it like the census. We construct enough simple questions to make general statements about the state of this data distribution network. We also come up with a few (or many) specific things that need to be fixed. We don't count every person (or data set). We're relying on probability a little bit.

I'm open to suggestions on altering this viewpoint but I really need it on paper somewhere so that we can communicate about it consistently.

jkupiec commented 10 years ago

Sounds good. After I posted my last comment and questions, I began to consider the project in different terms, not as a classical systems test, but as a project of discovery, in which we perform a survey of known registries and attempt to access repositories identified by the registries, and in doing so, see what we can see. If the results look reasonable, then we note that there is basic functionality for discovery, access and use. I looked at Rich's recent Issue-related messages, and it appears that he is undertaking just such a survey.

Perhaps the following are some of the more general, less specific questions our project is asking:

Are there resources out there which we can use to discover, access and use data collected by NOAA, Geological Survey, and other members of the IOOS community? Obviously, the answer is yes.

Can we use commonly used clients to successfully discover, access and use data from IOOS community resources? In most cases, the current answer is yes. Our objective is that by the end of this project we can answer yes in all cases.

Can we shape our discovery, access and use queries such that we can obtain data that are useful in addressing particular scientific and policy questions? We are beginning to answer this question at a more granular level. An objective of this project is to expand the "yes" from a granular level to a more general level.

Can we offer the science and policy communities a set of clients that we know produce "good" data? We are in the process of doing so.

Can we assure the science and policy communities that we have worked with registry and repository owners to ensure that the metadata and data they accumulate and conserve are generally accessible and retrievable with the clients we have used? I see Rich and Hannah working on that objective right now.

Let me know whether you think my perceptions are accurate.

Bottom Line: I think that we need to get away from using the word "test," and use terms more closely associated with census and survey.

(Sent by email in reply to Derrick Snowden's comment of Thu, Feb 27, 2014 at 12:22 PM, https://github.com/ioos/system-test/issues/11#issuecomment-36266811.)

hdean83 commented 10 years ago

@dpsnowden I created the following OCSLA diagram. I based this off of the 2012-2017 Programmatic Environmental Impact Statement. I think I can improve it, but wanted to provide a response.

Basically, the question for this policy would be: Can we create a notebook that pulls down the data sets outlined in the EIS and in the diagram from the available registries, using the Gulf of Mexico and/or the Chukchi/Beaufort Sea as bounding boxes, given that these are on the 2012-2017 Lease Sale Schedule?
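A small sketch of the bounding-box step, assuming each catalog record exposes a geographic extent. The coordinates below are rough, illustrative extents chosen for this example and are not taken from the EIS or the lease sale schedule; `BBox`, `REGIONS`, and `regions_for` are hypothetical names.

```python
from typing import Dict, List, NamedTuple


class BBox(NamedTuple):
    west: float
    south: float
    east: float
    north: float


# Rough, illustrative extents in decimal degrees; NOT taken from the EIS.
REGIONS: Dict[str, BBox] = {
    "gulf_of_mexico": BBox(-98.0, 18.0, -80.0, 31.0),
    "chukchi_beaufort": BBox(-170.0, 68.0, -125.0, 76.0),
}


def intersects(a: BBox, b: BBox) -> bool:
    """True if two boxes overlap; assumes neither crosses the antimeridian."""
    return (
        a.west <= b.east
        and b.west <= a.east
        and a.south <= b.north
        and b.south <= a.north
    )


def regions_for(dataset_bbox: BBox) -> List[str]:
    """Names of the lease regions whose box overlaps the dataset's extent."""
    return [name for name, box in REGIONS.items() if intersects(dataset_bbox, box)]


if __name__ == "__main__":
    print(regions_for(BBox(-90.0, 25.0, -88.0, 27.0)))
```

In practice the same boxes could also be passed to the catalog as a spatial filter, so that only records overlapping a lease region come back in the first place.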

dpsnowden commented 10 years ago

Link to the picture is broken. It looks like it points to something on your computer, not on the web. You should be able to just drag and drop the picture onto the comment to upload.

hdean83 commented 10 years ago

Sorry about that - it should be accessible now.

hdean83 commented 10 years ago

I'm amending language on this theme coming out of the Regional IOOS meeting and will work on summarizing the documents I've been looking at in a way that will emphasize the Ecosystem aspects.

jkupiec commented 10 years ago

Unless anyone has objections, I think sufficient work has been done on this theme and its scenarios to merit closure.