ACDguide / Governance

Main repository for Climate data guidelines working group
https://acdguide.github.io/Governance

adding section on Creating data products #73

Open paolap opened 11 months ago

paolap commented 11 months ago

Application-ready data: data products finalised so that they are usable in other disciplines, by stakeholders, etc., and therefore tailored to the intended users' needs. This might mean converting existing data to suitable formats, adding extra quality control, and changing how the data is packaged for different uses. Because the "clients" are non-experts, the product should be user-centric, with errors and quality well documented.

hot007 commented 11 months ago

From @AliciaTak , might be good to rewrite in our own words (hah!) and include!

I asked ChatGPT how to create application-ready RCM data, and here are the steps it gave me 😊

Step 1: Data Collection and Download

  • Identify the RCM simulations or datasets relevant to your research or application.
  • Obtain the RCM data from reputable sources, such as climate modeling centers or research institutions. Ensure that the data aligns with your study area, resolution, and time period of interest.

Step 2: Data Preprocessing and Quality Control

  • Check and handle missing values, inconsistencies, or outliers in the RCM data.
  • Perform quality control checks to ensure the integrity and reliability of the data.
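As a rough illustration of the checks in Step 2, here is a minimal sketch that flags missing values and outliers in a single time series. The function name `flag_suspect_values`, the fill value `-9999.0` and the cut-off of 3.5 are assumptions made for this example, not something taken from the thread. It uses a median-based (modified z-score) test, which is only one of many possible outlier checks.

```python
import math
import statistics


def flag_suspect_values(values, fill_value=-9999.0, threshold=3.5):
    """Return one flag per value: 'missing', 'outlier' or 'ok'.

    fill_value and threshold are illustrative defaults (assumptions).
    """
    def is_missing(v):
        # None, the assumed fill value and NaN all count as missing.
        return v is None or v == fill_value or (isinstance(v, float) and math.isnan(v))

    valid = [v for v in values if not is_missing(v)]
    if not valid:
        return ["missing"] * len(values)

    # Median and median absolute deviation are less sensitive to the
    # outliers we are trying to detect than mean and standard deviation.
    median = statistics.median(valid)
    mad = statistics.median([abs(v - median) for v in valid])

    flags = []
    for v in values:
        if is_missing(v):
            flags.append("missing")
        elif mad > 0 and 0.6745 * abs(v - median) / mad > threshold:
            flags.append("outlier")
        else:
            flags.append("ok")
    return flags


temps = [14.1, 14.3, -9999.0, 13.9, 14.0, 55.0, None, 14.2]
print(flag_suspect_values(temps))
```

Flagging rather than deleting keeps the decision about what to do with suspect values (mask, interpolate, report) separate from the detection itself.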

Step 3: Spatial Subset and Resampling

  • If the RCM data covers a larger region than needed, spatially subset it to the specific study area of interest.
  • If your application requires merging RCM data with other datasets of different resolutions, perform spatial resampling or interpolation to align them.
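The spatial subsetting part of Step 3 can be sketched as a bounding-box crop. This assumes a regular latitude/longitude grid stored as plain Python lists, longitudes in a single convention, and a box that does not cross the dateline. The function name `subset_bbox` is hypothetical. Real workflows would normally use an array library instead.

```python
def subset_bbox(lats, lons, field, lat_min, lat_max, lon_min, lon_max):
    """Crop a 2-D field (rows follow lats, columns follow lons) to a box."""
    # Indices of the grid rows and columns that fall inside the box.
    row_idx = [i for i, lat in enumerate(lats) if lat_min <= lat <= lat_max]
    col_idx = [j for j, lon in enumerate(lons) if lon_min <= lon <= lon_max]

    sub_lats = [lats[i] for i in row_idx]
    sub_lons = [lons[j] for j in col_idx]
    sub_field = [[field[i][j] for j in col_idx] for i in row_idx]
    return sub_lats, sub_lons, sub_field


lats = [-40.0, -35.0, -30.0]
lons = [140.0, 145.0, 150.0, 155.0]
field = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
print(subset_bbox(lats, lons, field, -36.0, -29.0, 144.0, 151.0))
```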

Step 4: Temporal Aggregation and Time Synchronization

  • Depending on your application, aggregate the RCM data to the desired temporal resolution (e.g., daily, monthly, seasonal means) to match the timescale required for your analysis.
  • Ensure that all datasets are synchronized and aligned in time, accounting for any time lags or offsets.
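The temporal aggregation in Step 4 could look like the following sketch, which turns daily values into monthly means. The input layout (a list of `(date, value)` pairs with `None` for missing days) and the name `monthly_means` are assumptions chosen for the example.

```python
from collections import defaultdict
from datetime import date


def monthly_means(daily):
    """Aggregate (date, value) pairs into {(year, month): mean value}.

    Days whose value is None are skipped rather than counted as zero.
    """
    buckets = defaultdict(list)
    for day, value in daily:
        if value is not None:
            buckets[(day.year, day.month)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}


daily = [
    (date(2020, 1, 1), 10.0),
    (date(2020, 1, 2), 20.0),
    (date(2020, 2, 1), 5.0),
    (date(2020, 2, 2), None),
]
print(monthly_means(daily))
```

Keying on `(year, month)` rather than month alone keeps different years separate, which matters when the result is later aligned in time with another dataset.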

Step 5: Bias Correction (if needed)

  • Apply bias correction techniques, if necessary, to address any systematic errors or biases in the RCM data compared to observed climate data.
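To make Step 5 concrete, here is a sketch of about the simplest possible bias correction: subtracting the mean model-minus-observation difference over a common historical period. This is an illustrative choice only; the thread does not say which method to use, and methods such as quantile mapping are often preferred in practice. The function name and the sample numbers are made up.

```python
from statistics import fmean


def additive_bias_correct(model_hist, obs_hist, model_target):
    """Shift model_target by the mean bias found in the historical period.

    model_hist and obs_hist must cover the same period. Only the mean is
    corrected: variance and extremes are left untouched.
    """
    bias = fmean(model_hist) - fmean(obs_hist)
    return [value - bias for value in model_target]


# Hypothetical temperatures: the model runs 2 degrees too warm on average.
corrected = additive_bias_correct(
    model_hist=[12.0, 14.0, 16.0],
    obs_hist=[10.0, 12.0, 14.0],
    model_target=[15.0, 17.0],
)
print(corrected)
```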

Step 6: Coordinate Reference System (CRS) Standardization

  • Check the coordinate reference system of the RCM data and convert it to a standardized CRS (e.g., WGS 84) if required for compatibility with other datasets or GIS applications.

Step 7: Metadata Documentation

  • Document all the essential metadata for the RCM data, including model information, downscaling methods, time period, spatial resolution, and any preprocessing steps undertaken.
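One possible way to enforce Step 7 is to refuse to write a metadata record unless the items listed above are filled in. In this sketch the field names in `REQUIRED_FIELDS`, the function `build_metadata` and all sample values are hypothetical. An actual product would more likely follow an established metadata convention.

```python
import json

# Field names mirror the items listed in Step 7; they are illustrative only.
REQUIRED_FIELDS = (
    "model",
    "downscaling_method",
    "time_period",
    "spatial_resolution",
    "processing_steps",
)


def build_metadata(**fields):
    """Return a JSON metadata record, or raise if a required field is empty."""
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError("missing metadata fields: " + ", ".join(missing))
    return json.dumps(fields, indent=2, sort_keys=True)


record = build_metadata(
    model="ExampleRCM v1",  # made-up model name
    downscaling_method="dynamical",
    time_period="1981-2010",
    spatial_resolution="0.11 degree",
    processing_steps=["spatial subset", "monthly means"],
)
print(record)
```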

Step 8: Data Format and Storage

  • Save the processed RCM data in an appropriate format, such as NetCDF, which is commonly used for climate data storage and analysis.

Step 9: Data Validation and Evaluation

  • Perform validation and evaluation exercises to assess the performance and accuracy of the RCM data against observations or benchmark datasets.
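For Step 9, two commonly reported scores are the mean bias and the root-mean-square error (RMSE) of the model against observations. The sketch below assumes the two series are already matched in space and time; the function name and the numbers are invented for illustration.

```python
import math


def validation_scores(model, obs):
    """Return (mean bias, RMSE) of model values against paired observations."""
    if not model or len(model) != len(obs):
        raise ValueError("model and obs must be non-empty and the same length")
    errors = [m - o for m, o in zip(model, obs)]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return bias, rmse


bias, rmse = validation_scores([13.0, 7.0, 13.0, 7.0], [10.0, 10.0, 10.0, 10.0])
print(bias, rmse)
```

The sample data show why both scores are useful: the errors cancel out in the mean bias, while the RMSE still reports that each value is off by 3.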

Step 10: Data Sharing and Dissemination

  • Make the application-ready RCM data available to the scientific community and other stakeholders by sharing it through climate data portals, data repositories, or other relevant platforms.

By following these steps, you will create high-quality, application-ready Regional Climate Model data that is well-prepared for use in various climate research and applications, such as climate impact assessments, climate change projections, or coupled modeling studies.

When I ask it how to create application-ready climate products, I get the following:

Step 1: Define the Climate Product

  • Clearly define the purpose and scope of the climate product. Determine what specific information or insights the product aims to provide and the target audience.

Step 2: Data Collection and Selection

  • Identify the relevant climate data required to create the product. This could include observational data, climate model outputs, reanalysis data, or other relevant sources.
  • Select the appropriate datasets that align with the product's goals and cover the relevant spatial and temporal scales.

Step 3: Data Preprocessing and Quality Control

  • Preprocess the climate data to handle missing values, outliers, and any data inconsistencies.
  • Conduct quality control checks to ensure the data's integrity and reliability.

Step 4: Data Analysis and Derivation

  • Perform data analysis and derive the necessary climate indicators or variables required for the product.
  • Apply relevant statistical or climatological methods to compute meaningful information, such as long-term averages, anomalies, trends, or extreme events.
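As an example of the derivations mentioned in Step 4, anomalies are usually computed as departures from the long-term average of a baseline period. The sketch below assumes annual values stored in a `{year: value}` dictionary; the function name, baseline years and data are illustrative.

```python
from statistics import fmean


def anomalies(series, baseline_years):
    """Return {year: value minus the mean over baseline_years}."""
    baseline = fmean([series[year] for year in baseline_years])
    return {year: value - baseline for year, value in series.items()}


annual_mean_temp = {1991: 14.0, 1992: 15.0, 1993: 16.0, 2020: 17.5}
print(anomalies(annual_mean_temp, baseline_years=[1991, 1992, 1993]))
```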

Step 5: Visualization and Plotting

  • Create clear and visually appealing visualizations that effectively communicate the climate information.
  • Use appropriate maps, graphs, and charts to highlight key patterns, trends, and variations in the data.

Step 6: Product Packaging and Formatting

  • Organize the climate product in a well-structured format that is easy for users to understand and navigate.
  • Ensure that the product includes clear titles, legends, units, and metadata to provide context and interpretability.

Step 7: User Interface (if applicable)

  • If the climate product is interactive or web-based, design a user-friendly interface that allows users to explore and interact with the data easily.
  • Consider providing user-guides or tooltips to assist users in interpreting the information.

Step 8: Documentation and User Support

  • Prepare comprehensive documentation that explains the methodology, data sources, processing steps, and limitations of the climate product.
  • Offer user support, such as FAQs or contact information, to help users interpret and utilize the product effectively.

Step 9: Validation and Sensitivity Analysis

  • Validate the climate product by comparing it with other reliable datasets or observations.
  • Conduct sensitivity analyses to assess the product's robustness and reliability under different scenarios.

Step 10: Data Distribution and Sharing

  • Make the application-ready climate product available to the intended audience through online platforms, data portals, or publication in scientific journals.
  • Consider licensing and copyright issues if applicable.

By following these steps, you will create high-quality, application-ready climate products that provide valuable insights and information to users, facilitating their understanding of climate patterns, trends, and variations.

AliciaTak commented 11 months ago

@paolap Here are slightly different steps that I get when I reword the question (what to do with raw RCM outputs?). For example, steps 9, 10 and 12 were not included in the previous prompts but are important, especially step 12.

Step 1: Data Collection and Preprocessing

Step 2: Spatial and Temporal Subset

Step 3: Data Bias Correction (if needed)

Step 4: Calculate Climate Indices

Step 5: Data Aggregation and Averaging

Step 6: Spatial Interpolation (if needed)

Step 7: Data Visualization and Analysis

Step 8: Quality Control and Validation

Step 9: Documentation

Step 10: Create User-Friendly Climate Products

Step 11: Data Distribution and Sharing

Step 12: Continual Update and Improvement

paolap commented 11 months ago

While this is great, for the moment I would rather just add a generic scope of the guidelines. This is very useful, but it is a very specific use case, so I'm hesitant to write it down as it is. What I could do is pick the generic steps as points to cover, for example:

These guidelines should cover:

Leaving out completely things like "Step 4: Calculate Climate Indices" and the "sub-steps" as they're specific to that example.

AliciaTak commented 11 months ago

Yeah, absolutely @paolap! I just thought I'd share the prompt about the definition of "application-ready" data and what it believes this involves. I think we also need to be mindful of copyright issues, as it is not clear who owns the prompts :)

paolap commented 11 months ago

Great, it's still good to have the longer and detailed list here for later on. And yes I will add copyright/license etc...

paolap commented 11 months ago

Ok, I've had a go at writing the intro and will open a pull request soon. I re-organised the steps into 4 broader categories, let me know what you think.

paolap commented 9 months ago

Just realised we didn't add these guidelines to the intro page.

paolap commented 1 month ago

I finally added something about this to the introduction page. I also pointed out that this is a work in progress and that we welcome input from others who have some experience or ideas. It is in branch paola_isssues.

AliciaTak commented 1 month ago

Thanks, Paola!