cyber2a / cyber2a-course

Online materials for the Cyber2A course on AI for Arctic research
https://cyber2a.github.io/cyber2a-course/
Apache License 2.0

Lesson - Data annotations for deep learning #9

Closed carmengg closed 3 months ago

carmengg commented 8 months ago

Data annotations for deep learning

Goal

To provide participants with a comprehensive understanding of the importance of training data, methods to obtain it, tools for annotation, and potential data sources.

Breakdown

  1. Introduction to Training Data
    • What is training data and why is it crucial?
    • Differences between labeled and unlabeled data
  2. The Importance of Quality Annotations
    • How annotations impact model performance
    • Common challenges: Inconsistent annotations, class imbalance, etc.
    • Strategies to ensure high-quality annotations: Guidelines, multiple annotators, quality checks
  3. Methods to Obtain Training Data
    • Creating your own dataset: Pros, cons, and considerations
    • Using pre-existing datasets: Benefits and potential pitfalls
    • Data augmentation: Expanding dataset size and diversity
    • Transfer learning and pre-trained models: Leveraging external knowledge
  4. Annotation Tools
    • Overview of popular annotation tools: Labelbox, VGG Image Annotator (VIA), RectLabel, etc.
    • Features to consider: Collaboration, format export options, automation capabilities
    • Hands-on demo: Annotating a sample image using a chosen tool
  5. Data Sources for Retrogressive Thaw Slumps (RTS) and Arctic Science (15 minutes)
    • Public datasets relevant to Arctic science and RTS
    • Collaborative efforts and data-sharing initiatives in the research community
    • Ethical considerations: see lesson 12
  6. Q&A and Discussion
    • Encouraging sharing of personal experiences or challenges with data annotation
    • Discussing potential future developments in annotation tools and techniques
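
One possible quality check for the "multiple annotators" strategy in item 2 is to measure how often two annotators agree beyond what chance would produce. The sketch below computes Cohen's kappa in plain Python. It is only an illustration, not part of the lesson materials: the function name `cohen_kappa` and the example labels ("rts", "background") are hypothetical, and it assumes each annotator assigned exactly one class label to each of the same items.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items.

    labels_a and labels_b are equal-length sequences of class labels,
    where position i in both sequences refers to the same item.
    """
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("Both annotators must label the same non-empty set of items")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability that both pick the same class if each
    # annotator labeled at random according to their own class frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    # Both annotators used one identical class for everything:
    # agreement is perfect and kappa is otherwise undefined (0 / 0).
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for six image tiles from two annotators.
annotator_1 = ["rts", "rts", "background", "background", "rts", "background"]
annotator_2 = ["rts", "background", "background", "background", "rts", "background"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(f"Cohen's kappa: {kappa:.3f}")
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance. A low value may suggest that the annotation guidelines need clarification before more data is labeled.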