ebi-ait / hca-ebi-wrangler-central

This repo is for tracking work related to wrangling datasets for the HCA, associated tasks and for maintaining related documentation.
https://ebi-ait.github.io/hca-ebi-wrangler-central/
Apache License 2.0

Onboarding tasks for Anu #976

Closed by ESapenaVentura 1 year ago

ESapenaVentura commented 2 years ago

Tasks

Welcome to the HCA DCP! We are pleased to introduce you to the data wrangling team. Before you start working independently, you will be asked to complete a set of tasks and attend a series of meetings. These will guide you through both the general and the specific aspects of your role, and they will help us make sure you understand how we work in the team. Your introductory tasks are:

General tasks

These are more wrangling-specific tasks. They will give you the basic knowledge you need for your future work in the team. These include:

  1. Watch HCA Metadata Standard 101 video (slides).
  2. Familiarise yourself with the projects and Standard Operating Procedures (SOPs) inside the hca-ebi-wrangler-central repository.
  3. Read up on the FAIR Data Principles.
  4. Add or update tickets in the wrangling or metadata-schema repo to reflect the work being done.

Specific role tasks

These tasks will give you an insight into your role and provide you with the appropriate tools to do your job. As your first dataset, you have been assigned to wrangle the publication "Single-Cell RNA Sequencing in Multiple Pathologic Types of Renal Cell Carcinoma Revealed Novel Potential Tumor-Specific Markers". To familiarise yourself with the wrangling process, your tasks regarding this dataset are:

Onboarding meetings

These meetings will help you understand some aspects of the work here, as well as our work policy and important details about the technologies we use. They will be short (around 30 minutes), and you should try to attend at least one per day. You are expected to book each meeting with the person who gives it.

  1. Introduction to the DCP - @gabsie
  2. AIT organisation: team days, overall routine - @gabsie
  3. Introduction to agile: the manifesto and basic concepts. Technical Lead (TL), Product Owner (PO), sprint, tickets and story points. Zenhub walk-through and standup format - @amnonkhen
  4. Computational environments: DCP's GitHub workflow, recommended software (PyCharm, Sublime, Google sync, notes, Postman) - @ESapenaVentura
  5. Developmental environments: Development, Staging, Production - @amnonkhen
  6. Metadata repository 1: Entity structure: core, type and module. Graphs as a tool to represent a project - @ESapenaVentura
  7. Metadata repository 2: Introduction to JSON, schema evolution and versioning - @ESapenaVentura
  8. Overview of metadata schema updates - @ESapenaVentura
  9. Technologies: 10x, SMART-Seq2, Imaging transcriptomics, CITE-Seq, ATAC-Seq, 10x Visium - TBD
  10. Intro to wrangling for downstream portals - @Wkt8
  11. Wrangling repository: tools, tickets, Markdown documents and processes. Graph validator - @ESapenaVentura
  12. The spreadsheets and the contributor form: general tour and how to generate one - @ESapenaVentura
  13. Linked knowledge: quick introduction to ontologies - TBD
  14. Components: DCP and organisation - @gabsie
  15. Ingest component: overview of ingest infrastructure - Dev TBD

Acceptance criteria

anu-shiva commented 1 year ago

Notes