cogent3 / Cogent3Workshop

Materials for the Phylomania workshop
BSD 3-Clause "New" or "Revised" License

Learning outcomes #3

Open GavinHuttley opened 11 months ago

GavinHuttley commented 11 months ago

Know

Technology

- computer setup page(s) (participants to do it before attending)
- installing / updating cogent3 (PyPI or GitHub). (Advice from Peter M was to do it via conda, but this won't work on macOS if users don't have Homebrew + Xcode tools.) @ pre-meeting
- intro to PyPI and GitHub
- intro to pip
- intro to conda
- explanation of virtual environments (what and why)
- how to ask for help (GitHub Discussions)
- raise issues, contribute (Issues, c3dev)
- backup: get JupyterHub working

LO - Understanding experimental design issues

What they need to consider in choosing sequences for study.

- reproducible computation -- `scitrack`
- different sequence types and relevance for experimental design
- different sequence relationship types

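To make the reproducible-computation item concrete, here is a minimal standard-library sketch of the kind of run record a logging tool can keep: interpreter version, platform, a checksum of the input data and the arguments used. It is not the `scitrack` API; `run_record`, `md5_of_bytes` and the demo file name are hypothetical names chosen for illustration.

```python
import hashlib
import json
import platform
import sys


def md5_of_bytes(data: bytes) -> str:
    """Return the hex MD5 digest of the data, used to fingerprint an input."""
    return hashlib.md5(data).hexdigest()


def run_record(input_name: str, data: bytes, args: dict) -> str:
    """Build a JSON record describing one analysis run (hypothetical helper)."""
    record = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "input": input_name,
        "input_md5": md5_of_bytes(data),
        "args": args,
    }
    # sort_keys makes the record layout stable between runs
    return json.dumps(record, sort_keys=True)


# "demo.fasta" and its content are made up for the example
print(run_record("demo.fasta", b">s1\nACGT\n", {"min_length": 300}))
```

If the checksum stored with a result no longer matches the input file, the result was produced from different data.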
LO - Getting data

- published GenBank IDs (e.g. REFSOIL)
- published (already aligned) data set (Duchene et al example)
- Ensembl downloading and installing

LO - Sampling Ensembl

- downloading
- installing
- data summaries

LO - Identifying and dealing with data issues

- inconsistent meta-data (data wrangling REFSOIL GenBank files)
- demonstrate using `annotation_db`
- explore using dotplots
- file format issues
- Duchene phylip formats, solving using `bad_phylip` app
- extremely long fasta sequence labels (e.g. making sure you can collate genomes from one species)

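The long-label problem can be illustrated without any library. The sketch below is plain Python, not the cogent3 API, and assumes that the first whitespace-delimited token of each FASTA label (typically an accession) is a usable short name. The function names and the demo records are hypothetical.

```python
def parse_fasta(text):
    """Yield (label, sequence) pairs from FASTA-formatted text."""
    label, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if label is not None:
                yield label, "".join(chunks)
            label, chunks = line[1:], []
        else:
            chunks.append(line)
    if label is not None:
        yield label, "".join(chunks)


def short_label(label):
    """Keep only the first whitespace-delimited token of a label."""
    tokens = label.split()
    if not tokens:
        raise ValueError("empty FASTA label")
    return tokens[0]


def relabel(text):
    """Return {short name: sequence}, refusing to silently merge records."""
    seqs = {}
    for label, seq in parse_fasta(text):
        name = short_label(label)
        if name in seqs:
            raise ValueError(f"duplicate label after shortening: {name}")
        seqs[name] = seq
    return seqs


# made-up records with deliberately long labels
demo = (
    ">NC_000001.1 Hypothetical organism strain X chromosome, complete genome\n"
    "ACGT\nGGCC\n"
    ">NC_000002.1 Hypothetical organism strain X plasmid pA\n"
    "TTAA\n"
)
print(relabel(demo))
```

Raising on duplicates matters here: shortening can collapse distinct records onto one name, which would otherwise lose a sequence without warning.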
LO - Sampling sequence classes (Ensembl)

- sampling homologous sequences
- sampling alignments

LO - Alignments

- using cogent3
- quantifying alignment quality
- visualisation

LO - Sampling alignments

- selecting by length
- codon positions
- consistent species presence

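The three sampling criteria above can be sketched in plain Python. This is an illustration only, not the cogent3 API: an alignment is modelled as a dict mapping species name to an aligned sequence string, sequences are assumed to be in frame starting at codon position 1, and all function names and data are hypothetical.

```python
def long_enough(aln, min_length):
    """True if the alignment has at least min_length columns."""
    if not aln:
        return False
    return len(next(iter(aln.values()))) >= min_length


def has_species(aln, required):
    """True if every required species is present in the alignment."""
    return set(required) <= set(aln)


def codon_position(aln, position):
    """Return only the columns at codon position 1, 2 or 3."""
    if position not in (1, 2, 3):
        raise ValueError("position must be 1, 2 or 3")
    return {name: seq[position - 1 :: 3] for name, seq in aln.items()}


def sample(alignments, min_length, required):
    """Keep alignments that pass both the length and species filters."""
    return {
        gene: aln
        for gene, aln in alignments.items()
        if has_species(aln, required) and long_enough(aln, min_length)
    }


# made-up toy alignments
alns = {
    "geneA": {"human": "ATGGCCAAA", "mouse": "ATGGCTAAA", "dog": "ATGGCAAAA"},
    "geneB": {"human": "ATGAAA", "mouse": "ATGAAG"},  # no dog
    "geneC": {"human": "ATG", "mouse": "ATG", "dog": "ATG"},  # too short
}
kept = sample(alns, min_length=6, required=("human", "mouse", "dog"))
third = {gene: codon_position(aln, 3) for gene, aln in kept.items()}
print(sorted(kept), third)
```

Filtering for species presence before slicing codon positions keeps the retained data sets comparable across genes.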
LO - Unsolved / important problems

- alignment quality scores!
- pair and multiple

GavinHuttley commented 11 months ago

Notes

We don't want students competing with each other (bandwidth-wise) on a wifi network to download large volumes of data. So we will need example "download" configs that allow them to download a small amount of data. We will also need already downloaded larger data sets, and already "installed" larger data sets, which they can grab. (Noting here that the "installed" data sets are much smaller than the original downloads.)

GavinHuttley commented 11 months ago

We could reframe this as:

khiron commented 11 months ago

Technology items all transferred to individual issues and assigned to @khiron

#13, #14, #15, #16, #17, #18, #19, #20, #21