Bioconductor / BSgenome

Software infrastructure for efficient representation of full genomes and their SNPs
https://bioconductor.org/packages/BSgenome
9 stars 8 forks source link

Proposed task for Outreachy applicants: Forge BSgenome data package for UCSC genome gorGor6 #44

Open hpages opened 2 years ago

hpages commented 2 years ago

This task depends on this issue being completed first (i.e. PR accepted and merged, and issue closed). Although it's not a requirement that the 2 tasks be completed by the same applicant, it will be a more interesting learning experience if they are.

BSgenome data packages are one of the many types of annotation packages available in Bioconductor. They contain the genomic sequences, which comprise chromosome sequences and other DNA sequences, of a particular genome assembly for a given organism. For example BSgenome.Hsapiens.UCSC.hg19 is a BSgenome data package that contains the genomic sequences of the hg19 genome from UCSC. Users can easily and efficiently access the sequences, or portions of the sequences, stored in these packages, via a common API implemented in the BSgenome software package.

This task's goal is to make a new BSgenome data package for UCSC genome gorGor6. The process of making such package is documented in the "How to forge a BSgenome data package" vignette from the BSgenome software package. The landing page for the BSgenome package contains a link to this vignette.

Other useful links:

IMPORTANT NOTES TO OUTREACHY APPLICANTS: