Closed by hpages 1 year ago
I have completed all the preliminary tasks
Great!
I just assigned you to this issue. Do not hesitate to ask questions, I'll do my best to help.
Hi @hpages, can I work on this issue once I'm done with this?
@kakopo PR #88 merged, thanks!
`gorGor6` is the latest UCSC genome for Gorilla (Gorilla gorilla gorilla). See "List of UCSC genome releases" at https://genome.ucsc.edu/FAQ/FAQreleases.html for all the genomes currently supported by UCSC.

Also check out the "Genome Browser Gateway" page, which is the main entrance to the "UCSC Genome Browser". Find Gorilla in the UCSC species tree on the left, click on it, then make sure to select the latest Gorilla assembly (`gorGor6`). This will display a bunch of additional information about the `gorGor6` assembly.

Note that many UCSC genomes are already registered in the GenomeInfoDb package (83 as of October 2022). The `registered_UCSC_genomes()` function in GenomeInfoDb returns the list of all the UCSC genomes that are currently registered in the package. An important thing to be aware of is that `getChromInfoFromUCSC()` still works on an unregistered genome, but in "degraded" mode, that is:

- the `assembled.molecules.only` argument is ignored,
- the `assembled` and `circular` columns of the returned data.frame are filled with `NA`s.

Registering a genome fixes that. In other words, once a genome is registered in GenomeInfoDb, the information returned by `getChromInfoFromUCSC()` for that genome is guaranteed to be complete and accurate.

See `?getChromInfoFromUCSC` (after loading GenomeInfoDb) for more information.

Registering a new UCSC genome is only a matter of adding a new file, called a "registration file", to
`GenomeInfoDb/inst/registered/UCSC_genomes/`. Note that the folder contains a `README.TXT` file that provides some brief information about what a "registration file" should contain (unfortunately the registration process is not fully documented).

For `gorGor6`, since this is the first `gorGor` genome that we're going to register in GenomeInfoDb, we need to start the `gorGor6.R` file from scratch. However, looking at other registration files to get a feel for how things are done is always a good idea. Don't bother with the `NCBI_LINKER` component for now. We'll add it later, once the corresponding NCBI assembly (`Kamilah_GGO_v0`) is also registered (registering `Kamilah_GGO_v0` is the topic of issue #61).

IMPORTANT NOTES TO OUTREACHY APPLICANTS:
- Make sure to run `R CMD build` and `R CMD check` on the package. Note that `R CMD check` should always be run on the source tarball produced by `R CMD build`.
- `R CMD check` might produce some NOTEs and even some WARNINGs. These are ok if they existed before your changes. You can check that by taking a look at the daily report produced by our automated builds here: https://bioconductor.org/checkResults/devel/bioc-LATEST/ Make sure not to introduce new NOTEs or WARNINGs!
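
The "degraded" mode described earlier in this issue (`assembled` and `circular` columns filled with `NA`s) can be checked from R once a registration file has been added and the package reinstalled. The sketch below is only an illustration: `is_degraded()` and `check_registration()` are hypothetical helpers, not part of GenomeInfoDb, and the sketch assumes that the data.frame returned by `registered_UCSC_genomes()` has a `genome` column. The mock rows are made up and are not real `gorGor6` data.

```r
# Returns TRUE when chromosome info looks like "degraded" mode output,
# i.e. the 'assembled' and 'circular' columns contain only NAs.
is_degraded <- function(chrominfo) {
    stopifnot(is.data.frame(chrominfo),
              all(c("assembled", "circular") %in% colnames(chrominfo)))
    all(is.na(chrominfo$assembled)) && all(is.na(chrominfo$circular))
}

# Hypothetical helper (not part of GenomeInfoDb): reports whether a UCSC
# genome is registered and whether its chromosome info is degraded.
# Calling it requires GenomeInfoDb and internet access.
check_registration <- function(genome) {
    # Assumption: registered_UCSC_genomes() has a 'genome' column.
    registered <- genome %in% GenomeInfoDb::registered_UCSC_genomes()$genome
    chrominfo <- GenomeInfoDb::getChromInfoFromUCSC(genome)
    list(registered = registered, degraded = is_degraded(chrominfo))
}

# Offline illustration with made-up rows (not real gorGor6 data):
mock <- data.frame(chrom = c("chrA", "chrB"), size = c(100L, 50L),
                   assembled = NA, circular = NA)
is_degraded(mock)
```

With a version of GenomeInfoDb that includes `gorGor6.R` installed, `check_registration("gorGor6")` would be expected to report that the genome is registered and that the returned information is no longer degraded.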