Closed hpages closed 1 year ago
Hi @hpages, please can I work on this issue?
Hi @Priceless-P I just assigned you. It's all yours now!
@hpages I think I have been able to make it work! I finally got `forgeBSgenomeDataPkg()`, `R CMD build`, `R CMD check`, and `R CMD INSTALL` to all run with no errors. Apart from a few errors I encountered earlier, for most of which I found solutions on the Bioconductor support site, this vignette you wrote was everything I needed. I also had to look at other seed files to further deepen my understanding.
So far, everything seems okay to me, but I have just one little question. I created the project on my local machine. The location is `/Users/prisca/Desktop/`. The folders generated are `BSgenome.Cfamiliaris.UCSC.canFam6/` and `canFam6/` (where the FASTA file for each chromosome is located). The files are `canFam6-seed`, which is the seed file, and the tarball file usually generated by `R CMD build`.
My question is: what folders and files should I upload to my fork of the repo, and in what location? I'm guessing I need to upload `BSgenome.Cfamiliaris.UCSC.canFam6/` and `canFam6-seed` to `BSgenome/inst/extdata/GentlemanLab/`, but I'm not sure.
Hi @Priceless-P,
Sounds like you did a really good job at digging around and finding all the information you needed. Congrats!
If you are confident that your BSgenome data package works as expected, please add your seed file to the `inst/extdata/Outreachy/` folder of the BSgenome software package. You'll need to fork the BSgenome repository for that, then add the seed file, commit, push, and submit a PR. (I just edited the IMPORTANT NOTES TO OUTREACHY APPLICANTS above to add these steps.)
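For reference, the fork, add, commit, push steps described above could look roughly like the following shell sketch. It is only an illustration: the function names, the clone path, and the branch name are hypothetical, it assumes you have already forked BSgenome on GitHub and cloned your fork so that `origin` points at it, and the PR itself is still opened from the GitHub web page.

```shell
#!/bin/sh
# Sketch only: function names, clone path, and branch name are
# hypothetical; "origin" is assumed to be your fork of BSgenome.

# Copy a seed file into the Outreachy folder of a local BSgenome clone.
stage_seed_file() {
    seed=$1     # path to the seed file, e.g. canFam6-seed
    clone=$2    # path to the local clone of your fork
    dest="$clone/inst/extdata/Outreachy"
    mkdir -p "$dest" && cp "$seed" "$dest/"
}

# Stage the seed file, then commit and push it on a topic branch.
commit_and_push_seed() {
    seed=$1
    clone=$2
    branch=$3   # hypothetical branch name, e.g. add-canFam6-seed
    stage_seed_file "$seed" "$clone" || return 1
    (
        cd "$clone" || exit 1
        git checkout -b "$branch" &&
        git add "inst/extdata/Outreachy/$(basename "$seed")" &&
        git commit -m "Add $(basename "$seed")" &&
        git push -u origin "$branch"
    )
}

# Example call (paths are assumptions):
#   commit_and_push_seed canFam6-seed ~/BSgenome add-canFam6-seed
```

Only the seed file is copied here, since that is the one file the steps above ask for.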
Thanks, H.
PS: Surprise!! https://bioconductor.org/packages/devel/GenomeInfoDb 🥳
> PS: Surprise!! https://bioconductor.org/packages/devel/GenomeInfoDb 🥳

Wow!!! I'm so honored 💃 I'm putting this up on my LinkedIn! Thank you so much @hpages
> Sounds like you did a really good job at digging around and finding all the information you needed. Congrats!

Thank you, it wasn't so hard. Thanks to you, the support is amazing!
> If you are confident that your BSgenome data package works as expected, please add your seed file to the inst/extdata/Outreachy/ folder of the BSgenome software package. You'll need to fork the BSgenome repository for that, then add the seed file, commit, push, and submit a PR. (I just edited the IMPORTANT NOTES TO OUTREACHY APPLICANTS above to add these steps.)

Okay, I just created a pull request. Please take a look.
Hi @Priceless-P,
I just merged PR #46.
Don't miss my long-overdue explanation about `PkgExamples`: https://github.com/Bioconductor/BSgenome/pull/46#issuecomment-1291424086 Don't hesitate to ask if you have any questions.
Next task in your group is #39. It's still about Dog! 🐕 Whenever you are ready, go there and ask to be assigned.
Don't forget to record your contributions on Outreachy at https://www.outreachy.org/outreachy-december-2022-internship-round/communities/bioconductor/refactor-the-bsgenomeforge-tools/contributions/.
Sure.
Thanks @hpages
This task depends on this issue being completed first (i.e. PR accepted and merged, and issue closed). Although it's not a requirement that the 2 tasks be completed by the same applicant, it will be a more interesting learning experience if they are.
BSgenome data packages are one of the many types of annotation packages available in Bioconductor. They contain the genomic sequences, which comprise chromosome sequences and other DNA sequences, of a particular genome assembly for a given organism. For example, BSgenome.Hsapiens.UCSC.hg19 is a BSgenome data package that contains the genomic sequences of the `hg19` genome from UCSC. Users can easily and efficiently access the sequences, or portions of the sequences, stored in these packages, via a common API implemented in the BSgenome software package.
This task's goal is to make a new BSgenome data package for UCSC genome `canFam6`. The process of making such a package is documented in the "How to forge a BSgenome data package" vignette from the BSgenome software package. The landing page for the BSgenome package contains a link to this vignette.
Other useful links:
- BSgenome issues on GitHub, where many Bioconductor users who went through the process of forging a BSgenome data package have already asked questions about this process. Note that most issues where those questions have been asked (and answered) are now closed, so do not exclude closed issues from your research.
- Some users have also asked questions about this process on the Bioconductor support site. See questions tagged with "BSgenome" there: https://support.bioconductor.org/tag/bsgenome/
- The BSgenome.Hsapiens.UCSC.hg19 landing page: https://bioconductor.org/packages/BSgenome.Hsapiens.UCSC.hg19
- List of BSgenome data packages available in Bioconductor: https://bioconductor.org/packages/release/BiocViews.html#___BSgenome
- See https://github.com/Bioconductor/BSgenomeForge for more information about BSgenome data packages and additional links.
IMPORTANT NOTES TO OUTREACHY APPLICANTS:
- Run `R CMD build` and `R CMD check` on the package. Note that `R CMD check` should always be run on the source tarball produced by `R CMD build`. `R CMD check` might produce some NOTEs and even some WARNINGs. Let me know if that's the case and we'll discuss them.
- Add your seed file to the `inst/extdata/Outreachy/` folder of the BSgenome software package. You'll need to fork the BSgenome repository for that, then add the seed file, commit, push, and submit a PR.
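To illustrate the point about running `R CMD check` on the tarball rather than on the package folder, here is a minimal shell sketch. It assumes R is on the PATH and that the package source directory has already been created by `forgeBSgenomeDataPkg()`; the helper names `tarball_name` and `build_check_install` are made up for this example. It relies on `R CMD build` writing `<Package>_<Version>.tar.gz` into the current working directory.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical. Assumes R is installed
# and the package source directory already exists.

# Work out the tarball name that R CMD build produces,
# <Package>_<Version>.tar.gz, from the package's DESCRIPTION file.
tarball_name() {
    pkgdir=$1
    pkg=$(sed -n 's/^Package:[[:space:]]*//p' "$pkgdir/DESCRIPTION")
    ver=$(sed -n 's/^Version:[[:space:]]*//p' "$pkgdir/DESCRIPTION")
    printf '%s_%s.tar.gz\n' "$pkg" "$ver"
}

# Build the source tarball, then check and install the tarball
# (not the source directory). Stops at the first failing step.
build_check_install() {
    pkgdir=$1
    tarball=$(tarball_name "$pkgdir")
    R CMD build "$pkgdir" &&
    R CMD check "$tarball" &&
    R CMD INSTALL "$tarball"
}

# Example call, run from the directory containing the package folder:
#   build_check_install BSgenome.Cfamiliaris.UCSC.canFam6
```

Any NOTEs or WARNINGs end up in the `.Rcheck` directory that `R CMD check` creates next to the tarball, which is a convenient place to copy them from when reporting back.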