Bioconductor / Contributions

Contribute Packages to Bioconductor

muleaData ExperimentHubData Bioconductor package #3291

Closed barizona closed 3 months ago

barizona commented 4 months ago

Dear BioC Team,

Here we provide the muleaData ExperimentHubData Bioconductor package for the mulea R package (ELTEbioinformatics/mulea on GitHub). mulea is a comprehensive overrepresentation and functional enrichment analyser R package that reads ontologies (gene and protein sets) in the standardised GMT (Gene Matrix Transposed) format. We provide these GMT files for 27 different model organisms, ranging from Escherichia coli to human, all acquired from publicly available data sources. The GMT files are provided with multiple gene and protein identifiers, such as UniProt protein IDs, Entrez, Gene Symbol, and Ensembl gene IDs. The GMT files and the scripts we used to create them are available in the GMT_files_for_mulea repository (ELTEbioinformatics/GMT_files_for_mulea on GitHub).

For muleaData, we read these GMT files with the mulea::read_gmt() function and saved them as .rds files with the standard R saveRDS() function. I received the SAS token from Lori Shepherd and uploaded the .rds files to the Azure Data Lake. She wrote that the data has now been added:

> query(eh, "muleaData")
ExperimentHub with 879 records
# snapshotDate(): 2024-02-07
# $dataprovider: muleaData
# $species: Drosophila melanogaster, Homo sapiens, Mus musculus, Caenorhabdi...
# $rdataclass: data.frame
# additional mcols(): taxonomyid, genome, description,
#   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
#   rdatapath, sourceurl, sourcetype 
# retrieve records with, e.g., 'object[["EH8571"]]' 

           title                                                              
  EH8571 | Genomic_location_Ensembl_Arabidopsis_thaliana_10genes_EnsemblID.rds
  EH8572 | Genomic_location_Ensembl_Arabidopsis_thaliana_10genes_EntrezID.rds 
  EH8573 | Genomic_location_Ensembl_Arabidopsis_thaliana_10genes_GeneSymbol...
  EH8574 | Genomic_location_Ensembl_Arabidopsis_thaliana_10genes_UniprotID.rds
  EH8575 | Genomic_location_Ensembl_Arabidopsis_thaliana_20genes_EnsemblID.rds
  ...      ...                                                                
  EH9445 | Genomic_location_Ensembl_Zea_mays_5genes_UniprotID.rds             
  EH9446 | Protein_domain_PFAM_Zea_mays_EnsemblID.rds                         
  EH9447 | Protein_domain_PFAM_Zea_mays_EntrezID.rds                          
  EH9448 | Protein_domain_PFAM_Zea_mays_GeneSymbol.rds                        
  EH9449 | Protein_domain_PFAM_Zea_mays_UniprotID.rds         
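
For readers unfamiliar with the GMT-to-.rds step described above, here is a minimal base R sketch of the idea. It is not the code used to build muleaData: `read_gmt_sketch()` is a hypothetical stand-in for mulea::read_gmt(), the column names are assumptions chosen for illustration, and the toy gene set file is invented.

```r
# Hypothetical stand-in for mulea::read_gmt(); the real function may differ.
# A GMT line is: set ID <tab> description <tab> gene1 <tab> gene2 ...
read_gmt_sketch <- function(path) {
  lines <- readLines(path)
  lines <- lines[nzchar(lines)]                 # drop empty lines
  fields <- strsplit(lines, "\t", fixed = TRUE)
  gmt <- data.frame(
    ontology_id   = vapply(fields, function(x) x[1], character(1)),
    ontology_name = vapply(fields, function(x) x[2], character(1)),
    stringsAsFactors = FALSE
  )
  # Everything after the first two fields is the gene list (a list column).
  gmt$list_of_values <- lapply(fields, function(x) x[-(1:2)])
  gmt
}

# Toy GMT file with invented set and gene names.
gmt_path <- tempfile(fileext = ".gmt")
writeLines(
  c("SET_1\tfirst toy set\tgeneA\tgeneB\tgeneC",
    "SET_2\tsecond toy set\tgeneB\tgeneD"),
  gmt_path
)

gmt <- read_gmt_sketch(gmt_path)

# Serialise the data.frame to .rds and read it back.
rds_path <- tempfile(fileext = ".rds")
saveRDS(gmt, rds_path)
restored <- readRDS(rds_path)
```

The round trip through saveRDS() and readRDS() keeps the list column intact, which is presumably why a data.frame in an .rds file is a convenient ExperimentHub resource here.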

Best wishes, Eszter Ari

For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.

bioc-issue-bot commented 4 months ago

Hi @barizona

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: muleaData
Title: Genes Sets for Functional Enrichment Analysis with the 'mulea' R Package
Version: 0.99.0
Date: 2023-11-20
Authors@R: c(
    person("Eszter", "Ari", email = "arieszter@gmail.com", role = c("aut", "cre"),
           comment = c(ORCID = "0000-0001-7774-1067")),
    person("Márton", "Ölbei", role = c("aut"),
           comment = c(ORCID = "0000-0002-4903-6237")),
    person("Lejla", "Gul", role = c("aut")),
    person("Balázs", "Bohár", role = c("aut"),
           comment = c(ORCID = "0000-0002-3033-5448")))
Description: ExperimentHubData package for the 'mulea' comprehensive overrepresentation and functional enrichment analyser R package. Here we provide ontologies (gene sets) in a data.frame for 27 different organisms, ranging from Escherichia coli to human, all acquired from publicly available data sources. Each ontology is provided with multiple gene and protein identifiers.
License: MIT + file LICENSE
URL: https://github.com/ELTEbioinformatics/muleaData
BugReports: https://support.bioconductor.org/tag/muleaData
biocViews: ExperimentData, ExperimentHub, Arabidopsis_thaliana_Data, Bacillus_subtilis_Data, Caenorhabditis_elegans_Data, Danio_rerio_Data, Drosophila_melanogaster_Data, Escherichia_coli_Data, Homo_sapiens_Data, Pan_troglodytes_Data, Pseudomonas_aeruginosa_Data, Rattus_norvegicus_Data, Saccharomyces_cerevisiae_Data, Staphylococcus_aureus_Data, ChIPSeqData, DNASeqData, ExpressionData, miRNAData
BiocType: ExperimentData
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3
Suggests: 
    knitr,
    rmarkdown
VignetteBuilder: knitr

bioc-issue-bot commented 4 months ago

Your package has been added to git.bioconductor.org to continue the pre-review process. A build report will be posted shortly. Please fix any ERROR and WARNING in the build report before a reviewer is assigned or provide a justification on why you feel the ERROR or WARNING should be granted an exception.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. All changes should be pushed to git.bioconductor.org moving forward. It is required to push a version bump to git.bioconductor.org to trigger a new build report.

Bioconductor utilized your GitHub SSH keys for git.bioconductor.org access. To manage keys and future access, you may want to activate your Bioconductor Git Credentials Account.

bioc-issue-bot commented 4 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder:

Linux (Ubuntu 22.04.3 LTS): muleaData_0.99.0.tar.gz
macOS 12.7.1 Monterey: muleaData_0.99.0.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/muleaData to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 4 months ago

A reviewer has been assigned to your package for an in-depth review. Please respond accordingly to any further comments from the reviewer.

lshep commented 4 months ago

An in-depth review will occur once the package builds without ERROR. Please let me know if you have any questions or concerns.

bioc-issue-bot commented 4 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: d12801adc616c541f331803d85b9e5858b97d810

bioc-issue-bot commented 4 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder:

macOS 12.7.1 Monterey: muleaData_0.99.1.tar.gz
Linux (Ubuntu 22.04.3 LTS): muleaData_0.99.1.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/muleaData to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

barizona commented 4 months ago

Dear @lshep and @bioc-issue-bot, @stitam has fixed the issue and I pushed the new version to git@git.bioconductor.org:packages/muleaData. These changes were made:

  1. There was a man file for muleaData() but no function with that name, so the man file was deleted.
  2. LICENCE.md is unnecessary and may be confusing. The licence is already set in the DESCRIPTION file, so LICENCE.md was deleted.
  3. Started tracking .Rbuildignore in git and added README.Rmd to it, since the .Rmd version should not be part of the build.
  4. I made a minor change in the README.

I used the following git commands:

git remote add upstream git@git.bioconductor.org:packages/muleaData.git
git fetch --all
git checkout main
git merge origin/main
git checkout devel
git merge main
git push upstream devel
git checkout main
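
As the bot notes earlier in this thread, a push only triggers a new build report if it includes a version bump (here 0.99.0 became 0.99.1). The sketch below shows one way that bump could be done in base R before committing. `bump_patch_version()` is a hypothetical helper written for this illustration, not part of any Bioconductor tooling, and the DESCRIPTION file is a throwaway copy with only two fields.

```r
# Hypothetical helper: increment the last component of the Version field
# in a DESCRIPTION file, leaving all other lines untouched.
bump_patch_version <- function(description_path) {
  lines <- readLines(description_path)
  idx <- grep("^Version:", lines)
  stopifnot(length(idx) == 1)                   # expect exactly one Version field

  old_version <- trimws(sub("^Version:", "", lines[idx]))
  parts <- as.integer(strsplit(old_version, ".", fixed = TRUE)[[1]])
  parts[length(parts)] <- parts[length(parts)] + 1L
  new_version <- paste(parts, collapse = ".")

  lines[idx] <- paste("Version:", new_version)
  writeLines(lines, description_path)
  invisible(new_version)
}

# Throwaway DESCRIPTION for demonstration only.
desc_path <- tempfile()
writeLines(c("Package: muleaData", "Version: 0.99.0"), desc_path)

new_version <- bump_patch_version(desc_path)
print(new_version)
# [1] "0.99.1"
```

Editing the line in place, rather than rewriting the whole file with write.dcf(), avoids reflowing multi-line fields such as Authors@R.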

Sincerely, Eszter

barizona commented 4 months ago

Dear @lshep and @bioc-issue-bot,

I have added a NEWS file and an empty test/testmake-data.R file, and changed the spaces in the vignette to let BiocCheck::BiocCheck('new-package'=TRUE) run without other NOTES than the

I rebuilt the package and pushed the changes to git@git.bioconductor.org:packages/muleaData.git.

Best, Eszter

lshep commented 4 months ago

test

man

vignette

general

barizona commented 4 months ago

Dear @lshep ,

Thank you for pointing out these problems. I have resolved all of them:

test

I have removed the test directory with the empty file.

man

I have added R/muleaData.R, which is converted to man/muleaData.Rd, and ?muleaData now works fine.

vignette

I have added muleaData.Rmd to the vignette folder. It contains executed code and an abstract (the same as in the README).

general

We will submit the referenced mulea package to Bioconductor within a week. It is an existing package, but it has not been submitted to Bioconductor yet.

Best, Eszter

lshep commented 3 months ago

Thank you. One last comment. Please remove the hard-coded version in the BiocManager call. The package will work with 3.19 and any future release, but a hard-coded version pins a user's installation to 3.19: they would never move to a newer release, and a user already on a newer release would be asked to downgrade. You should assume the user is on a correct version of Bioc/R to run your code and simply use

{r 'install', eval=FALSE}
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("ExperimentHub")
BiocManager::install("muleaData")

barizona commented 3 months ago

Dear @lshep , Thank you for the suggestion. I have changed it in the README and the vignette. Best, Eszter

bioc-issue-bot commented 3 months ago

Your package has been accepted. It will be added to the Bioconductor nightly builds.

Thank you for contributing to Bioconductor!

Reviewers for Bioconductor packages are volunteers from the Bioconductor community. If you are interested in becoming a Bioconductor package reviewer, please see Reviewers Expectations.

lshep commented 3 months ago

The default branch of your GitHub repository has been added to Bioconductor's git repository as branch devel.

To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your GitHub user name. If your GitHub account already has ssh public keys (https://github.com/barizona.keys is not empty), then no further steps are required. Otherwise, do the following:

  1. Add an SSH key to your GitHub account
  2. Submit your SSH key to Bioconductor

See further instructions at

https://bioconductor.org/developers/how-to/git/

for working with this repository. See especially

https://bioconductor.org/developers/how-to/git/new-package-workflow/
https://bioconductor.org/developers/how-to/git/sync-existing-repositories/

to keep your GitHub and Bioconductor repositories in sync.

Your package will be included in the next nightly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2 pm Eastern the next day) at

https://bioconductor.org/checkResults/

(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, your package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("muleaData"). The package 'landing page' will be created at

https://bioconductor.org/packages/muleaData

If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.