Bioconductor / Contributions

Contribute Packages to Bioconductor

(inactive) muleaData ExperimentHubData Bioconductor package #3290

barizona closed 4 months ago

barizona commented 4 months ago

Dear BioC Team,

Here we provide muleaData, an ExperimentHubData Bioconductor package for the mulea R package (ELTEbioinformatics/mulea on GitHub). mulea is a comprehensive overrepresentation and functional enrichment analysis package that reads ontologies (gene and protein sets) in the standardised GMT (Gene Matrix Transposed) format. We provide these GMT files for 27 different model organisms, ranging from Escherichia coli to human, all acquired from publicly available data sources. The GMT files are provided with multiple gene and protein identifiers such as UniProt protein IDs, Entrez, Gene Symbol, and Ensembl gene IDs.

The GMT files and the scripts we used to create them are available in the GMT_files_for_mulea repository (ELTEbioinformatics/GMT_files_for_mulea on GitHub). For muleaData, we read these GMT files with the mulea::read_gmt() function and saved them as .rds files with the standard R saveRDS() function.

I received the SAS token from Lori Shepherd and uploaded the .rds files to the Azure Data Lake. She wrote that the data has now been added:

> query(eh, "muleaData")
ExperimentHub with 879 records
# snapshotDate(): 2024-02-07
# $dataprovider: muleaData
# $species: Drosophila melanogaster, Homo sapiens, Mus musculus, Caenorhabdi...
# $rdataclass: data.frame
# additional mcols(): taxonomyid, genome, description,
#   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
#   rdatapath, sourceurl, sourcetype 
# retrieve records with, e.g., 'object[["EH8571"]]' 

           title                                                              
  EH8571 | Genomic_location_Ensembl_Arabidopsis_thaliana_10genes_EnsemblID.rds
  EH8572 | Genomic_location_Ensembl_Arabidopsis_thaliana_10genes_EntrezID.rds 
  EH8573 | Genomic_location_Ensembl_Arabidopsis_thaliana_10genes_GeneSymbol...
  EH8574 | Genomic_location_Ensembl_Arabidopsis_thaliana_10genes_UniprotID.rds
  EH8575 | Genomic_location_Ensembl_Arabidopsis_thaliana_20genes_EnsemblID.rds
  ...      ...                                                                
  EH9445 | Genomic_location_Ensembl_Zea_mays_5genes_UniprotID.rds             
  EH9446 | Protein_domain_PFAM_Zea_mays_EnsemblID.rds                         
  EH9447 | Protein_domain_PFAM_Zea_mays_EntrezID.rds                          
  EH9448 | Protein_domain_PFAM_Zea_mays_GeneSymbol.rds                        
  EH9449 | Protein_domain_PFAM_Zea_mays_UniprotID.rds         
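
For readers unfamiliar with the conversion described above, here is a minimal base R sketch of turning GMT content into a data.frame and saving it as an .rds file. It is not the code of mulea::read_gmt(): the helper parse_gmt_lines(), the column names (set_id, set_description, members), and the example gene sets are all hypothetical and chosen only for illustration. The only facts it relies on are that each GMT line is tab-separated (set identifier, description, then member IDs) and that saveRDS() and readRDS() round-trip an R object.

```r
# Hypothetical helper, NOT mulea::read_gmt(): parse GMT lines into a data.frame.
# Each GMT line is tab-separated: set ID, set description, then member IDs.
parse_gmt_lines <- function(lines) {
  lines <- lines[nzchar(lines)]                    # drop empty lines
  fields <- strsplit(lines, "\t", fixed = TRUE)    # one character vector per set
  df <- data.frame(
    set_id = vapply(fields, function(f) f[1], character(1)),
    set_description = vapply(fields, function(f) f[2], character(1)),
    stringsAsFactors = FALSE
  )
  # Keep the members of each set as a list column of character vectors.
  df$members <- lapply(fields, function(f) f[-(1:2)])
  df
}

# Made-up example content (not taken from the real GMT files).
gmt_lines <- c(
  "SET_A\tfirst example set\tgene1\tgene2\tgene3",
  "SET_B\tsecond example set\tgene2\tgene4"
)

gmt_df <- parse_gmt_lines(gmt_lines)

# Save to an .rds file and read it back with the standard R functions.
rds_path <- tempfile(fileext = ".rds")
saveRDS(gmt_df, rds_path)
restored <- readRDS(rds_path)
```

A list column keeps each set's members together in one row, so a set with any number of members still occupies exactly one record.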

Best wishes, Eszter Ari

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.

bioc-issue-bot commented 4 months ago

Dear @barizona,

I found more than one GitHub URL in your issue. Please make sure there is only one; it should look like:

https://github.com/username/reponame

I am closing this issue. Please try again with a new issue.
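
The bot's actual implementation is not shown in this thread. As a rough sketch of the kind of check it describes, the base R snippet below extracts repository-style GitHub URLs from an issue body and tests whether exactly one is present. The function name find_github_repo_urls(), the regular expression, and the example issue text are assumptions made for illustration only.

```r
# Hypothetical check, NOT the bot's real code: collect the distinct
# https://github.com/<user>/<repo> style URLs found in a piece of text.
find_github_repo_urls <- function(text) {
  pattern <- "https?://github\\.com/[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+"
  unique(unlist(regmatches(text, gregexpr(pattern, text))))
}

# Made-up issue body containing two repository URLs.
issue_body <- paste(
  "Package repository: https://github.com/username/reponame",
  "Data scripts: https://github.com/username/otherrepo",
  sep = "\n"
)

urls <- find_github_repo_urls(issue_body)

# A submission would pass this kind of check only with exactly one URL.
has_single_url <- length(urls) == 1
```

Under this sketch, an issue that links both the package repository and a companion data repository as full URLs would fail the check, which matches the situation the bot reports here.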