bio-tools / biotoolsRegistry

biotoolsregistry : discovery portal for bioinformatics
GNU General Public License v3.0

R/CRAN/BioC content import documentation and policy? #454

Open sneumann opened 5 years ago

sneumann commented 5 years ago

Hi, is there documentation on how R package maintainers can improve the information about their packages? I am not talking about logging in and editing the information, but rather about whether there are processes that scrape and update information about R packages. Is there a mapping between the biocViews and EDAM? Is there a way for us to provide an additional XML/JSON/whatever file within the package with more content that gets slurped in by bio.tools? Yours, Steffen

veitveit commented 5 years ago

Hi Steffen, I can only answer with respect to Bioconductor. We used a script (https://github.com/bio-tools/biotoolsConnect/tree/master/BioConductor) to load all information from BiocViews and afterwards added manually curated EDAM annotations (via CSV). The script will also need to be updated to deal with the new biotoolsSchema and changed to output JSON. Mapping the terms in Bioconductor to EDAM was out of scope, but it would be super great to have for achieving automatic information exchange between Bioconductor and bio.tools. One could actually take the current bio.tools annotations of packages and check how consistently they match the different BiocViews.
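For readers who want a concrete picture of the workflow described above (biocViews terms, a curated CSV mapping to EDAM, JSON output), here is a minimal Python sketch. It is not the biotoolsConnect script. The CSV column names, the package name `examplePkg`, and the shape of the output entry (a `topic` list of `uri`/`term` pairs) are assumptions made for illustration, and the two mapping rows should be checked against EDAM before any real use.

```python
import csv
import io
import json

# Hypothetical mapping table. The column names are assumptions; the real
# curated CSV may be laid out differently.
MAPPING_CSV = """biocView,edam_uri,edam_term
Proteomics,http://edamontology.org/topic_0121,Proteomics
Metabolomics,http://edamontology.org/topic_3172,Metabolomics
"""


def load_mapping(text):
    """Return {biocViews term: {"uri": ..., "term": ...}} from CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        row["biocView"]: {"uri": row["edam_uri"], "term": row["edam_term"]}
        for row in reader
    }


def to_tool_entry(package, views, mapping):
    """Build a minimal tool description from a package's biocViews terms.

    Terms without a mapping are returned separately so a curator can
    review them; nothing is guessed.
    """
    topics = [mapping[v] for v in views if v in mapping]
    unmapped = [v for v in views if v not in mapping]
    return {"name": package, "topic": topics}, unmapped


mapping = load_mapping(MAPPING_CSV)
# "examplePkg" is a made-up package name used only for this sketch.
entry, unmapped = to_tool_entry("examplePkg", ["Proteomics", "Software"], mapping)
print(json.dumps(entry, indent=2))
print("unmapped:", unmapped)
```

Returning the unmapped terms keeps the manual curation step explicit: anything the table does not cover goes to a person rather than being mapped automatically.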

sneumann commented 5 years ago

OK, just a few notes and links I found on this. That tool reads a mapping file, "EDAM Mappings - BioConductor Version 1.csv", that I could not locate: https://github.com/bio-tools/biotoolsConnect/blob/4acfbee763e99948bff41d0c96f687efafff5806/BioConductor/Bioconductor2Bio_tools.R#L204 The biocViews are here: https://github.com/Bioconductor/biocViews/tree/master/data or available through library(biocViews); data(biocViewsVocab); biocViewsVocab as a graph, and on a per-package basis at http://www.bioconductor.org/packages/release/bioc/VIEWS Yours, Steffen
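As a rough illustration of reading the per-package VIEWS listing mentioned above: the file is a DCF-style text file (records of `Field: value` lines separated by blank lines, with indented continuation lines). The Python sketch below parses such text and collects the biocViews terms per package. The sample records and package names are invented for the example, and the parser is a simplified assumption about the format, not a full DCF implementation.

```python
def parse_views(text):
    """Parse DCF-style text into a list of {field: value} dicts."""
    records, current, last_key = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current record.
            if current:
                records.append(current)
            current, last_key = {}, None
        elif line[0].isspace() and last_key:
            # An indented line continues the previous field's value.
            current[last_key] += " " + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            last_key = key.strip()
            current[last_key] = value.strip()
    if current:
        records.append(current)
    return records


def views_by_package(records):
    """Map each package name to its list of biocViews terms."""
    return {
        r["Package"]: [v.strip() for v in r.get("biocViews", "").split(",") if v.strip()]
        for r in records
        if "Package" in r
    }


# Invented sample records; a real VIEWS file carries many more fields.
SAMPLE = """Package: examplePkgA
Version: 1.0.0
biocViews: Proteomics, MassSpectrometry,
        Software

Package: examplePkgB
Version: 0.9.1
biocViews: Metabolomics
"""

views = views_by_package(parse_views(SAMPLE))
for package, terms in views.items():
    print(package, terms)
```

The resulting dictionary could then be compared against a term mapping or against existing bio.tools annotations for the same packages.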

veitveit commented 5 years ago

Thanks for pointing that out. I added the missing file. Again, please be aware that all this is quite outdated and needs to be checked and changed to the new biotoolsSchema.

sneumann commented 4 years ago

BioC people like @mtmorgan and team are interested in looking into bio.tools-compatible markup of the BioC packages. Yours, Steffen

joncison commented 4 years ago

I'm just pinging @jvanheld and @hmenager here, as this is a nice example of a community which could (and most definitely should!) be supported. I mean, the communities include not just national (e.g. IFB) and scientific (e.g. rare disease, metabolomics etc.) ones but also technical ones, as is the case here. The requirements for these guys are very similar (understanding the information requirement, mapping existing information fields etc.), but additionally what's needed are sustainable data sharing / import / synchronisation mechanisms...

The plan, @sneumann @mtmorgan, is to move over to a GitHub-based content management architecture for global sharing and re-use of tool descriptions, rather than being solely reliant on the bio.tools API. That could be the best option for BioC. I'm hoping to be in a position early next year where I can assist with all this stuff. Cheers!

joncison commented 4 years ago

PS @sneumann, as for "how R package maintainers can improve the information about their package", there are at least comprehensive curation guidelines which describe the tool attributes (defined by biotoolsSchema and thus supported by bio.tools) and how to specify them. This is a start at least.

rioualen commented 3 weeks ago

Hi all,

This very relevant issue has finally turned into an exciting project, which will kick off at the next ELIXIR BioHackathon Europe! 🎉

Please check out our project page, and do not hesitate to reach out if you're interested in knowing more 🙌🏻