ohbm / hackathon2022

Website for the 2022 OHBM Hackathon
https://ohbm.github.io/hackathon2022/
MIT License

DataCat: "bring your own data" and generate user-friendly data catalogs #53

Open jsheunis opened 2 years ago

jsheunis commented 2 years ago

Title

DataCat: "bring your own data" and auto-generate user-friendly data catalogs

Short description and the goals for the OHBM BrainHack

Summary

Do you want to learn how to generate a pretty and F.A.I.R. browser-based data catalog from metadata? Do you want to know how you can make your data known to the world, without sharing the actual data content on centralised infrastructure? Do you want to do this for free using open-source tools? YES?! Then "bring" your own data and join our hackathon project!

Overview

DataLad Catalog is a free and open-source command line tool, with a Python API, that assists with the automatic generation of user-friendly, browser-based data catalogs from structured metadata. It is an extension to DataLad, and together with DataLad Metalad it brings distributed metadata handling, catalog generation, and maintenance into the hands of users. For a live example of a catalog that was generated using DataLad Catalog, see our StudyForrest Demo. The tool is now ready to be tested (and hopefully broken and then fixed!) on a wider range of user data, which is why this is intended to be a "bring your own data" project. If you are interested in metadata handling of (distributed) datasets, and specifically in generating a live catalog from said metadata, join us for a chance to turn your (meta)data into a pretty browser application!
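
To make the idea of "catalog from structured metadata" concrete, here is a minimal conceptual sketch in plain Python. It does not use DataLad Catalog or its API, and it is not how the tool is implemented; the record fields (`name`, `description`, `url`) and the `render_catalog` helper are hypothetical and chosen only for illustration.

```python
import html
import json


def render_catalog(records):
    """Render a list of metadata records (dicts) as one static HTML page.

    Hypothetical helper for illustration only; not part of DataLad Catalog.
    """
    items = []
    for rec in records:
        # Escape all values so metadata text cannot break the page markup.
        name = html.escape(rec.get("name", "untitled"))
        desc = html.escape(rec.get("description", ""))
        url = html.escape(rec.get("url", ""), quote=True)
        # Link to the dataset only if the metadata provides a URL.
        link = f'<a href="{url}">{name}</a>' if url else name
        items.append(f"<li><strong>{link}</strong>: {desc}</li>")
    body = "\n".join(items)
    return (
        "<!DOCTYPE html>\n<html><body><h1>Data catalog</h1>\n"
        f"<ul>\n{body}\n</ul>\n</body></html>"
    )


# Hypothetical metadata, one JSON object per line. Only the metadata is
# needed to build the page; the actual data content is never read.
metadata_lines = [
    '{"name": "my-dataset", "description": "Example <raw> data", '
    '"url": "https://example.org/my-dataset"}',
    '{"name": "other-dataset", "description": "No public URL"}',
]
records = [json.loads(line) for line in metadata_lines]
page = render_catalog(records)
```

The point of the sketch is that the page is built from metadata alone, which is what allows a dataset to be advertised without its content being hosted on centralised infrastructure.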

Project Goals

Link to the Project

https://github.com/datalad/datalad-catalog

Image for the OHBM brainhack website

https://raw.githubusercontent.com/jsheunis/ohbm-2022/main/pics/datacat0_hero.svg

Project lead

Stephan Heunis

Main Hub

Glasgow

Other Hub covered by the leaders

Skills

We welcome contributions of all kinds and at any skill level, from setting up and writing documentation, discussing relevant functionality, or user-experience testing, to Python-based implementation of the desired functionality and creating real-world use cases and workflows.

You can help us with any of the following skills:

Recommended tutorials for new contributors

Good first issues

We will try to generate a constant flow of good-first-issues throughout the project. Some examples are:

Twitter summary

Do you want to publish your data openly, without sharing actual content on centralised infrastructure? Want to auto-generate a browser-based data catalog from metadata? YES?! Then "bring" your own data and join our project: DataCat! https://github.com/datalad/datalad-catalog

Short name for the Discord chat channel (~15 chars)

datacat

Please read and follow the OHBM Code of Conduct

likeajumprope commented 2 years ago

Hi @jsheunis, great project :) Have you considered running the project in the cloud, on a VM or in a Jupyter Book? Happy to help you set something up :)

jsheunis commented 2 years ago

hey @likeajumprope, thanks! I've already created an environment on Binder, which project members will use when they do the tutorial associated with the project. Other than that I haven't given a cloud environment much thought.

yarikoptic commented 2 years ago

I might join this project with my/your own data at https://github.com/ReproNim/ReproTube/

djarecka commented 2 years ago

Thank you for submitting the project! We have 35 projects right now, woohoo! But that means the project pitches will have to be short. We will give you 2 minutes tomorrow to pitch your project; you can have one slide or no slides! If you decide to use a slide, please include the link to the slide here.

And don't worry, you will still have more time to talk about your project during the BrainHack :-)

jsheunis commented 2 years ago

Slides: https://jsheunis.github.io/ohbm-2022/talks/ohbm-2022-poster-jsheunis.html
Demo: https://datalad.github.io/datalad-catalog/
Poster: https://github.com/jsheunis/ohbm-2022/blob/main/poster/ohbm-poster-2022-datacat-jsheunis.pdf
3min poster talk: https://www.youtube.com/watch?v=4GERwj49KFc
Tutorial: https://github.com/datalad/tutorials/tree/master/notebooks/catalog_tutorials