GitDataAI / jzfs

A Git-like Version Control File System for Datasets Management in the Era of AI.
https://gitdata.ai

JZFS

Golang implementation of JZFS, a version control file system for dataset management in the era of AI.



JZFS is an industry-leading Data-Centric Version Control File System that helps ensure Responsible AI Engineering by improving Data Versioning, Provenance, and Reproducibility.

Note:

Data-centric AI is the practice of programmatically iterating and collaborating on the data used to build AI systems. Machine learning pioneer Andrew Ng argues that focusing on the quality of the data fueling AI systems will help unlock their full power.


Features

In production systems with machine learning components, updates and experiments are frequent. New updates to models (data products) may be released every day or every few minutes, and different users may see the results of different models as part of A/B experiments or canary releases.


Getting Started

Requirements

  1. To build JZFS, you need a working installation of Go 1.22.0 or higher.
  2. JZFS uses PostgreSQL to store its runtime data; you can install it by following the PostgreSQL installation guide.

Build and Run

  1. Clone and build
    git clone https://github.com/GitDataAI/jzfs.git
    cd jzfs
    make build

After following the steps above, you should see an executable file named "jzfs".

  2. Initialize the program and run it
    ./jzfs init --db postgres://<username>:<password>@localhost:5432/jiaozifs?sslmode=disable
    ./jzfs daemon
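
The `--db` value above is a PostgreSQL connection URL. If the username or password contains reserved characters such as `@` or `/`, they need to be percent-encoded before being placed in the URL. The following is a minimal Go sketch of one way to assemble such a URL; `buildDSN` is a hypothetical helper written for illustration and is not part of JZFS.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildDSN assembles a PostgreSQL connection URL of the shape passed to --db.
// It is a hypothetical helper for illustration, not part of the JZFS code base.
func buildDSN(user, password, host string, port int, database string) string {
	u := url.URL{
		Scheme: "postgres",
		// UserPassword percent-encodes characters such as '@', '/', '?' and ':'.
		User: url.UserPassword(user, password),
		Host: fmt.Sprintf("%s:%d", host, port),
		Path: "/" + database,
		// sslmode=disable matches the example command above.
		RawQuery: url.Values{"sslmode": {"disable"}}.Encode(),
	}
	return u.String()
}

func main() {
	// Example credentials are placeholders (assumptions), not real defaults.
	fmt.Println(buildDSN("jzfs", "p@ss/word", "localhost", 5432, "jiaozifs"))
}
```

The printed string can then be passed as the argument to `./jzfs init --db`.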

Run with Docker

docker run -v <data>:/app -p 34913:34913 gitdatateam/jzfs:latest  --db "postgres://<user>:<password>@192.168.1.16:5432/jiaozifs?sslmode=disable" --bs_path /app/data --listen http://0.0.0.0:34913 --config /app/config.toml

Cloud

Try without installing

Note: when you create a new repository in the JZFS Console, use the following storage config for the IPFS storage backend.

 {"type":"ipfs","ipfs":{"url":"/dns/kubo-service.ipfs.svc.cluster.local/tcp/5001"}}
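
If you generate this storage config from a script rather than typing it by hand, the JSON above can be produced with Go's `encoding/json`. This is a minimal sketch: the type names `storageConfig` and `ipfsSettings` and the helper `newIPFSConfig` are hypothetical and only mirror the shape of the JSON shown above; they are not taken from the JZFS source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ipfsSettings mirrors the nested "ipfs" object in the config above.
// Hypothetical type, named here for illustration only.
type ipfsSettings struct {
	URL string `json:"url"`
}

// storageConfig mirrors the top-level object in the config above.
// Hypothetical type, named here for illustration only.
type storageConfig struct {
	Type string        `json:"type"`
	IPFS *ipfsSettings `json:"ipfs,omitempty"`
}

// newIPFSConfig returns the JSON storage config for an IPFS API address
// given in multiaddr form, such as the one shown above.
func newIPFSConfig(apiAddr string) (string, error) {
	cfg := storageConfig{Type: "ipfs", IPFS: &ipfsSettings{URL: apiAddr}}
	b, err := json.Marshal(cfg)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, err := newIPFSConfig("/dns/kubo-service.ipfs.svc.cluster.local/tcp/5001")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(s)
}
```

Fields are emitted in struct order, so the output has the same key order as the config in the note.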

Examples

Build an AI/ML pipeline over JZFS
Face detection and recognition inference pipeline


Documentation

Official Documentation


Users and Partners

Lighthouse Permanent Storage
MesoReef DAO: Decentralized Science for Regenerating
LunCo
Artizen Fund
HaAI Labs


License

Dual-licensed under MIT + Apache 2.0