mlopscommunity / knowledge-base Knowledge Base
MIT License

Write “Data versioning” page #6

Open mariyadavydova opened 3 years ago

mariyadavydova commented 3 years ago


dpbrinkm commented 3 years ago

Here I think it would be interesting to look at the different ways you can version data, such as using metadata, immutable pipelines, or snapshots of the data.

Another thing we can cover is why it is important, maybe with some war stories from people who didn't version their data.

We can also touch on regulations in certain sectors and on how the EU is planning to crack down on this a bit. Charles talks about it in the #2 meetup we did.

BioGeek commented 3 years ago

Two tools that should certainly be mentioned are:

dpbrinkm commented 3 years ago

Yes, and I think there are a few more, like cubanacci, maiot, allegro <- open source, and I'm sure I'm forgetting some.

gmartinsribeiro commented 3 years ago

> Yes and I think there are a few more like cubanacci, maiot, allegro <- open source and I'm sure I'm forgetting some.

These are more into MLOps or automation, not really data versioning. I'm aware of DVC and Pachyderm too.