iterative / dvc.org

📖 DVC website and documentation
https://dvc.org
Apache License 2.0

guide: checkpoints for Tensorflow #2509

Closed iesahin closed 1 year ago

iesahin commented 3 years ago

I wrote something similar to iterative/dvclive#69 covering all the ways of using checkpoints in get-started-checkpoints. I don't know whether the TF/Keras callbacks can be distributed with DVC or whether we should submit them to TF.contrib.

We need to be clear in the docs about how checkpoints are used and what the caveats are. A dedicated Checkpoints for Tensorflow document would be useful for explaining these and sharing the callbacks.

Originally posted by @iesahin in https://github.com/iterative/example-repos-dev/issues/47#issuecomment-849517197

jorgeorpinel commented 3 years ago

Maybe better as a blog post?

iesahin commented 3 years ago

I think checkpoint usage in different ML libraries deserves a UG chapter. We need guides for Tensorflow, xgboost, PyTorch and non-Python usage. These are more difficult to wrap your head around than, say, configurations for different cloud providers.

jorgeorpinel commented 3 years ago

Agree about non-Python usage in general.

Idk about the specific libs. It may imply maintaining docs about 3rd party tools that may change at any point in time. It's already a risk we have with some guides e.g. https://neptune.ai/blog/best-7-data-version-control-tools-that-improve-your-workflow-with-machine-learning-projects or even https://dvc.org/doc/cml/start-github but those to some extent are harder to avoid since the integrations are built into DVC/CML.

That said, I could see a single guide that just mentions several ML libs, with very simple code samples (which would hardly break with changes in those libs) and/or links to their docs.

dberenbaum commented 3 years ago

One reason dvclive is a separate library is so that we can have dependencies on ML frameworks there without weighing down the core dvc library. There are a couple of related dvclive issues: https://github.com/iterative/dvclive/issues/5 and https://github.com/iterative/dvclive/issues/70.

iesahin commented 3 years ago

I think, even if we don't bundle these integrations into dvclive for maintenance reasons, there should be pages about how to use dvc(live) with tf/keras/pytorch/xgboost/R/Caffe... so that search engine bots find these words close together and update their embeddings :)

I think most of the implementations are straightforward. As discussed in iterative/dvclive#5, they can be left to the user but we need to provide documentation for users to write their own.
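To give a rough idea of what "write their own" could look like, here is a minimal, framework-agnostic sketch. Everything in it is hypothetical: `CheckpointHook`, `train`, and the `signal` argument are illustrative names, not DVC, dvclive, or Keras APIs. The `on_epoch_end` method name only mirrors the Keras callback convention, and `signal` is a stand-in for whatever call tells the experiment tracker that a checkpoint is ready.

```python
import json
from pathlib import Path
from typing import Callable, Optional


class CheckpointHook:
    """Hypothetical per-epoch checkpoint callback (not a DVC or Keras API)."""

    def __init__(self, path: str, signal: Optional[Callable[[], None]] = None):
        self.path = Path(path)
        # Assumption: `signal` stands in for the tracker-specific
        # "checkpoint is ready" call; by default it does nothing.
        self.signal = signal or (lambda: None)

    def on_epoch_end(self, epoch: int, state: dict) -> None:
        # Write to a temporary file first, then rename, so an interrupted
        # run does not leave a half-written checkpoint behind.
        self.path.parent.mkdir(parents=True, exist_ok=True)
        tmp = self.path.with_suffix(self.path.suffix + ".tmp")
        tmp.write_text(json.dumps({"epoch": epoch, "state": state}))
        tmp.replace(self.path)
        self.signal()

    def resume(self) -> int:
        # Return the epoch to start from: 0 when no checkpoint exists yet.
        if not self.path.exists():
            return 0
        return json.loads(self.path.read_text())["epoch"] + 1


def train(hook: CheckpointHook, epochs: int) -> None:
    # Toy loop: the "model state" is just a number derived from the epoch.
    for epoch in range(hook.resume(), epochs):
        hook.on_epoch_end(epoch, {"weight": epoch * 0.5})
```

Calling `train(hook, 3)` twice with the same path runs three epochs the first time and none the second, since `resume()` picks up from the saved epoch. A real integration would replace the JSON dump with the framework's own model-saving call.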

dberenbaum commented 3 years ago

cc @pared

daavoo commented 3 years ago

> I think, even if we don't bundle these integrations into dvclive for maintenance reasons, there should be pages about how to use dvc(live) with tf/keras/pytorch/xgboost/R/Caffe... so that search engine bots find these words close together and update their embeddings :)

> I think most of the implementations are straightforward. As discussed in iterative/dvclive#5, they can be left to the user but we need to provide documentation for users to write their own.

Related: https://github.com/iterative/dvc.org/issues/2552

iesahin commented 3 years ago

I think the URL is incorrect, I got a 404 @daavoo :)

daavoo commented 3 years ago

> I think the URL is incorrect, I got a 404 @daavoo :)

Indeed, the issue has been transferred:

https://github.com/iterative/dvclive/issues/87