Closed by @pared 1 year ago
I think it's easy enough for users to add integrations as needed (or for the dvc team to add them in response to demand), so it's probably not worthwhile to spend time adding more now.
How do we plan to handle dependencies for multiple frameworks? Each supported framework is pretty heavy, and I think it's unreasonable already to expect an XGBoost user to install Tensorflow to use dvclive. Similar concerns would apply for dvcx.
Thoughts @pared @dmpetrov ?
See #25 for more discussion of dependency management.
@dberenbaum I think leaving particular implementations for our users is a good idea, those are easy tasks. Writing tests might be harder, but I guess we can help users write them, instead of doing all the legwork, not even knowing whether particular integrations will be desired by userbase.
As to installation, you are right, we already do it in dvc (for different backends) and we will have to go this way here too.
On second thought here, is it worthwhile to add sklearn integration? Since this is such a large framework, integration may be more complex, and if you have an opinion about how to implement it, probably better to add the integration now than wait for contributions. Even if it means implementing one particular model or class of models, it may be a worthwhile template. Thoughts?
Makes sense, I will get to that once I am done with supporting `dvclive` outputs caching.
sklearn is largely not focused on deep learning, which has been the primary use case for dvclive. Should other algorithms be supported? If the primary purpose is to track model training progress, it seems only useful where models are trained iteratively. I only know of a couple of classes of algorithms where this is true:
@dberenbaum Yes, after digging through the documentation, it seems to me that in general, learning algorithms divide into those which implement only `fit` and those which implement both `fit` and `partial_fit`. It does not seem to me that we can provide an integration for "only `fit`" models, and in the case of `partial_fit` models, the workflow will probably look more like the `torch` one, which in my opinion does not require any integration, as it's created manually.
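To illustrate the point about `partial_fit` workflows not needing a dedicated integration, here is a minimal sketch of such a manual training loop. The model is a toy stand-in and `log_metric` is a plain stub standing in for a dvclive logging call; the shape of the loop, not the exact dvclive API, is what this assumes.

```python
# Hypothetical manual training loop for a partial_fit-style estimator,
# analogous to a hand-written torch loop. The logger is a stub standing
# in for dvclive; the model is a toy incremental estimator.
import random

history = []

def log_metric(name, value):
    # stand-in for a dvclive logging call
    history.append((name, value))

class TinyAveragedModel:
    """Toy incremental estimator: tracks a running mean of the targets."""
    def __init__(self):
        self.mean = 0.0
        self.n = 0

    def partial_fit(self, batch):
        for y in batch:
            self.n += 1
            self.mean += (y - self.mean) / self.n

    def score(self, batch):
        # negative mean squared error against the running mean
        return -sum((y - self.mean) ** 2 for y in batch) / len(batch)

random.seed(0)
data = [random.gauss(1.0, 0.1) for _ in range(100)]
model = TinyAveragedModel()

for step in range(10):
    batch = data[step * 10:(step + 1) * 10]
    model.partial_fit(batch)
    log_metric("score", model.score(batch))
    # with dvclive, the user would advance to the next step here

print(len(history))  # 10
```

Since the user already owns this loop, they can drop a logging call anywhere in it, which is why an integration adds little here.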
The only place I could probably see some integration is methods accepting a `scoring` param, which can be a `Callable`, but it seems to me it would be really hard to define how such an integration could work.
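One conceivable shape for such an integration is a wrapper around a scoring callable that logs every evaluation. This is purely a sketch: the only sklearn assumption is the scorer protocol `(estimator, X, y) -> float`, and the logging call is a stub, not a real dvclive API.

```python
# Hypothetical sketch: wrap a scikit-learn-style scoring callable so each
# evaluation is logged. `logged.append` stands in for a dvclive call.
logged = []

def logging_scorer(scorer, name):
    def wrapped(estimator, X, y):
        value = scorer(estimator, X, y)
        logged.append((name, value))  # stand-in for dvclive logging
        return value
    return wrapped

# toy estimator and scorer to exercise the wrapper
class ConstantModel:
    def predict(self, X):
        return [1 for _ in X]

def accuracy_scorer(estimator, X, y):
    preds = estimator.predict(X)
    return sum(p == t for p, t in zip(preds, y)) / len(y)

scorer = logging_scorer(accuracy_scorer, "accuracy")
score = scorer(ConstantModel(), [[0], [1], [2], [3]], [1, 1, 0, 1])
print(score)  # 0.75
```

The difficulty noted above remains: the wrapper sees individual scores but has no notion of a "step", so it is unclear how such logs should be structured.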
I am considering working on the integration with `pytorch-lightning`, but I'm not sure where to contribute the new logger (i.e. this repository or `pytorch-lightning` itself). See https://github.com/iterative/dvclive/issues/70#issuecomment-811868255
@daavoo That's great news! Can we do something to help with that pull request?
It has already been approved, so I think it will be merged soon, thanks!
I think it might be a good idea to have separate issues for each integration in order to better track the progress and have specific discussions for each one (i.e. this issue got "populated" by `sklearn`-specific discussions).
@daavoo That is right, in the beginning we intended it to be an umbrella issue, since singular implementations seemed like easy tasks. As the `sklearn` example shows, we should probably track each integration separately.
For future reference:
Changing the name of the issue to focus on `sklearn`. Other integrations should be tracked as separate issues.
Reviving this, as I think that `sklearn` should be the entry point for discussing what `dvclive` can provide in "stepless" scenarios (no deep learning, no gradient boosting), beyond https://github.com/iterative/dvclive/issues/182
Taking a quick look at our example repositories using sklearn (https://github.com/iterative/example-get-started), it looks like it would be a low-hanging fruit to add some utility to go from `(y_true, y_pred)` to PRC / ROC plots.
Given that example repo, we would be removing quite a few lines for users:
```python
import json
import math

from sklearn import metrics

# Given labels, predictions
precision, recall, prc_thresholds = metrics.precision_recall_curve(labels, predictions)
fpr, tpr, roc_thresholds = metrics.roc_curve(labels, predictions)

# ROC has a drop_intermediate arg that reduces the number of points.
# https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_curve.html#sklearn.metrics.roc_curve
# PRC lacks this arg, so we manually reduce to 1000 points as a rough estimate.
nth_point = math.ceil(len(prc_thresholds) / 1000)
prc_points = list(zip(precision, recall, prc_thresholds))[::nth_point]

with open(prc_file, "w") as fd:
    json.dump(
        {
            "prc": [
                {"precision": p, "recall": r, "threshold": t}
                for p, r, t in prc_points
            ]
        },
        fd,
        indent=4,
    )

with open(roc_file, "w") as fd:
    json.dump(
        {
            "roc": [
                {"fpr": fp, "tpr": tp, "threshold": t}
                for fp, tp, t in zip(fpr, tpr, roc_thresholds)
            ]
        },
        fd,
        indent=4,
    )
```
To:
```python
from dvclive.sklearn import log_precision_recall_curve, log_roc_curve

log_precision_recall_curve(labels, predictions)
log_roc_curve(labels, predictions)
```
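A possible internal sketch for such a helper, with hypothetical names: the `sklearn.metrics.precision_recall_curve` call is assumed to happen in the caller, and this stdlib-only part handles the subsampling and JSON layout, mirroring the snippet above.

```python
# Hypothetical helper mirroring the manual snippet above. The curve values
# (precision, recall, thresholds) are assumed to come from
# sklearn.metrics.precision_recall_curve; this part only subsamples and dumps.
import json
import math

def dump_prc(precision, recall, thresholds, path, max_points=1000):
    # PRC has no drop_intermediate arg, so subsample to roughly max_points
    nth_point = math.ceil(len(thresholds) / max_points)
    points = list(zip(precision, recall, thresholds))[::nth_point]
    with open(path, "w") as fd:
        json.dump(
            {
                "prc": [
                    {"precision": p, "recall": r, "threshold": t}
                    for p, r, t in points
                ]
            },
            fd,
            indent=4,
        )

# usage with precomputed curve values
dump_prc([1.0, 0.5], [0.5, 1.0], [0.3, 0.7], "prc.json")
```

The output file keeps the same schema DVC already renders as a plot, so users lose no functionality by switching to the helper.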
Seems we should be supporting at least a few popular frameworks. Considering their popularity, we should probably start with:

- sklearn

Worth considering:

- TF and PyTorch - it seems to me that using their pure form is done when users need highly custom models, and probably in those cases they will be able to handle `dvclive` by hand.

@dmpetrov did I miss some popular framework?

EDIT: crossing out FastAi as it has its own issue now