dotmesh-io / dotscience-python

Python library for Dotscience workloads
Apache License 2.0

split ds.start() functionality into ds.start_run() and ds.start_timing() #24

Open lukemarsden opened 5 years ago

lukemarsden commented 5 years ago

Currently the behaviour of ds.start() is somewhat conflated, at the expense of usability. It does several things, including:

  1. Starting the clock on timing a run
  2. Clearing state from any previous runs

However, often in interactive notebook usage, you want to do these things at different times:

What's your use case? Is it that you want to use ds.start() and ds.publish() to capture timing info (of a run in a single notebook cell, rather than manual tinkering time)?

Exactly, I wanted to record the hyperparameters first (e.g. in a class constructor), and then capture the training time with start() and publish() when the fit() method is called.

So let's make ds.start_timing() separate from ds.start_run(). We could alias ds.start() to ds.start_run() for backward compatibility, but start recommending the explicit, separate versions.

We should also make it clearer (in the docs, at least) that ds.start_run() wipes previous state and should be run before adding any labels or metadata!
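
A minimal sketch of how the split could behave, using the use case quoted above. This is a toy stand-in, not the actual dotscience implementation: the RunTracker class, its add_parameter() helper, and the Model class are hypothetical names introduced only for illustration. Only start(), start_run(), start_timing() and publish() come from the discussion.

```python
import time


class RunTracker:
    """Toy stand-in for the library's run state (hypothetical, not the real API)."""

    def __init__(self):
        self.parameters = {}
        self.started_at = None

    def start_run(self):
        """Wipe all state from any previous run. Call before adding metadata."""
        self.parameters = {}
        self.started_at = None

    def start_timing(self):
        """Start (or restart) the clock without touching recorded metadata."""
        self.started_at = time.monotonic()

    # Backward-compatible alias as proposed: start() behaves like start_run().
    start = start_run

    def add_parameter(self, name, value):
        """Hypothetical helper that records one hyperparameter."""
        self.parameters[name] = value

    def publish(self):
        """Stop the clock and return a summary of the run."""
        elapsed = None
        if self.started_at is not None:
            elapsed = time.monotonic() - self.started_at
        return {"parameters": dict(self.parameters), "elapsed_seconds": elapsed}


class Model:
    """Hypothetical model: hyperparameters in the constructor, timing in fit()."""

    def __init__(self, tracker, learning_rate):
        self.tracker = tracker
        tracker.start_run()  # clears any state left over from earlier cells
        tracker.add_parameter("learning_rate", learning_rate)

    def fit(self):
        self.tracker.start_timing()  # the clock starts here, not in __init__
        time.sleep(0.01)  # placeholder for the actual training work
        return self.tracker.publish()
```

With this shape, time spent between constructing the model and calling fit() is not counted, and the hyperparameters recorded in the constructor survive the call to start_timing().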

nickballdotscience commented 5 years ago

Also consider leaving the ds.start name as-is and naming ds.start_timing ds.timer_start instead. Then the names are more distinct and it is easier to spot the timer, which may be more temporary.