Open skrawcz opened 3 months ago
I took one of the targets examples and transferred it one-to-one to hamilton to see how the concepts compare. Both workflows are implemented in modules and make use of helper functions from a separate module. Both are then started interactively from quarto documents, and their results and graphs are visualized in the rendered output of those notebooks: http://jmbuhr.de/targets-hamilton-comparison/ (source code: https://github.com/jmbuhr/targets-hamilton-comparison)
Again, this is for exploration of possibilities, not to impose paradigms on you :)
In this first pass I noticed two things I was missing in hamilton compared to targets when it comes to caching:
dr.execute run as with tar_load(<name of node>) (https://docs.ropensci.org/targets/reference/tar_load.html) is super helpful for interactively picking up where you left off with a workflow and working on different parts of it.
For inspiration, the developer documentation of how targets does caching might come in handy: https://books.ropensci.org/targets-design/data.html#skipping-up-to-date-targets
Update - we've got a candidate API:
c = CacheStore() # this could house various strategies, e.g. basic checkpointing, to more sophisticated fingerprinting.
dr = driver.Builder()...with_cache(c, **kwargs).build()
# first run -- nothing cached
dr.execute([output1], inputs=A)
# change some code -- any code: upstream or downstream of what was run before
# rebuild driver
dr = driver.Builder()...with_cache(c, **kwargs).build()
# this should recompute as needed -- and recompute downstream as needed
dr.execute([output2], inputs=A)
# no-op if run again
dr.execute([output2], inputs=A)
# should only recompute what inputs impact -- going downstream as needed.
dr.execute([output2], inputs=A')
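To make the intended recompute behaviour concrete, here is a minimal sketch of fingerprint-based caching on a toy DAG. It does not use hamilton's API: `ToyCache`, `fingerprint`, and the example nodes are hypothetical names invented for illustration. It assumes that a node's cache key can be derived from its bytecode plus the fingerprints of its upstream nodes and inputs.

```python
import hashlib
import inspect
import pickle


def fingerprint(*parts: bytes) -> str:
    """Combine byte strings into one stable hex digest."""
    h = hashlib.sha256()
    for part in parts:
        h.update(hashlib.sha256(part).digest())
    return h.hexdigest()


class ToyCache:
    """Hypothetical stand-in for a CacheStore: maps fingerprints to results."""

    def __init__(self):
        self.results = {}
        self.computed = []  # names of nodes actually executed, for inspection

    def run(self, nodes, target, inputs):
        """Return (value, fingerprint) for `target`, recomputing only on a cache miss."""
        if target in inputs:
            value = inputs[target]
            return value, fingerprint(pickle.dumps(value))
        fn = nodes[target]
        # Dependencies are read off the parameter names, as hamilton does.
        deps = list(inspect.signature(fn).parameters)
        resolved = [self.run(nodes, dep, inputs) for dep in deps]
        # Key = the node's code + the fingerprints of everything it depends on.
        code_obj = fn.__code__
        code = code_obj.co_code + repr((code_obj.co_consts, code_obj.co_names)).encode()
        key = fingerprint(target.encode(), code, *(fp.encode() for _, fp in resolved))
        if key not in self.results:
            self.results[key] = fn(*(value for value, _ in resolved))
            self.computed.append(target)
        return self.results[key], key


def doubled(a):
    return a * 2


def total(doubled, b):
    return doubled + b


NODES = {"doubled": doubled, "total": total}

cache = ToyCache()
cache.run(NODES, "total", {"a": 1, "b": 10})  # first run: computes doubled, then total
cache.run(NODES, "total", {"a": 1, "b": 10})  # no-op: both nodes are cache hits
cache.run(NODES, "total", {"a": 1, "b": 20})  # only total depends on b, so only total reruns
print(cache.computed)  # ['doubled', 'total', 'total']
```

Because the key includes the function's bytecode and constants, editing a node's body should also invalidate that node and everything downstream of it. A real implementation would need a more careful notion of "code changed" (helper functions, closures, library versions) than this sketch attempts.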
Then there's some nuance around:
Updates:
.with_cache() will just be about the fingerprinting caching strategy.
.with_checkpointing() will just be about checkpointing.
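One way to read the split, as an assumption rather than the settled design: checkpointing stores a node's result under its name and reuses it without checking whether the code or inputs changed, while fingerprinting decides reuse from a hash of code and inputs. The sketch below shows only the checkpointing half. `ToyCheckpointer` and its methods are hypothetical and are not part of hamilton.

```python
import pickle
import tempfile
from pathlib import Path


class ToyCheckpointer:
    """Hypothetical helper: stores each node's result under its name only."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, name):
        return self.directory / f"{name}.pkl"

    def get_or_compute(self, name, compute):
        """Load `name` from disk if a checkpoint exists, otherwise compute and save it."""
        path = self._path(name)
        if path.exists():
            return pickle.loads(path.read_bytes())
        value = compute()
        path.write_bytes(pickle.dumps(value))
        return value

    def load(self, name):
        """Fetch a stored result by node name, similar in spirit to tar_load."""
        return pickle.loads(self._path(name).read_bytes())


checkpoints = ToyCheckpointer(tempfile.mkdtemp())
first = checkpoints.get_or_compute("features", lambda: [1, 2, 3])
# The checkpoint wins even though the "code" changed: no fingerprint is consulted.
second = checkpoints.get_or_compute("features", lambda: [4, 5, 6])
print(first, second, checkpoints.load("features"))  # [1, 2, 3] [1, 2, 3] [1, 2, 3]
```

The `load` method is the piece that would cover the interactive "pick up where you left off" use case mentioned earlier in the thread. The stale second result is the trade-off that a fingerprinting strategy is meant to avoid.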
Is your feature request related to a problem? Please describe.
We need a simpler caching & checkpointing story that has full service visibility into what's going on.
Describe the solution you'd like
These should come with the ability to:
Prior art
Could use https://books.ropensci.org/targets/walkthrough.html#change-code as inspiration.
Additional context
Slack threads:
Next steps:
TODO: break the tasks in this issue into smaller, more manageable chunks.