rticulate / import

An Import Mechanism For R
https://import.rticulate.org
Other
222 stars 14 forks source link

Making the import package work with the drake/targets package #49

Closed telegott closed 2 years ago

telegott commented 3 years ago

I've recently discovered both the drake/targets and the import package which I see as vital additions to the R ecosystem, and it would be great to make them work together.

If I'm not mistaken, drake relies on all user scripts being loaded by source. drake can detect any changes made to the functions used and invalidate the targets, so that on a subsequent run these targets will be newly calculated.

Using the import package with drake poses the problem that changes to module scripts imported with import (or recursive imports of such module scripts) won't be detected, so drake would still see the same.

Ideally for a middle sized analysis I would have some helper/constants files and module scripts where each function imported from a module scripts would take care of a target. This would lead to very nice isolation, testability and reproducibility. I'm aware that the recommended approach is to use a package structure, but so far I haven't seen a solution that seems satisfying. A single package that has very different functionalities seems to get out of hand quickly along with limitations like not having subfolders in R/ and so on, while having many small packages for a middle sized task poses other structuring issues in a single repo like having different renvs, R projects and so on which I do not want in a case where everything happens in the same context and only there.

Do you see a possibility to expose something that would notify drake that it needs to invalidate affected targets? I already asked the drake creator about this, but he suggested that this would need to be solved on the side of import.

Thank you all for your great work!

torfason commented 3 years ago

Just to be clear, my understanding is that you are looking for something like the following.

Current suggested drake pattern:

source("file_with_create_plot_function.R")
plan <- drake_plan(
  data = readxl::read_excel(file_in("raw_data.xlsx")),
  hist = create_plot(data)
)

Your desired pattern:

import::from(file_with_create_plot_function, create_plot)
plan <- drake_plan(
  data = readxl::read_excel(file_in("raw_data.xlsx")),
  hist = create_plot(data)
)

Does the above not work? In that changing the create_plot function within file_with_create_plot_function.R does not trigger the hist target to be run? I am not familiar with how drake detects changes to the functions it uses, but one thought is that perhaps it only looks for changes to functions in the current environment. If so, perhaps switching to import::here() would solve the problem?

If this does not address the problem, a fix would probably need input from someone very familiar with drake. We (the maintainers) would not be averse to a PR implementing this in a clean manner, but it is unlikely that we would undertake implementing it ourselves, given that we are not drake experts.

telegott commented 3 years ago

Thanks for the answer, that's exactly what I meant yes! While import::here could work, recursive imports should be still hidden, no?

torfason commented 3 years ago

Yes, the idea would be that recursive imports (those referened in import::here statements in the file_with_create_plot_function.R file would be hidden from the main namespace. Let us know if this is not the case. If you get this working, and come up with a clean MRE, feel free to contribute a short discussion to the vignette (advanced usage). Either as a PR or with the text+MRE submitted as issues.

torfason commented 2 years ago

I've checked out the targets (knew drake) from a few years back. It seems that the functionality is too complex for me to be willing to take up the responsibility of interoperating with that. So while a contribution to the advanced usage vignette would be appreciated, in the absence of that I expect we'll close this as wontfix at the next release.

torfason commented 2 years ago

This issue has been relatively quiet for over a year now, and the implementation seems complicated. So I'm closing as wontfix. I would however be happy to entertain specific implementation ideas (or even better, contributions of an implementation), assuming they look reasonably clean and do not introduce any backwards incompatibilities.