RMI-PACTA / workflow.data.preparation

The goal of `workflow.data.preparation` is to prepare all of the necessary data inputs for the Transition Monitor web application.

Migrate `write_manifest` from `pacta.data.preparation` #174

Open jdhoffa opened 8 months ago

jdhoffa commented 8 months ago
NIT: I believe I have mentioned this before, but I tend to think the whole `write_manifest` function should live in `workflow.data.preparation` in general.

It is very much tied to the file I/O specifics (tracking versions, file hashes, etc.) and not particularly relevant to the "typical functions that read and write data".

Just a thought, but not a hill I am going to die on 😂

_Originally posted by @jdhoffa in https://github.com/RMI-PACTA/pacta.data.preparation/pull/342#discussion_r1502689838_

cjyetman commented 7 months ago

@AlexAxthelm recently mentioned (in Teams?) the idea of creating a new R package that centralizes/generalizes the writing of a manifest file. That's kinda the opposite of what's being proposed here @jdhoffa?

jdhoffa commented 7 months ago

> @AlexAxthelm recently mentioned (in Teams?) the idea of creating a new R package that centralizes/generalizes the writing of a manifest file. That's kinda the opposite of what's being proposed here @jdhoffa?

I would say alternative, not necessarily opposite. My main idea when writing this issue was that `pacta.data.preparation` should remain focused on reading, writing, and manipulating data, and not on what the `workflow.*` repos are doing.

One way to do that would be to migrate `write_manifest` and things like that directly into the relevant `workflow.*` repo. Another way of achieving it is developing an R package that exclusively worries about it, e.g. `pacta.workflow.utils`.

I also tend to prefer the `pacta.workflow.utils` approach, as it ensures the code is DRY while also keeping each package focused on its main purpose.
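For illustration only, a manifest writer housed in a shared utilities package might look roughly like the sketch below. The function name `write_manifest_sketch`, its arguments, and the manifest fields are all assumptions made for this example and do not reflect the actual `write_manifest` in `pacta.data.preparation`. It relies on `tools::md5sum()` from base R and assumes the `jsonlite` package is installed.

```r
# Hypothetical helper: the name, arguments, and manifest fields are
# illustrative assumptions, not the real `write_manifest` implementation.
write_manifest_sketch <- function(manifest_path, file_paths, parameters = list()) {
  stopifnot(all(file.exists(file_paths)))

  # Record the name, size, and MD5 hash of each tracked file.
  files <- lapply(file_paths, function(path) {
    list(
      file_name = basename(path),
      file_size_bytes = file.size(path),
      file_md5 = unname(tools::md5sum(path))
    )
  })

  manifest <- list(
    creation_time_utc = format(Sys.time(), "%Y-%m-%dT%H:%M:%SZ", tz = "UTC"),
    r_version = R.version.string,
    parameters = parameters,
    files = files
  )

  # Write the manifest as pretty-printed JSON (assumes jsonlite is available).
  jsonlite::write_json(manifest, manifest_path, pretty = TRUE, auto_unbox = TRUE)
  invisible(manifest)
}

# Example call from a workflow script, using temporary files.
example_input <- tempfile(fileext = ".csv")
writeLines(c("a,b", "1,2"), example_input)
write_manifest_sketch(
  manifest_path = tempfile(fileext = ".json"),
  file_paths = example_input,
  parameters = list(quarter = "2023Q4")
)
```

With something like this in one shared package, each `workflow.*` repo would call the same function rather than carrying its own copy, which is the DRY argument made above.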

cjyetman commented 7 months ago

Still haven't quite wrapped my head around distinguishing R packages by not dealing with file i/o versus context