flow_view_deps to work with paths

moodymudskipper commented 1 year ago

It already works with different functions, so that shouldn't be that hard.

The use case: I'm working on a collection of functions in a package, and I'd like to know what they depend on and how they depend on each other.

Maybe it should work only in a package and call devtools::load_all() to make sure everything is in sync; then we just check for function definitions in the scripts.

llrs commented 1 year ago

I am very interested in this functionality to learn the different dependencies of functions within a package. I thought it was already possible with flow, but I hadn't managed to make it work.

Maybe my old approach for this is helpful: mvbutils::foodweb. Its examples use find.funs("package:mvbutils") to find the functions, or asNamespace("mvbutils") directly. I think both approaches work for loaded packages in development.

moodymudskipper commented 1 year ago

You mean for a whole package? That might be a feature too. In that case I would allow a namespace or a "package:pkg" notation as the first argument, or as a component of it. The two would be close equivalents, and identical in many cases since we'll usually have only exported functions on the left, but they'll behave differently if there are unexported functions not called by any exported function, or if we demote some functions.

Meanwhile, here is a workaround:

```r
library(flow)
# All functions in the namespace, exported or not
all_funs <- as.list(asNamespace("unglue"))
names(all_funs) <- paste0("unglue:::", names(all_funs))
flow_view_deps(all_funs)
```

```r
library(unglue)
# Only the objects visible once the package is attached (the exports)
all_funs <- as.list(as.environment("package:unglue"))
names(all_funs) <- paste0("unglue:::", names(all_funs))
flow_view_deps(all_funs)
```

Created on 2023-03-16 with reprex v2.0.2

This is indeed useful: the first diagram shows me that I have dead code in the package (the unconnected function substr2<-).

llrs commented 1 year ago

foodweb works for the whole package and works well for this purpose. But I use it more in line with your initial post: to see if some functions share the same "dependency" or similar steps (or that they don't and they should). BTW, thanks for the package! I think it is very cool to explore package code and improve it.

moodymudskipper commented 1 year ago

Consider input from #88

Rmd files should also be considered; we can use knitr::purl() to extract the R code.
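A minimal sketch of that extraction step, assuming knitr is installed. The temporary file and the `add_one` function are made up for illustration; the chunk fence is built with `strrep()` only to avoid literal backticks inside this example.

```r
# Write a tiny throwaway Rmd file (hypothetical content, for illustration only)
fence <- strrep("`", 3)
rmd_path <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: demo",
  "---",
  "",
  "Some prose that should not end up in the R script.",
  "",
  paste0(fence, "{r}"),
  "add_one <- function(x) x + 1",
  fence
), rmd_path)

# documentation = 0 keeps only the code; quiet = TRUE silences progress output
r_path <- tempfile(fileext = ".R")
knitr::purl(rmd_path, output = r_path, documentation = 0, quiet = TRUE)

extracted <- readLines(r_path)
```

The resulting script could then go through whatever handles plain R files, so Rmd inputs would not need a separate code path.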

I think it might make sense to create a virtual package just for the locally defined functions; we can call it {local}. We can actually create it for real and remove it afterwards, so we can reuse the code we already have.

It is unlikely that users of this feature want to see the full dependency diagram of all the dplyr functions they use in their script, so we restrict the diagram to the local virtual package, and the user can then promote other functions.

Let's implement flow_view_package() first, which is just flow_view_deps() on all functions as shown above (maybe just on all exported functions; that's the same by default if there is no dead code, but it makes it easier to demote branches).
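Here is a rough sketch of the collection step such a function might start from. `collect_pkg_funs()` and its `exported_only` argument are hypothetical names, not part of flow; the sketch only builds the named list, in the same shape as the workaround earlier in this thread, and uses the base package tools as a stand-in.

```r
# Hypothetical helper (not part of flow): gather a package's functions as a
# named list whose names carry the package prefix.
collect_pkg_funs <- function(pkg, exported_only = TRUE) {
  ns <- asNamespace(pkg)
  if (exported_only) {
    nms <- getNamespaceExports(ns)
    # getExportedValue() also resolves re-exported objects
    objs <- lapply(nms, function(nm) getExportedValue(pkg, nm))
    op <- "::"
  } else {
    nms <- ls(ns, all.names = TRUE)
    objs <- lapply(nms, function(nm) get(nm, envir = ns, inherits = FALSE))
    op <- ":::"
  }
  names(objs) <- paste0(pkg, op, nms)
  # Drop non-function objects (namespace bookkeeping, data, constants)
  Filter(is.function, objs)
}

funs <- collect_pkg_funs("tools")
all_funs <- collect_pkg_funs("tools", exported_only = FALSE)
# The list could then be passed on as in the workaround above:
# flow::flow_view_deps(funs)
```

Switching `exported_only` is what would make the two variants differ when the package contains dead code.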

Then let's have flow_view_deps() create a {local} package if fed a vector of R/Rmd scripts, and use flow_view_package() on those.
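One possible way to find the locally defined functions, sketched under the assumption that only top-level `name <- function(...)` (or `name = function(...)`) definitions matter. `collect_local_funs()` is a hypothetical helper, and it evaluates only those definitions rather than running the whole script.

```r
# Hypothetical helper: collect top-level function definitions from R scripts
# without running the rest of each script.
collect_local_funs <- function(paths, envir = new.env(parent = globalenv())) {
  assign_ops <- list(as.name("<-"), as.name("="))
  for (path in paths) {
    for (expr in parse(path, keep.source = FALSE)) {
      is_assignment <- is.call(expr) &&
        any(vapply(assign_ops, identical, logical(1), expr[[1]]))
      # Keep only `name <- function(...) ...` at the top level
      if (is_assignment && is.name(expr[[2]]) && is.call(expr[[3]]) &&
          identical(expr[[3]][[1]], as.name("function"))) {
        eval(expr, envir)
      }
    }
  }
  as.list(envir, all.names = TRUE)
}

# Throwaway script for illustration; the stop() call must never be evaluated
script <- tempfile(fileext = ".R")
writeLines(c(
  "helper <- function(x) x * 2",
  "stop('side effect that should not run')",
  "main = function(y) helper(y) + 1"
), script)

local_funs <- collect_local_funs(script)
```

Functions created indirectly (for example through a function factory or `assign()`) would be missed by this approach, which may or may not be acceptable for a {local} package.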

Doesn't seem too hard.

moodymudskipper commented 1 year ago

I think whenever several packages are shown on a diagram the function names should be prefixed => pkg::fun

Based on this we could go one step further for this feature. Each script is a namespace (named after the basename of the file) with an associated "imports" parent containing all functions from attached packages and all functions from other scripts. By default we show functions for all scripts, but it's easy to demote a script, since they're just like packages for flow at that point.
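A simplified sketch of the script-as-namespace idea, with several assumptions: `scripts_as_namespaces()` is a hypothetical helper, the namespace name is the file's basename with the extension dropped, and the "imports" parent is approximated by the global environment (so attached packages are visible, but functions from other scripts are not). Note that `sys.source()` evaluates the whole script, side effects included.

```r
# Hypothetical helper: load each script into its own environment and prefix
# its functions with the script name, as if the script were a package.
scripts_as_namespaces <- function(paths) {
  out <- list()
  for (path in paths) {
    ns_name <- tools::file_path_sans_ext(basename(path))
    env <- new.env(parent = globalenv())
    sys.source(path, envir = env)
    funs <- Filter(is.function, as.list(env, all.names = TRUE))
    if (length(funs) > 0) {
      names(funs) <- paste0(ns_name, "::", names(funs))
    }
    out <- c(out, funs)
  }
  out
}

# Two throwaway scripts for illustration
dir <- tempfile("scripts")
dir.create(dir)
clean_path <- file.path(dir, "clean.R")
report_path <- file.path(dir, "report.R")
writeLines("trim <- function(x) trimws(x)", clean_path)
writeLines("shout <- function(x) toupper(x)", report_path)

script_funs <- scripts_as_namespaces(c(clean_path, report_path))
```

With prefixed names like these, demoting a whole script would amount to dropping every entry that shares its prefix.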

The namespaces/scripts themselves could optionally be shown on the diagram as well (via a new argument to flow_view_deps), as sharp-angled rectangles with a new color, pointing to their functions. But first things first.

idinov commented 1 year ago

Just raising the idea of expanding "flow" to allow dynamic UML diagram construction for entire end-to-end computational protocols stored in electronic notebooks (Rmd), complete R packages, and interlinked R scripts. Also consider using SVG graphics to allow resolution independence and scalability, as these UML diagrams can get really complex (here are a few examples, http://socr.ucla.edu/docs/SOCR_PackageLevel.jpg).

(Antoine - would it make sense to post here the meta-flow outline you drafted/shared earlier?)

Thanks much.

moodymudskipper commented 1 year ago

No, there's no need for it here. It was not really doing the right thing, and I think my strategy above will work better.

moodymudskipper / flow

flow_view_deps to work with paths #138