Closed Sweetdevil144 closed 2 months ago
What you're suggesting already exists: https://github.com/PecanProject/pecan/blob/develop/docker/depends/pecan_package_dependencies.csv
Thanks. I was aware of scripts/generate_dependencies.R, the script that generates this file. What I proposed, though, was a script that lists which PEcAn packages, and which of their functions, are used internally by other PEcAn packages. I now realise that this would just be a subset of what the original generate_dependencies.R script already does. Thanks for the correction. Below are links to my .R script and the generated .csv file for review:
https://github.com/Sweetdevil144/module-dependencies/blob/main/pecan_dependencies.csv
https://github.com/Sweetdevil144/module-dependencies/blob/main/find_package_utilizations.R
Another point I wanted to add: my custom pecan_dependencies.csv also records which functions are used from each imported package, which makes it easier to plan how to optimize our packages. The .csv may still need a lot more refinement, for example removing common imports such as PEcAn.logger, which is used for logging. Another candidate for removal may be PEcAn.db.
Being able to see which functions are called from which package does sound like a useful feature, though I have to say I’m much more often looking for all the functions a package calls from one particular dependency than for all functions from all its dependencies. If this can support that use case while improving on the ergonomics of my default grep pkgname -R dirname, it could become a tool I reach for regularly.
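The single-dependency use case described above could be sketched as a thin wrapper around grep. This is only an illustration, not part of PEcAn or of the proposed script: calls_from is a hypothetical helper name, and the sketch assumes sources live in *.R files and that calls are written as pkg::fun or pkg:::fun.

```shell
#!/bin/sh
# calls_from PKG DIR
# Hypothetical helper (not part of PEcAn): list each function that code under
# DIR calls as PKG::fun or PKG:::fun, followed by the number of call sites.
calls_from() {
  pkg=$1
  dir=$2
  # Escape dots so that "PEcAn.logger" cannot match "PEcAnXlogger".
  pkg_re=$(printf '%s' "$pkg" | sed 's/\./\\./g')
  # The leading group keeps "utils" from matching inside "PEcAn.utils".
  # sed then strips everything up to the colons, leaving the function name,
  # and awk prints "function count" for each distinct function.
  grep -rhoE --include='*.R' \
    "(^|[^A-Za-z0-9._])${pkg_re}:::?[A-Za-z_.][A-Za-z0-9_.]*" "$dir" |
    sed -E 's/^.*:::?//' |
    LC_ALL=C sort | uniq -c |
    awk '{ print $2, $1 }'
}

# Example: calls_from PEcAn.logger path/to/package
```

Like plain grep, this also counts matches inside comments and strings, and it misses functions that are imported and then called without a namespace prefix.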
A few other limitations I see in the current implementation:
- It only matches package names that start with PEcAn followed by a literal dot, but some packages like PEcAnAssimSequential do not have a dot in their name.
- [...] tests/ directory (which often does contain unique dependencies) but not, say, inst/ (which I sometimes do and sometimes don’t care about; in many packages we use it to store outdated versions of scripts to be updated later).
- It only finds functions called with ::, which is by far the most common form, but there are legitimate cases where we import functions into the package and call them without namespaces instead.
Overall I doubt I’d use it in its current form, but if it helps you, don’t let me stop you from using it! If you want to spend more time on it as a learning tool, I recommend thinking through how it could find all the functions from one arbitrary package.
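For the point above about functions that are imported into a package and then called without a namespace, one possible starting place is the package's NAMESPACE file. The sketch below is an assumption-laden illustration: imported_from is a hypothetical helper, and it assumes one function per importFrom() directive, which is how roxygen2 writes them. Hand-written directives that list several functions, and whole-package import() directives, are not handled.

```shell
#!/bin/sh
# imported_from PKG NAMESPACE_FILE
# Hypothetical helper: print the functions that a package imports from PKG
# through importFrom(PKG,fun) directives in its NAMESPACE file.
# Assumes one function per directive; import(PKG) lines are ignored, even
# though they make every exported function of PKG callable unqualified.
imported_from() {
  pkg=$1
  namespace_file=$2
  # Escape dots so the package name is matched literally.
  pkg_re=$(printf '%s' "$pkg" | sed 's/\./\\./g')
  # Optional quotes cover entries such as importFrom(magrittr,"%>%").
  sed -nE "s/^importFrom\(\"?${pkg_re}\"?, *\"?([^\")]+)\"?\)\$/\1/p" \
    "$namespace_file" |
    LC_ALL=C sort -u
}

# Example: imported_from PEcAn.logger path/to/package/NAMESPACE
```

Combining this list with a search for pkg:: calls would give a fuller picture of what a package uses from one dependency.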
A higher-level comment: Knowing what functions we use from where is a great strategy for debugging and for planning refactoring, but I’m less sure it’s necessary to automate it the way this issue proposes. The times I’d use this script would be manual invocations while asking a focused question like “Ugh, [dependency] is causing installation problems, which functionality in [package] do we import it for? What would break if I remove it?” That’s usually easier to answer by searching for [dependency] on the fly than by looking it up in a big list of all the called functions.
Description
Is your feature request related to a problem? Please describe.
Identifying which PEcAn sub-packages are used across our project is currently manual, error-prone, and inefficient. This becomes especially significant given the goals of the GSoC project "Optimize PEcAn for freestanding use of single packages."
Proposed Solution
Describe the solution you'd like
I've developed a script that automatically scans R scripts in the PEcAn project, identifies usage of PEcAn sub-packages, and outputs a CSV file listing dependencies. This solution simplifies tracking dependencies, aiding in optimization and modularization efforts.
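For readers who want a feel for the approach, the scan-and-list idea can be approximated in a few lines of shell. This is not the linked find_package_utilizations.R script, only a rough sketch under stated assumptions: pecan_usage_csv is a hypothetical name, the column layout is invented for illustration, only *.R files are scanned, and only namespaced calls (pkg::fun or pkg:::fun) are detected.

```shell
#!/bin/sh
# pecan_usage_csv DIR
# Hypothetical helper: write CSV rows (file,package,function) for every
# namespaced call to a package whose name starts with "PEcAn" under DIR.
# The dot after "PEcAn" is optional, so names without one are matched too.
# Assumes file paths contain no commas; matches in comments are included.
pecan_usage_csv() {
  dir=$1
  echo "file,package,function"
  # grep -H prefixes each match with "path:", which sed turns into columns.
  grep -rHoE --include='*.R' \
    'PEcAn[A-Za-z0-9.]*:::?[A-Za-z_.][A-Za-z0-9_.]*' "$dir" |
    sed -E 's/^(.*):(PEcAn[A-Za-z0-9.]*):::?([^:]+)$/\1,\2,\3/' |
    LC_ALL=C sort -u
}

# Example: pecan_usage_csv path/to/pecan > pecan_usage.csv
```

Each distinct file, package, and function combination appears once; repeated calls in the same file are collapsed by sort -u.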