Closed Aeilert closed 6 months ago
Slack differs quite significantly from Benchmarking on some of the datasets I have available. Using the Electric_plants data frame from deaR, slack differs with the following parameters:
Inputs: Labor, Fuel, Capital. Outputs: Output. RTS: Variable. Orientation: Input.
Some firms differ with RTS set to NIRS, and they differ quite significantly with NDRS.
Interestingly, the pear package gives the same results as the Benchmarking package, so the difference lies somewhere in the implementation in pioneeR.
Here is a reproducible example. The `compute_slack()` function in pioneeR (and in pear) will give the same results as Benchmarking, but the implementation within the app itself is flawed. It is possible to call the slack function with model arguments that differ from the original DEA model. This is what happens in the app, as the default values for CRS and input orientation will always be selected.
```r
df <- deaR::Electric_plants
X <- cbind(df$Labor, df$Fuel, df$Capital)
Y <- as.matrix(df$Output)

vrs_pioneer <- pioneeR::compute_efficiency(X, Y, type = 'vrs', orientation = 'in', digits = NULL)
vrs_pear <- pear::compute_efficiency(X, Y, type = 'vrs', orientation = 'in', digits = NULL)
vrs_benchmarking <- Benchmarking::dea(X, Y, RTS = 'vrs', ORIENTATION = 'in')

slack_pioneer <- pioneeR::compute_slack(X, Y, vrs_pioneer$unadj_values, type = 'vrs', orientation = 'in', digits = NULL)
slack_pear <- pear::compute_slack(X, Y, vrs_pear$unadj_values, type = 'vrs', orientation = 'in', digits = NULL)
slack_benchmarking <- Benchmarking::slack(X, Y, vrs_benchmarking)

# Same results in pioneeR, pear and Benchmarking
slack_pioneer$values
slack_pear$values
slack_benchmarking$sum
```
Reproduce incorrect results in the current PR:
```r
slack_pear2 <- pear::compute_slack(X, Y, vrs_pear$unadj_values, type = 'crs', orientation = 'in', digits = NULL)
# Gives wrong results as returns to scale is CRS
slack_pear2$values
```
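As a quick sanity check, the correct and incorrect slack results can be compared directly (this reuses the objects from the example above, and assumes the return shapes are comparable numeric vectors):

```r
# The VRS call and the mismatched CRS call should not agree
all.equal(slack_pear$values, slack_pear2$values)

# The correct VRS call should match Benchmarking's slack sums
all.equal(unname(slack_pear$values), unname(slack_benchmarking$sum))
```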
@ohjakobsen : The specific bug in the UI should now be fixed.
This fixes the bug, but the implementation is still problematic, and these are exported functions that will be available to the user. I will do a full review and request changes to the implementation.
Sounds good. I agree there is no explicit need to export these functions. I think the simplest solution is to use `@keywords internal` in the documentation, but other options and code changes could also be valid.
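For reference, a minimal roxygen sketch of the `@keywords internal` approach (the documentation text here is illustrative, not taken from the package):

```r
#' Compute slack for a DEA model
#'
#' @param x A matrix of input values
#' @param y A matrix of output values
#' @param model A DEA model object
#'
#' @keywords internal
compute_slack <- function(x, y, model) {
  # ... implementation ...
}
```

With `@keywords internal` the help page is still generated, but the function is hidden from the package index; `@noRd` suppresses the help page entirely.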
Several changes should be done to this PR:
- Reduce the number of exported functions for now. The user needs a function that returns the DEA model, slack and super efficiency (non-exported functions should have `@noRd`)
- The `compute_efficiency()` function should return an S3 object that has a custom print method (with a `digits` argument). This object can be passed to further functions
- `inst/app/R/fct_dea_boostrap.R` must be updated as well (and change the rather embarrassing typo in the filename)
- The code includes a lot of commented code that should be removed before we merge
- Adhere to the style guide for functions, in particular on the use of `return`
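A minimal sketch of the suggested S3 return object with a `digits`-aware print method (the class name `pioneer_dea` and the field names are illustrative, not the final API):

```r
compute_efficiency <- function(x, y, type, orientation) {
  scores <- rep(NA_real_, nrow(x))  # placeholder for the LP solution
  # ... solve the DEA LPs and fill in `scores` ...
  structure(
    list(values = scores, info = list(type = type, orientation = orientation)),
    class = 'pioneer_dea'
  )
}

print.pioneer_dea <- function(x, digits = 4L, ...) {
  cat('DEA model (', x$info$type, ', ', x$info$orientation, '-oriented)\n', sep = '')
  print(round(x$values, digits = digits))
  invisible(x)
}
```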
Hi @ohjakobsen,
Thanks for the review. As discussed offline, I have added several improvements and changes to this PR. The nature of the suggested changes required a more comprehensive approach. The good news is that the end-result is quite a lot better. :)
Important changes include:
I have added a new (exported) `compute_dea()` function. This is based on the one from {pear}, but includes several important changes. It now includes boolean parameters that can be used to specify whether or not to include super eff., scale eff., slack or peers in the computations/output. It also has a new object structure that is hopefully more suitable for a model object (it is sort of a compromise between a "data.frame" and a "model" object). I have added an S3 method for the response to use for printing and model summary of eff. scores.
See below for suggestions on how this can be used in the report.
`compute_efficiency()`, `compute_super_efficiency()`, `compute_slack()` and `get_peers()` have been converted to internal functions (with `@noRd` tags). As mentioned, these were written as internal functions and I always intended them to be so. The reason they were exported in the first commit was to follow the structure of `create_matrix()` and `compute_scale_efficiency()`.
The `digits` parameter in `compute_efficiency()`, `compute_super_efficiency()` and `compute_slack()` has been removed.
`compute_efficiency()`, `compute_super_efficiency()` and `compute_slack()` now return an `info` object as part of their response. This includes information on the model orientation, type and dimensions, and can easily be modified to include additional information if needed. I do, however, not think it is a good idea to include the model inputs (e.g. `x`, `y`). I also don't think there is any need to create a custom S3 object for this, since these are internal functions.
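The `info` element could take a shape like the following (a plain list; the field names below are my illustration of "orientation, type and dimensions", not the actual structure):

```r
# Illustrative only: a plain list carrying model metadata,
# assuming input matrix x and output matrix y are in scope
info <- list(
  type = 'vrs',        # returns to scale
  orientation = 'in',  # model orientation
  n_units = nrow(x),   # number of DMUs
  n_inputs = ncol(x),  # model dimensions
  n_outputs = ncol(y)
)
```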
`compute_efficiency()` has gained a new argument: `values_only = TRUE|FALSE`. This can be used in application code when only the efficiency scores are needed as inputs to other functions. The typical use case here is when efficiency scores are needed to calculate productivity (Malmquist), scale efficiency or bootstrapped values. I don't think the name `values_only` is genius, so other suggestions are welcome. I'm also not sure about the overall need for this parameter, but large(r) return objects can in rare cases reduce performance (proved in testing), and this is an easy way around that issue.
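Hypothetical usage of the new argument (the exact signature is sketched from the description above):

```r
# Skip building the full return object when only the scores are needed,
# e.g. as input to a bootstrap or Malmquist calculation
eff <- compute_efficiency(x, y, type = 'vrs', orientation = 'in', values_only = TRUE)
```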
`compute_efficiency()` and `compute_super_efficiency()` no longer have default arguments for `type` and `orientation`. This will make it harder for developers to accidentally introduce bugs like the one you found for slack.
`compute_slack()` no longer takes `values`, `type` and `orientation` as input parameters. Instead, it relies on the result from `compute_efficiency()`, supplied as a `model` parameter. This is fairly similar to the approach in {Benchmarking}.
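Conceptually, the slack computation now reads RTS and orientation from the model object itself, so they can no longer drift apart from the original DEA run. A sketch of the idea (the internals shown are hypothetical):

```r
compute_slack <- function(x, y, model) {
  # Take the settings from the model object instead of free parameters,
  # so slack is always computed under the same assumptions as the DEA run
  type <- model$info$type
  orientation <- model$info$orientation
  values <- model$unadj_values
  # ... set up and solve the slack LPs using these settings ...
}
```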
`get_peers()` has been modified to rely on the same `model` parameter, and not directly on the `lambda` values.
Unit tests for `compute_efficiency()`, `compute_super_efficiency()` and `compute_slack()` have been rewritten to not rely on "on-the-fly" calculations from {Benchmarking}. Instead, the necessary values are pre-computed and stored as part of `/testdata`. This further reduces our reliance on {Benchmarking}, and makes it easier to remove it as a soft dependency in the future (i.e. remove it from Suggests). The script to create the testing data is included under `/tests/scripts`. This folder has been added to `.Rbuildignore`.
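A pre-computation script of this kind might look as follows (file paths and object names are illustrative):

```r
# tests/scripts/create_testdata.R (illustrative sketch)
df <- deaR::Electric_plants
X <- cbind(df$Labor, df$Fuel, df$Capital)
Y <- as.matrix(df$Output)

model <- Benchmarking::dea(X, Y, RTS = 'vrs', ORIENTATION = 'in')
slack <- Benchmarking::slack(X, Y, model)

# Store reference values so the unit tests no longer need Benchmarking
saveRDS(
  list(eff = model$eff, slack = slack$sum),
  file = 'tests/testthat/testdata/electric_plants_vrs_in.rds'
)
```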
I have expanded the unit tests to include tests for two additional datasets (`hospitals` from {rDEA} and `Electric_plants` from {deaR}) to further ensure that our implementation is consistent with {Benchmarking}. This means we have tests for three well-known and published DEA datasets, as well as our own district courts dataset.
I have added deprecation warnings to `create_matrix()` and `compute_scale_efficiency()`. The first should be removed, and the latter should be re-written in the future to align with either `compute_efficiency()` (as an internal function) or `compute_dea()` (as an exported function). EDIT: Created a new issue for this instead (see #86).
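With {lifecycle} in Imports, such a warning is typically raised like this (the version number and suggested replacement are illustrative). Note that `deprecate_warn()` rate-limits itself, repeating a given warning at most once every 8 hours by default:

```r
create_matrix <- function(...) {
  # Soft-deprecate: lifecycle shows this warning at most once every 8 hours
  lifecycle::deprecate_warn('0.4.0', 'create_matrix()', 'compute_dea()')
  # ... existing implementation ...
}
```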
I have added fixes for a few bugs where user-supplied rounding was not reflected in the UI (i.e. in `server.R`). See #84. This is out of scope for the main issue here, but in general I think it is good practice to fix "broken windows" when spotted.
Note also some dependency changes: {cli} is moved to Imports, as it is used in some of the utility functions; {lifecycle} is added to Imports for deprecation handling; {withr} is added to Suggests.
TO DO: `compute_dea()` needs its own set of unit tests, but let's maybe agree on the functionality and structure first. Suggestion for the report:
```r
# Estimate the DEA model
model <- compute_dea(
  df,
  id = idvar,
  input = inputs,
  output = outputs,
  type = rts,
  orientation = orientation,
  super = TRUE,
  slack = TRUE
)

# Print out the summary of the DEA model
summary_tbl_dea(model)

# Summary statistics for the efficiency scores
summary(model$results$efficiency)

# Get table for scale efficiency
compute_scale_efficiency(inputs, outputs, orientation, digits = 4L)
```
An additional comment on dep. warnings for old functions. I think it is correct to add a dep. warning for `create_matrix()`, since it basically is no longer in use at all (when also removed from the report). But when running `pioneeR::run_pioneer()`, this might look weird.
It's just every 8 hours though, so I think we can live with this.
Actually, we should rethink this. `create_matrix()` was only included as a helper function for 0.3.0. We should not confuse the user with irrelevant warnings that the user cannot do anything about. We should either postpone the deprecation, or make sure that the warning is not shown when the user runs `run_pioneer()`.
Edit: I will remove the deprecation. It cannot be deprecated before we remove it from `server.R`, and it should not be removed from there yet. This PR already does too much, so I will fix it later on.
Updated this PR with the following changes:
- Renamed `type` to `rts` for all DEA and LP functions

Thanks @ohjakobsen
I reviewed the recent changes. They look good!
Add custom functions for efficiency, super efficiency, slack and peer calculations. This removes Benchmarking as a hard dependency.
The commit implies some minor changes in the slack table shown in the Analyse tab, but should otherwise not affect the UI.
Includes:
Closes #80