insightsengineering / teal.data

Data model for teal applications
https://insightsengineering.github.io/teal.data/
Other
8 stars 7 forks source link

1088 allows `get_code_dependency` to detect usage of objects in functions on LHS #289

Closed m7pr closed 7 months ago

m7pr commented 7 months ago

Closes https://github.com/insightsengineering/teal/issues/1088

The issue that we tried to fix was the inability to extract lines such as fun(object) <- (object used in a function on lhs) in getting code for the object.

Below resulted in

code <- '
  data(iris)
  names(iris) <- letters[1:5]
'
tdata <- teal_data(code = code)
cat(get_code(tdata, datanames = 'iris'))
#> warning('Code was not verified for reproducibility.')
#> data(iris)

which now gives

code <- '
  data(iris)
  names(iris) <- letters[1:5]
'
tdata <- teal_data(code = code)
cat(get_code(tdata, datanames = 'iris'))
#> warning('Code was not verified for reproducibility.')
#> data(iris)
#> names(iris) <- letters[1:5]
github-actions[bot] commented 7 months ago

badge

Code Coverage Summary

Filename                         Stmts    Miss  Cover    Missing
-----------------------------  -------  ------  -------  --------------------
R/cdisc_data.R                       1       0  100.00%
R/deprecated.R                      57      57  0.00%    19-344
R/dummy_function.R                   2       2  0.00%    14-15
R/formatters_var_labels.R           36      11  69.44%   60, 69-80
R/join_key.R                        38       0  100.00%
R/join_keys-c.R                     12       0  100.00%
R/join_keys-extract.R              128       0  100.00%
R/join_keys-names.R                 15       0  100.00%
R/join_keys-parents.R               30       0  100.00%
R/join_keys-print.R                 45       0  100.00%
R/join_keys-utils.R                 73       3  95.89%   35-38
R/join_keys.R                       21       0  100.00%
R/teal_data-class.R                 25       1  96.00%   69
R/teal_data-datanames.R             10       0  100.00%
R/teal_data-get_code.R              14       0  100.00%
R/teal_data-show.R                   4       4  0.00%    14-19
R/teal_data.R                       30      16  46.67%   34, 37-43, 53-59, 62
R/testhat-helpers.R                 26       0  100.00%
R/topological_sort.R                32       0  100.00%
R/utils-get_code_dependency.R      184       1  99.46%   275
R/verify.R                          42      11  73.81%   63, 93-97, 100-104
TOTAL                              825     106  87.15%

Diff against main

Filename                         Stmts    Miss  Cover
-----------------------------  -------  ------  -------
R/utils-get_code_dependency.R      +14       0  +0.04%
TOTAL                              +14       0  +0.22%

Results for commit: fd4d841711c1592aa0e0c402491c34f485f7d06e

Minimum allowed coverage is 80%

:recycle: This comment has been updated with latest results

github-actions[bot] commented 7 months ago

Unit Test Performance Difference

Additional test case details | Test Suite | $Status$ | Time on `main` | $±Time$ | Test Case | |:-----|:----:|:----:|:----:|:-----| | get_code | 👶 | | $+0.02$ | does_not_break_if_object_is_used_in_a_function_on_lhs | | get_code | 👶 | | $+0.01$ | does_not_break_if_object_is_used_in_a_function_on_lhs_and_influencers_are_both_on_lhs_and_rhs |

Results for commit 5cea288169c707c126235113e2f5559dbb59c6ce

♻️ This comment has been updated with latest results.

github-actions[bot] commented 7 months ago

Unit Tests Summary

  1 files   14 suites   1s :stopwatch: 177 tests 175 :white_check_mark: 2 :zzz: 0 :x: 247 runs  245 :white_check_mark: 2 :zzz: 0 :x:

Results for commit fd4d8417.

:recycle: This comment has been updated with latest results.

m7pr commented 7 months ago

@chlebowa

 lapply(
    calls_pd,
    function(call_pd) {
      # Handle data(object)
[...]

Totally this part grown, piece by piece, with every new PR! At the beginning (when we did not cover assign()/data() functions) where we did not cover all edge cases this was handle-able :P but now I see this could be divided into smaller pieces, in another PR that focuses on cleaning this PR. https://github.com/insightsengineering/teal.data/issues/292

m7pr commented 7 months ago

Hey @chlebowa hey @donyunardi because of the timing (end of day in Poland) and the need for a release this fix on CRAN I will ask @donyunardi just to make soft test (results of this function, and testing apps where teal.data is installed from this branch). Once soft tests are fine, please merge @donyunardi and submit a release on CRAN.

For the implementation details I created a new card https://github.com/insightsengineering/teal.data/issues/292 where we will clean-up implementaion.

donyunardi commented 7 months ago

For ADSL, I can see the teal.data::col_labels(ADSL)[c(names(adsl_labels))] <- adsl_labels line but it's still missing the adsl_labels <- teal.data::col_labels(ADSL, fill = FALSE) line.

image

So I still get error: image

donyunardi commented 7 months ago

Here's a simpler code:

code <- '
  ADSL <- synthetic_cdisc_data("latest")$adsl
  adsl_labels <- teal.data::col_labels(ADSL, fill = FALSE)  
  teal.data::col_labels(ADSL)[c(names(adsl_labels))] <- adsl_labels
'
tdata <- teal_data(code = code)
cat(get_code(tdata, datanames = 'ADSL'))

image

donyunardi commented 7 months ago

Could the issue lies in graph_parser specifically this line? https://github.com/insightsengineering/teal.data/blob/035a65c9d70e56ce4b00e525f9e17c05e565e3ca/R/utils-get_code_dependency.R#L378-L379

I think this line is checking the left hand side of <- to see if there's any dataname (i.e. ADSL) being mentioned. I guess for our case, the ADSL is mentioned on the right hand side. adsl_labels <- teal.data::col_labels(ADSL, fill = FALSE)

m7pr commented 7 months ago

Thanks for checking @donyunardi . Will have one more look : )

m7pr commented 7 months ago

Hey @donyunardi can you review once more?

I tested on the limited code, but simplified to the below symbols :)

> code <- '
+   x <- 5
+   y <- length(x)  
+   names(x)[y] <- y
+ '
> tdata <- teal_data(code = code)
> cat(get_code(tdata, datanames = 'x'))
warning('Code was not verified for reproducibility.')
x <- 5
y <- length(x)
names(x)[y] <- y
donyunardi commented 7 months ago

@m7pr I'm going to merge this so I can redeploy the example apps dev branch in teal.gallery. Hope that's okay!

m7pr commented 7 months ago

@m7pr I'm going to merge this so I can redeploy the example apps dev branch in teal.gallery. Hope that's okay!

@donyunardi this is the exact motivation why we need a testing environment to test things on feature branches and not on main branch. github.com/insightsengineering/coredev-tasks/issues/505

donyunardi commented 7 months ago

I agree. Let's continue the discussion on this topic.