schuemie commented 4 years ago

Tried to make function naming more consistent. Deleted some functions that didn't adhere to new design. Re-implemented some functions using newer functions. Making sure all functions return tibbles instead of data frames. Fixed some functions that weren't working.

gowthamrao commented 4 years ago

Hi @schuemie I tested the package.

To help with testing I made changes to TestCode.R file. During testing, I found a few issues (unrelated to this PR) - i have fixed some, added others to backlog.

Recommendation: I think results object should be returned as data-frame where possible. One of the changes you made (removing .flattenTree - returns a complex nested list of results. This is difficult to handle compared to data frame.) I think we should rework this part to incorporate a solution like .flattenTree that take such a complex nest and returns a more user friendly table format as described here

codecov-io commented 4 years ago

Codecov Report

Merging #95 into develop will increase coverage by 0.34%. The diff coverage is 8.86%.

@@            Coverage Diff             @@
##           develop     #95      +/-   ##
==========================================
+ Coverage     4.37%   4.72%   +0.34%     
==========================================
  Files            8       8              
  Lines          914     826      -88     
==========================================
- Hits            40      39       -1     
+ Misses         874     787      -87

Impacted Files	Coverage Δ
R/CohortCharacterization.R	`0.00% <0.00%> (ø)`
R/CohortDefinition.R	`0.00% <0.00%> (ø)`
R/Concept.R	`0.00% <0.00%> (ø)`
R/ConceptSet.R	`0.00% <0.00%> (ø)`
R/Estimation.R	`0.00% <0.00%> (ø)`
R/IncidenceRate.R	`0.00% <ø> (ø)`
R/PersonProfile.R	`0.00% <ø> (ø)`
R/WebApi.R	`25.82% <87.50%> (-0.49%)`	:arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 3e49791...e651676. Read the comment docs.

schuemie commented 4 years ago

Thanks @gowthamrao !

Like I said: the flattenTree code was just not working at all, throwing an error message. I tried to get it running, but couldn't figure it out.

I'm still on the fence on what to do with the objects pulled from WebAPI. In mind I make a distinction between 'assets' and 'status information':

An asset is something created in ATLAS, like a cohort definition, an estimation definition, etc., that has a specified structure. That structure is represented inside of WebAPI as a Java object. To transmit it, it is represented as a JSON object. In R we represent that same asset as an R object, still following the originally specified structure.
Status information is basically meta-information, like the status of a run, or the list of assets available.

The assets are things we often want to preserve, and keep in the same structure, so we can convert them back from R to JSON back to the original Java objects. Therefore, we shouldn't apply all sorts of transformations to these objects, like flatten a tree structure. We could add a function that we can point at such an object to produce a pretty representation, but we'd have to keep the original without modification.

Because status information is transient by definition, there's no need to preserve as is, and it can be transformed to some pretty format (preferably a tibble) right away.

The reasons we don't want to maintain both the JSON and R representation of the same asset are:

Bad to have duplication.
How could I make changes to the object in R? Suppose I want to fetch a cohort definition, programmatically make some modificaction, and then upload it to WebAPI again? I don't want to manually edit the JSON string, but instead I'd want to modify the R object, and convert that to JSON.

schuemie commented 4 years ago

I guess the results objects are a bit in-between. Maybe they should just be converted to nicely formatted R objects, since we would want to upload results back to WebAPI anyway.

gowthamrao commented 4 years ago

An asset is something created in ATLAS, like a cohort definition, an estimation definition, etc., that has a specified structure. That structure is represented inside of WebAPI as a Java object. To transmit it, it is represented as a JSON object. In R we represent that same asset as an R object, still following the originally specified structure.

I like this description a lot @schuemie . I don't think results of an execution fit either of these categories. Example of results: get counts of persons who met each rule in cohort definition.

results are not Assets. The results are sourced by the WebApi from the underlying database that is storing the results. I don't think results are 'created in Atlas`.
I don't think results are Status either - because it is not transient meta information. results are persisted in db.

Status information is basically meta-information, like the status of a run, or the list of assets available.

I think results belong to their own category, and are the end product of the work. We could use assets and 'status' to make sense of the results. Since WebApi is just a transport medium from underlying db to end-user for results - we don't need to preserve the structure of the results. We also don't need to post results into WebApi. So - i don't think we

want to preserve, and keep in the same structure, so we can convert them back from R to JSON back to the original Java objects.

I propose, we deliver all results in table format (i.e. dataframe). For results we only maintain the R representation (of-coures, the db will have the source of truth and we wont modify those). Thus for results

we don't want to maintain both the JSON and R representation

OHDSI / ROhdsiWebApi

Making naming more consistent. Refactoring #95

Codecov Report