Closed by devilgate 5 years ago
To reduce risk, I propose we implement this in two stages:
That makes sense.
Tasks, stage one (one primary covariate, multiple additional covariates):
Tasks, stage two (multiple covariates):
I have now done the database changes (in alter_12.sql). Instead of creating a new table I added covariate_type to rif40_inv_covariates/t_rif40_inv_covariates. This has values of 'N' (for normal covariates - the default) or 'A' (for additional). This will hopefully remove the need for changes to the extract code. Tested as back compatible on both PostgreSQL and SQL Server - i.e. you can still run a study with covariates.
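As a rough illustration of how the middleware could mirror the new column, here is a minimal Java sketch. The enum name `InvCovariateRole` and its methods are hypothetical and not part of the RIF code base; only the 'N'/'A' values and the 'N' default come from the comment above. Treating a null or blank value as 'N' is an assumption made to match the database default.

```java
/**
 * Hypothetical mirror of rif40_inv_covariates.covariate_type:
 * 'N' = normal covariate (the default), 'A' = additional covariate.
 */
enum InvCovariateRole {
    NORMAL('N'),
    ADDITIONAL('A');

    private final char code;

    InvCovariateRole(char code) {
        this.code = code;
    }

    /** Single-character code as stored in the database column. */
    char getCode() {
        return code;
    }

    /**
     * Maps a database value to a role. Null or blank is treated as 'N'
     * (assumption: this matches the column default for older rows).
     */
    static InvCovariateRole fromCode(String dbValue) {
        if (dbValue == null || dbValue.trim().isEmpty()) {
            return NORMAL;
        }
        String trimmed = dbValue.trim();
        if (trimmed.length() == 1) {
            char c = Character.toUpperCase(trimmed.charAt(0));
            for (InvCovariateRole role : values()) {
                if (role.code == c) {
                    return role;
                }
            }
        }
        throw new IllegalArgumentException("Unknown covariate_type: " + dbValue);
    }
}
```

With this, `InvCovariateRole.fromCode("A")` yields `ADDITIONAL`, and any value other than 'N' or 'A' fails loudly instead of being silently treated as a normal covariate.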
Nice work, Peter. Brandon has changed the R code so that it extracts multiple covariates, and I've changed the Java to pass multiple names to the R if they're there, so it's all coming together.
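For anyone following along, passing several names rather than one might look roughly like the sketch below. This is not the actual middleware code: the class `CovariateNames`, the method `toRArgument` and the comma delimiter are all assumptions for illustration.

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

/** Hypothetical helper; not part of the RIF middleware. */
final class CovariateNames {

    private CovariateNames() {
    }

    /**
     * Joins covariate names into a single argument string.
     * The comma delimiter is an assumption; an empty or null list
     * gives an empty string, i.e. "no covariates".
     */
    static String toRArgument(List<String> names) {
        if (names == null) {
            return "";
        }
        return names.stream()
                .filter(Objects::nonNull)       // skip missing entries
                .map(String::trim)
                .filter(s -> !s.isEmpty())      // skip blank names
                .collect(Collectors.joining(","));
    }
}
```

A single-covariate study produces the same string as before (just the one name), so under these assumptions the change stays backward compatible.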
Just looking at the code again, and AbstractCovariate has a covariateType property. It's of type CovariateType, which is an enum with the values CONTINUOUS_VARIABLE, BINARY_INTEGER_SCORE, and NTILE_INTEGER_SCORE.
But it seems like that property has no corresponding value in the database. It's worked out at runtime, in CovariateManager's getCovariates method. It just depends on the maximum and minimum values.
Do we even need it? There doesn't seem to be much in the way of functionality that depends on it.
This comes from the covariate definitions in rif40_covariates, as opposed to covariate_type in rif40_inv_covariates, which is what you are working with. The rif40_covariates column is described as: TYPE of covariate (1=integer score/2=continuous variable). Min < max; max/min precision is appropriate to type. Continuous variables are not currently supported. Integer scores can be a binary variable 0/1 or an NTILE, e.g. 1..5 for a quintile.
So it can be removed. rif40_inv_covariates.covariate_type of 'N' must be an integer score until we support quantiles in the extract.
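To make the relationship between the rif40_covariates type and the three enum values concrete, here is a minimal sketch. It is not the logic in CovariateManager: the enum below is a local stand-in with the same value names, `CovariateTypeGuess.classify` is a hypothetical helper, and the rule "0..1 is binary, any other integer score is an NTILE" is an assumption based on the description above.

```java
/** Local stand-in for the middleware enum; same value names, not the real class. */
enum CovariateType {
    CONTINUOUS_VARIABLE,
    BINARY_INTEGER_SCORE,
    NTILE_INTEGER_SCORE
}

/** Hypothetical helper illustrating the classification described above. */
final class CovariateTypeGuess {

    private CovariateTypeGuess() {
    }

    /**
     * @param dbType rif40_covariates type: 1 = integer score, 2 = continuous variable
     * @param min    minimum covariate value
     * @param max    maximum covariate value (must be greater than min)
     */
    static CovariateType classify(int dbType, double min, double max) {
        if (!(min < max)) {
            throw new IllegalArgumentException("min must be less than max");
        }
        if (dbType == 2) {
            return CovariateType.CONTINUOUS_VARIABLE;
        }
        if (dbType != 1) {
            throw new IllegalArgumentException("Unknown covariate type: " + dbType);
        }
        // Assumption: an integer score with range 0..1 is binary,
        // anything else (e.g. 1..5 for a quintile) is an NTILE.
        if (min == 0.0 && max == 1.0) {
            return CovariateType.BINARY_INTEGER_SCORE;
        }
        return CovariateType.NTILE_INTEGER_SCORE;
    }
}
```

Classifying from min and max alone would not be enough: a continuous covariate whose limits happen to be whole numbers would look like an NTILE, which is why this sketch takes the database type as an input.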
Submit, save and multiple/additional covariate selection working OK; data being transferred to middleware, which is only processing the first covariate:
"investigations": {"investigation": [{
    "years_per_interval": 1,
    "additionals": [{"additional_covariate": {
        "covariate_type": "CONTINUOUS_VARIABLE",
        "minimum_value": "358.0",
        "name": "NEAR_DIST",
        "description": "near distance covariate",
        "maximum_value": "78787.0"
    }}],
    ...
    "covariates": [
        {"adjustable_covariate": {
            "covariate_type": "INTEGER_SCORE",
            "minimum_value": "0.0",
            "name": "AREATRI1KM",
            "description": "area tri 1 km covariate",
            "maximum_value": "1.0"
        }},
        {"adjustable_covariate": {
            "covariate_type": "INTEGER_SCORE",
            "minimum_value": "1.0",
            "name": "SES",
            "description": "socio-economic status",
            "maximum_value": "5.0"
        }}
    ],
Table data:
1> select * from rif40.rif40_inv_covariates where study_id = 199;
2> go
username study_id inv_id covariate_name covariate_type min   max   geography study_geolevel_name
-------- -------- ------ -------------- -------------- ----- ----- --------- -------------------
peter    199      183    SES            N              1.000 5.000 SAHSULAND SAHSU_GRD_LEVEL4
(1 rows affected)
This is now OK for merging. I have fixed:
"additionals": [{"additional_covariate": {
    "covariate_type": "CONTINUOUS_VARIABLE",
    "minimum_value": "358.0",
    "name": "NEAR_DIST",
    "description": "near distance covariate",
    "maximum_value": "78787.0"
}}],
11:30:19.763 [http-nio-8080-exec-3] WARN org.geotools.map.FeatureLayer org.geotools.map: Bounds crs not defined; assuming bounds from schema are correct for CollectionFeatureSource:org.geotools.feature.DefaultFeatureCollection@8663bc7
Paul suggested either six or ten, as opposed to just the one we have at present (as well as age and sex). Though as soon as we go over one, why have a limit?
This will require at least the following: