CancerRegistryOfNorway / nordcansurvival

nordcanstat_survival() call to survival_statistics() argument infile #32

Closed BjarteAAGNES closed 3 years ago

BjarteAAGNES commented 3 years ago

To be solved by Huidong and Bjarte 03.03 13.00:

Intro:

In nordcanstat_survival() the function survival_statistics() is called twice: once for 5-year period aggregation and once for 10-year period aggregation. The infile argument to survival_statistics() SHOULD be explicitly set to the files:

survival_file_analysis.dta_5
survival_file_analysis.dta_10

Further details:

The file names above COULD be changed to:

survival_file_analysis_5.dta
survival_file_analysis_10.dta

The argument cancer_record_dataset_path SHOULD be named infile, cf. the Stata code template at https://github.com/CancerRegistryOfNorway/nordcansurvival/blob/master/R/survival_statistics.R (line 97).

https://github.com/CancerRegistryOfNorway/nordcansurvival/blob/master/R/nordcansurvival.R

lines: 119 ...

# 10-year periods
survival_statistics(
      stata_exe_path =  settings[["stata_exe_path"]],
      cancer_record_dataset_path = settings[["survival_file_analysis_path"]],  # <-- argument in question
      national_population_life_table_path = settings[["national_population_life_table_path"]],
      outfile = "survival_statistics_period_10_dataset",
      estimand = "netsurvival",
      by = c("entity", "sex", "period_10"),
      standstrata = "agegroup_ICSS_3",
      iweight = "weights_ICSS_3"
  )

https://github.com/CancerRegistryOfNorway/nordcansurvival/blob/master/R/survival_statistics.R

lines 67 ...

survival_statistics <- function(
      stata_exe_path = NULL,
      cancer_record_dataset_path,  # <-- argument in question
      national_population_life_table_path,
      outfile = "survival_statistics",
      estimand = "netsurvival",
      by = c("entity", "sex", "period_5"),
      standstrata = "agegroup_ICSS_5",
      iweight = "weights_ICSS_5"
) 

Compare to "template definition"

survival_statistics ,     /// Stata cmd defined in survival_statistics.ado
        infile(\"%s\")        /// NC S dataset (dta)  <-- compare with the R argument name
        outfile(\"%s\")       /// detailed results (dta)
        lifetable(\"%s\")     /// National lifetable file (dta)
        estimand(%s)          /// What to estimate
        country(\"%s\")       ///
        by(%s)                ///
        standstrata(%s)       ///
        iweight(%s)
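
To illustrate how an explicit infile argument would map onto the template, here is a minimal R sketch. The helper name build_stata_call() and the single-line template are assumptions made for illustration only; the real template in survival_statistics.R may be filled differently.

```r
# Hypothetical helper (not part of nordcansurvival): fills a simplified,
# single-line version of the Stata template with an explicit `infile`.
build_stata_call <- function(infile, outfile, lifetable, estimand,
                             country, by, standstrata, iweight) {
  template <- paste(
    "survival_statistics ,",
    "infile(\"%s\")",
    "outfile(\"%s\")",
    "lifetable(\"%s\")",
    "estimand(%s)",
    "country(\"%s\")",
    "by(%s)",
    "standstrata(%s)",
    "iweight(%s)",
    sep = " "
  )
  sprintf(
    template,
    infile, outfile, lifetable, estimand, country,
    paste(by, collapse = " "),  # Stata variable lists are space-separated
    standstrata, iweight
  )
}

# File names follow the renaming suggested in this issue; the lifetable
# file name is a placeholder.
cmd_10 <- build_stata_call(
  infile = "survival_file_analysis_10.dta",
  outfile = "survival_statistics_period_10_dataset",
  lifetable = "NO_2018_lifetable.dta",
  estimand = "netsurvival",
  country = "NORWAY",
  by = c("entity", "sex", "period_10"),
  standstrata = "agegroup_ICSS_3",
  iweight = "weights_ICSS_3"
)
cat(cmd_10, "\n")
```

With the R argument and the Stata option sharing the name infile, it is easier to see which file each call actually reads.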

Thus the "hidden" definition in utils.R SHOULD be made visible and explicit in the above files/functions. Some corrections MUST therefore also be made to https://github.com/CancerRegistryOfNorway/nordcansurvival/blob/master/R/utils.R, where survival_file_analysis_path is currently set at line 127:

survival_file_analysis_path <- normalize_path(
    paste0(survival_work_dir, "/survival_file_analysis.dta")
  )
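
One possible way to make the two file paths explicit is sketched below. The helper survival_file_analysis_paths() is hypothetical, and base R's normalizePath() is used here as a stand-in for the package's own normalize_path(), whose behaviour is not shown in this issue.

```r
# Hypothetical helper: returns one explicit analysis-file path per period
# width instead of a single shared "survival_file_analysis.dta".
# normalizePath() (base R) stands in for the package's normalize_path().
survival_file_analysis_paths <- function(survival_work_dir) {
  widths <- c(5L, 10L)
  paths <- file.path(
    survival_work_dir,
    paste0("survival_file_analysis_", widths, ".dta")
  )
  # mustWork = FALSE: the files need not exist yet when the paths are built
  paths <- normalizePath(paths, winslash = "/", mustWork = FALSE)
  names(paths) <- paste0("period_", widths)
  paths
}

paths <- survival_file_analysis_paths(tempdir())
print(paths)
```

The named vector could then be stored in settings, so that each survival_statistics() call picks its own file by name.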

Example Stata code from the template:

5-year periods using survival_file_analysis_5.dta

survival_statistics ,                                                        ///
  infile(survival_file_analysis_5.dta)                                      ///
  outfile(outfile_survival_file_analysis_5.dta)                             ///
  lifetable(./inst/Stata/demo/NO_2018_lifetable.dta)                         ///
  estimand("netsurvival")                                                    ///
  country(NORWAY)                                                            ///
  by(entity sex period_5)                                                   ///
  standstrata(agegroup_ICSS_5)                                               ///
  iweight(weights_ICSS_5)

10-year periods using survival_file_analysis_10.dta

survival_statistics ,                                                        ///
  infile(survival_file_analysis_10.dta)                                      ///
  outfile(outfile_survival_file_analysis_10.dta)                             ///
  lifetable(./inst/Stata/demo/NO_2018_lifetable.dta)                         ///
  estimand("netsurvival")                                                    ///
  country(NORWAY)                                                            ///
  by(entity sex period_10)                                                   ///
  standstrata(agegroup_ICSS_3)                                               ///
  iweight(weights_ICSS_3)
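
On the R side, the two Stata examples above could correspond to calls along the following lines. This is only a sketch: survival_statistics() is replaced by a stub that echoes its arguments so the code runs without Stata, and infile is the proposed argument name, not the current one.

```r
# Stub standing in for the package's survival_statistics(); it only returns
# its arguments. `infile` is the argument name proposed in this issue.
survival_statistics <- function(infile, outfile, by, standstrata, iweight, ...) {
  list(infile = infile, outfile = outfile, by = by,
       standstrata = standstrata, iweight = iweight)
}

# Per-period settings taken from the two Stata examples:
# 5-year periods use ICSS_5 weights, 10-year periods use ICSS_3 weights.
period_settings <- list(
  list(width = 5L,  icss = 5L),
  list(width = 10L, icss = 3L)
)

calls <- lapply(period_settings, function(p) {
  survival_statistics(
    infile = sprintf("survival_file_analysis_%d.dta", p$width),
    outfile = sprintf("outfile_survival_file_analysis_%d.dta", p$width),
    by = c("entity", "sex", paste0("period_", p$width)),
    standstrata = paste0("agegroup_ICSS_", p$icss),
    iweight = paste0("weights_ICSS_", p$icss)
  )
})

# Each call now names its own input file explicitly.
infiles <- vapply(calls, function(x) x$infile, character(1))
print(infiles)
```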

BjarteAAGNES commented 3 years ago

The nordcansurvival() first argument cancer_record_dataset is the name of the data.table containing the NORDCAN dataset of cancer records after pre-processing. This should not be changed.

A test could be to load a copy of an existing 9.0 environment containing all objects and functions, replace the changed functions, and then run the survival parts following the instructions for users.

I know this will take time… but then the changes are few and should be easy to get right. If you want we can examine the changes together before this test, on Thursday.

BjarteAAGNES commented 3 years ago

@HuidongTian I think this issue can be closed by you with reference to the commit solving the issue.