CancerRegistryOfNorway / nordcanpreprocessing

Likely nordcan_processed_cancer_death_count_dataset bug #2

Closed WetRobot closed 3 years ago

WetRobot commented 3 years ago

Can't create the processed cancer death count dataset with

cancer_death_count_dataset <- nordcanpreprocessing::nordcan_processed_cancer_death_count_dataset(
  unprocessed_cancer_death_count_dataset
)

when using the Swedish unprocessed_cancer_death_count_dataset. Subsetting the input to icd_version == 10 works, however, so this is likely a bug in the code.

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server 2012 R2 x64 (build 9600)

Matrix products: default

locale:
[1] LC_COLLATE=Finnish_Finland.1252  LC_CTYPE=Finnish_Finland.1252    LC_MONETARY=Finnish_Finland.1252 LC_NUMERIC=C                     LC_TIME=Finnish_Finland.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] nordcanepistats_0.3.1      zoo_1.8-8                  remotes_2.2.0              popEpi_0.4.8               iarccrgtools_0.2.17        splines_4.0.2              etm_1.1.1                 
 [8] lattice_0.20-41            testthat_2.3.2             usethis_1.6.1              mgcv_1.8-33                survival_3.2-3             Epi_2.41                   rlang_0.4.7               
[15] pkgbuild_1.1.0             nordcancore_0.3.0          glue_1.4.2                 withr_2.2.0                DBI_1.1.0                  nordcanpreprocessing_0.3.0 sessioninfo_1.1.1         
[22] nordcansurvival_0.3.0      easyassertions_0.2.1       plyr_1.8.6                 stringr_1.4.0              dbc_0.2.26                 devtools_2.3.1             fcrnordcan_2.8.1.1        
[29] memoise_1.1.0              callr_3.4.4                ps_1.3.4                   parallel_4.0.2             fansi_0.4.1                Rcpp_1.0.5                 backports_1.1.9           
[36] desc_1.2.0                 fcrcd_2.8.1.1              pkgload_1.1.0              cmprsk_2.2-10              fs_1.5.0                   digest_0.6.25              stringi_1.5.3             
[43] processx_3.4.4             rprojroot_1.3-2            numDeriv_2016.8-1.1        grid_4.0.2                 fcrcore_0.3.6              RPostgreSQL_0.6-2          cli_2.0.2                 
[50] basicepistats_0.1.12       tools_4.0.2                magrittr_1.5               fcrdb_0.4.15               crayon_1.3.4               MASS_7.3-53                ellipsis_0.3.1            
[57] Matrix_1.2-18              fcrassert_0.3.1            data.table_1.13.0          prettyunits_1.1.1          assertthat_0.2.1           rstudioapi_0.11            R6_2.4.1                  
[64] nlme_3.1-149               compiler_4.0.2            

WetRobot commented 3 years ago

This is caused by duplicate icd_version and icd_code combinations in the table returned by nordcancore::nordcan_metadata_icd_by_version_to_entity(): the same code points to multiple entity numbers. Perhaps the table is not being used correctly, or some additional information is missing.

For example:

  conversion_dt <- nordcancore::nordcan_metadata_icd_by_version_to_entity()
  conversion_dt[icd_code == "1400"]
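
To list every affected key rather than checking one code at a time, something like the following might help. This is only a sketch on a toy table: the column name `entity`, the entity numbers, and the death counts are assumptions made up for illustration, not the contents of the real nordcancore table. It counts rows per (icd_version, icd_code) pair, pulls out the conflicting rows, and shows how a merge on such a key returns more rows than the input had.

```r
library(data.table)

# Toy stand-in for the conversion table. The column name "entity" and all
# values are hypothetical; the real table may look different.
conversion_dt <- data.table(
  icd_version = c(9L, 9L, 10L, 10L),
  icd_code    = c("1400", "1400", "C000", "C001"),
  entity      = c(10L, 20L, 10L, 10L)
)

# Count rows per (icd_version, icd_code) and keep keys that occur more than once.
key_cols <- c("icd_version", "icd_code")
dup_keys <- conversion_dt[, .(n_rows = .N), by = key_cols][n_rows > 1L]
print(dup_keys)

# All rows of the conversion table that share a duplicated key.
conflicts <- conversion_dt[dup_keys, on = key_cols]
print(conflicts)

# Hypothetical death counts: merging on a duplicated key multiplies rows.
deaths <- data.table(
  icd_version = c(9L, 10L),
  icd_code    = c("1400", "C000"),
  death_count = c(5L, 7L)
)
merged <- merge(deaths, conversion_dt, by = key_cols, all.x = TRUE)
stopifnot(nrow(merged) > nrow(deaths))
```

In the toy data only the ICD-9 rows are duplicated, which would be consistent with the icd_version == 10 subset working, but whether the real table follows the same pattern is not confirmed here.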

CotterpinDoozer commented 3 years ago

This is probably an error in the lookup table I provided you with, so it should probably be corrected there? If so, I will take a look at it.

the-r-man commented 3 years ago

I have been looking at this for a short while but have not found anything yet.

WetRobot commented 3 years ago

Resolved in https://github.com/CancerRegistryOfNorway/nordcancore/commit/43d2c860393fc431b14d7fdda723267683173df1