radivot / SEERaBomb

This R package contains code that sets up SEER and A-bomb survivor data for use with R.
GNU General Public License v2.0

Error using mkSEER function #4

Closed by aqdaisat 4 years ago

aqdaisat commented 4 years ago

When I run the mkSEER function, I get:

Error in cut.default(race, labels = c("White", "Black", "Other"), breaks = c(1, : 'x' must be numeric
In addition: Warning message:
Unknown or uninitialised column: 'reg'.

I'm using the newest 1975-2016 SEER Research Data (November 2018 Submission). Could that be the problem?
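For readers unfamiliar with this error: `cut()` raises "'x' must be numeric" whenever its first argument is not numeric, and that includes `NULL`, which is what `$` returns for a column that does not exist. The warning about the column 'reg' suggests the expected columns were missing from the data that was read. The base R sketch below reproduces the message under that assumption. It is not SEERaBomb's code; the data frame `d` and the `breaks` values are hypothetical, since the original call is truncated in the error text.

```r
# Hypothetical stand-in for a data frame that lacks the expected columns
d <- data.frame(x = 1:3)

# `$` on a missing column returns NULL (a tibble additionally warns
# "Unknown or uninitialised column")
race <- d$race

# cut() checks is.numeric(x) first, so NULL triggers the same error message
res <- tryCatch(
  cut(race, breaks = c(1, 2, 3, 100), labels = c("White", "Black", "Other")),
  error = function(e) conditionMessage(e)
)
print(res)
```

If this is what happened, the thing to check is why the columns are missing (for example, whether the data files were found and read at all), not the `cut()` call itself.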

radivot commented 4 years ago

Do you have the latest SEERaBomb code from GitHub? Does your data include chemo and radiation?

aqdaisat commented 4 years ago

Thank you for your prompt response. Yes, I have the latest SEERaBomb from GitHub, and my data includes the newest treatment data (chemo and radiation).

radivot commented 4 years ago

If you used something other than this code chunk, please share it and the output with me:

library(SEERaBomb)     #loads installed package SEERaBomb into memory
(df=getFields())       #gets SEER fields into a data frame
(rdf=pickFields(df))   #picks a subset of SEER fields and defines their types
mkSEER(rdf)            #makes merged data file ~/data/SEER/mrgd/cancDef.Rdata

On my machine this chunk works fine, giving me:

R version 3.6.1 (2019-07-05) -- "Action of the Toes"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin15.6.0 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R.

library(SEERaBomb) #loads installed package SEERaBomb into memory
Loading required package: dplyr

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

filter, lag

The following objects are masked from ‘package:base’:

intersect, setdiff, setequal, union

Loading required package: ggplot2
Loading required package: rgl
Loading required package: demography
Loading required package: forecast
Registered S3 method overwritten by 'xts':
  method     from
  as.zoo.xts zoo
Registered S3 method overwritten by 'quantmod':
  method            from
  as.zoo.data.frame zoo
Registered S3 methods overwritten by 'forecast':
  method             from
  fitted.fracdiff    fracdiff
  residuals.fracdiff fracdiff
This is demography 1.22

(df=getFields()) #gets SEER fields into a data frame
    start width  sasnames           names            desc
  1     1     8  PUBCSNUM           casenum          Patient ID
  2     9    10  REG                reg              SEER registry
  3    19     1  MAR_STAT           marstat          Marital status at diagnosis
  4    20     2  RACE1V             race             Race/ethnicity
  5    23     1  NHIADE             nhiade           NHIA Derived Hisp Origin
  6    24     1  SEX                sex              Sex
  7    25     3  AGE_DX             agedx            Age at diagnosis
  8    28     4  YR_BRTH            yrbrth           Year of birth
  9    35     2  SEQ_NUM            seqnum           Sequence number
 10    37     2  MDXRECMP           modx             Month of diagnosis
 11    39     4  YEAR_DX            yrdx             Year of diagnosis
 12    43     4  PRIMSITE           primsite         Primary site ICD-O-2 (1973+)
 13    47     1  LATERAL            lateral          Laterality
 14    48     4  HISTO2V            histo2           Histologic Type ICD-O-2
 15    52     1  BEHO2V             beho2            Behavior Code ICD-O-2
 16    53     4  HISTO3V            histo3           Histologic Type ICD-O-3
 17    57     1  BEHO3V             beho3            Behavior code ICD-O-3
 18    58     1  GRADE              grade            Grade
 19    59     1  DX_CONF            dxconf           Diagnostic confirmation
 20    60     1  REPT_SRC           reptsrc          Type of reporting source
 21    61     3  EOD10_SZ           eod10sz          EOD 10 - size (1988+)
 22    64     2  EOD10_EX           eod10ex          EOD 10 - extension
 23    66     2  EOD10_PE           eod10pe          EOD 10 - path extension
 24    68     1  EOD10_ND           eod10nd          EOD 10 - lymph node
 25    69     2  EOD10_PN           eod10pn          EOD 10 - positive lymph nodes examined
 26    71     2  EOD10_NE           eod10ne          EOD 10 - number of lymph nodes examined
 27    73    13  EOD13              eod13            EOD--old 13 digit
 28    86     2  EOD2               eod2             EOD--old 2 digit
 29    88     4  EOD4               eod4             EOD--old 4 digit
 30    92     1  EOD_CODE           eodcode          Coding system for EOD
 31    93     1  TUMOR_1V           tumor1           Tumor marker 1
 32    94     1  TUMOR_2V           tumor2           Tumor marker 2
 33    95     1  TUMOR_3V           tumor3           Tumor marker 3
 34    96     3  CSTUMSIZ           cstumsiz         Collaborative Stage (CS) Tumor Size
 35    99     3  CSEXTEN            csexten          CS Extension
 36   102     3  CSLYMPHN           cslymphn         CS Lymph Nodes
 37   105     2  CSMETSDX           csmetsdx         CS Mets at DX
 38   107     3  CS1SITE            cs1site          CS Site-Specific Factor 1
 39   110     3  CS2SITE            cs2site          CS Site-Specific Factor 2
 40   113     3  CS3SITE            cs3site          CS Site-Specific Factor 3
 41   116     3  CS4SITE            cs4site          CS Site-Specific Factor 4
 42   119     3  CS5SITE            cs5site          CS Site-Specific Factor 5
 43   122     3  CS6SITE            cs6site          CS Site-Specific Factor 6
 44   125     3  CS25SITE           cs25site         CS Site-Specific Factor 25
 45   128     2  DAJCCT             dajcct           Derived AJCC T
 46   130     2  DAJCCN             dajccn           Derived AJCC N
 47   132     2  DAJCCM             dajccm           Derived AJCC M
 48   134     2  DAJCCSTG           dajccstg         Derived AJCC Stage Group
 49   136     1  DSS1977S           dss1977s         Derived SS1977
 50   137     1  SCSSM2KO           scssm2ko         SEER Combined Summary Stage 2000 (2004+)
 51   138     1  DAJCCFL            dajccfl          Derived AJCC - flag
 52   141     6  CSVFIRST           csvfirst         CS Version Input Original
 53   147     6  CSVLATES           csvlates         CS Version Derived
 54   153     6  CSVCURRENT         csvcurrent       CS Version Input Current
 55   159     2  SURGPRIF           surgprif         RX Summ--surg prim site
 56   161     1  SURGSCOF           surgscof         RX Summ--scope reg LN sur 2003+
 57   162     1  SURGSITF           surgsitf         RX Summ--surg oth reg/dis
 58   163     2  NUMNODES           numnodes         Number of lymph nodes
 59   166     1  NO_SURG            nosurg           Reason no cancer-directed surgery
 60   170     2  SS_SURG            sssurg           Site specific surgery (1983-1997)
 61   174     1  SURGSCOP           surgscop         Scope of lymph node surgery 98-02
 62   175     1  SURGSITE           surgsite         Surgery to other sites
 63   176     2  REC_NO             recno            Record number
 64   191     1  TYPE_FU            typefu           Type of followup expected
 65   192     2  AGE_1REC           agerec           Age recode <1 year olds
 66   199     5  SITERWHO           siterwho         Site recode ICD-O-3/WHO 2008
 67   204     4  ICDOTO9V           ICD9             Recode ICD-O-2 to 9
 68   208     4  ICDOT10V           ICD10            Recode ICD-O-2 to 10
 69   218     3  ICCC3WHO           iccc3who         ICCC site recode ICD-O-3/WHO 2008
 70   221     3  ICCC3XWHO          iccc3xwho        ICCC site rec extended ICD-O-3/ WHO 2008
 71   224     1  BEHTREND           behtrend         Behavior recode for analysis
 72   226     2  HISTREC            histrec          Broad Histology recode
 73   228     2  HISTRECB           histrecb         Brain recode
 74   230     3  CS0204SCHEMA       cs0204schema     CS Schema v0204
 75   233     1  RAC_RECA           racreca          Race recode A
 76   234     1  RAC_RECY           racrecy          Race recode Y
 77   235     1  ORIGRECB           origrecb         Origin Recode NHIA
 78   236     1  HST_STGA           hststga          SEER historic stage A
 79   237     2  AJCC_STG           ajccstg          AJCC stage 3rd edition (1988+)
 80   239     2  AJ_3SEER           aj3seer          SEER modified AJCC stage 3rd ed (1988+)
 81   241     1  SSS77VZ            sss77vz          SEER Summary Stage 1977 (1995-2000)
 82   242     1  SSSM2KPZ           sssm2kpz         SEER Summary Stage 2000 2000 (2001-2003)
 83   245     1  FIRSTPRM           firstprm         First malignant primary indicator
 84   246     5  ST_CNTY            stcnty           State-county recode
 85   255     5  CODPUB             COD              Cause of death to SEER site recode
 86   260     5  CODPUBKM           codpubkm         COD to site rec KM
 87   265     1  STAT_REC           statrec          Vital status recode (study cutoff used)
 88   266     1  IHSLINK            ihslink          IHS link
 89   267     1  SUMM2K             summ2k           Historic SSG 2000 Stage
 90   268     2  AYASITERWHO        ayasiterwho      AYA site recode/WHO 2008
 91   270     2  LYMSUBRWHO         lymsubrwho       Lymphoma subtype recode/WHO 2008
 92   272     1  VSRTSADX           vsrtsadx         SEER cause of death classification
 93   273     1  ODTHCLASS          odthclass        SEER other cause of death classification
 94   274     1  CSTSEVAL           cstseval         CS EXT/Size Eval
 95   275     1  CSRGEVAL           csrgeval         CS Nodes Eval
 96   276     1  CSMTEVAL           csmteval         CS Mets Eval
 97   277     1  INTPRIM            intprim          Primary by International Rules
 98   278     1  ERSTATUS           erstatus         ER Status Recode Breast Cancer (1990+)
 99   279     1  PRSTATUS           prstatus         PR Status Recode Breast Cancer (1990+)
100   280     2  CSSCHEMA           csschema         CS Schema - AJCC 6th Edition
101   282     3  CS8SITE            cs8site          Cs Site-specific Factor 8
102   285     3  CS10SITE           cs10site         CS Site-Specific Factor 10
103   288     3  CS11SITE           cs11site         CS Site-Specific Factor 11
104   291     3  CS13SITE           cs13site         CS Site-Specific Factor 13
105   294     3  CS15SITE           cs15site         CS Site-Specific Factor 15
106   297     3  CS16SITE           cs16site         CS Site-Specific Factor 16
107   300     1  VASINV             vasin            Lymph-vascular Invasion (2004+)
108   301     4  SRV_TIME_MON       surv             Survival months
109   305     1  SRV_TIME_MON_FLAG  srvtimemonflag   Survival months flag
110   311     1  INSREC_PUB         insrecpub        Insurance Recode (2007+)
111   312     3  DAJCC7T            dajcc7t          Derived AJCC T 7th ed
112   315     3  DAJCC7N            dajcc7n          Derived AJCC N 7th ed
113   318     3  DAJCC7M            dajcc7m          Derived AJCC M 7th ed
114   321     3  DAJCC7STG          dajcc7stg        Derived AJCC 7 Stage Group
115   324     2  ADJTM_6VALUE       adjtm6value      Adjusted AJCC 6th T (1988+)
116   326     2  ADJNM_6VALUE       adjnm6value      Adjusted AJCC 6th N (1988+)
117   328     2  ADJM_6VALUE        adjm6value       Adjusted AJCC 6th M (1988+)
118   330     2  ADJAJCCSTG         adjajccstg       Adjusted AJCC 6th Stage (1988+)
119   332     3  CS7SITE            cs7site          CS Site-Specific Factor 7
120   335     3  CS9SITE            cs9site          CS Site-specific Factor 9
121   338     3  CS12SITE           cs12site         CS Site-Specific Factor 12
122   341     1  HER2               her2             Derived HER2 Recode (2010+)
123   342     1  BRST_SUB           brstsub          Breast Subtype (2010+)
124   348     1  ANNARBOR           annarbor         Lymphoma - Ann Arbor Stage (1983+)
125   349     1  SCMETSDXB_PUB      scmetsdxbpub     SEER Combined Mets at DX-bone (2010+)
126   350     1  SCMETSDXBR_PUB     scmetsdxbrpub    SEER Combined Mets at DX-brain (2010+)
127   351     1  SCMETSDXLIV_PUB    scmetsdxlivpub   SEER Combined Mets at DX-liver (2010+)
128   352     1  SCMETSDXLUNG_PUB   scmetsdxlungpub  SEER Combined Mets at DX-lung (2010+)
129   353     2  T_VALUE            tvalue           T value - based on AJCC 3rd (1988-2003)
130   355     2  N_VALUE            nvalue           N value - based on AJCC 3rd (1988-2003)
131   357     2  M_VALUE            mvalue           M value - based on AJCC 3rd (1988-2003)
132   359     2  MALIGCOUNT         maligcount       Total number of in situ/malignant tumors for patient
133   361     2  BENBORDCOUNT       benbordcount     Total number of benign/borderline tumors for patient
134   364     3  TUMSIZS            tumsizs          Tumor Size Summary (2016+)
135   367     5  DSRPSG             dsrpsg           Derived SEER Cmb Stg Grp (2016+)
136   372     5  DASRCT             dasrct           Derived SEER Combined T (2016+)
137   377     5  DASRCN             dasrcn           Derived SEER Combined N (2016+)
138   382     5  DASRCM             dasrcm           Derived SEER Combined M (2016+)
139   387     1  DASRCTS            dasrcts          Derived SEER Combined T Src (2016+)
140   388     1  DASRCNS            dasrcns          Derived SEER Combined N Src (2016+)
141   389     1  DASRCMS            dasrcms          Derived SEER Combined M Src (2016+)
142   390     2  TNMEDNUM           tnmednum         TNM Edition Number (2016+)
143   392     1  METSDXLN           metsdxln         Mets at DX-Distant LN (2016+)
144   393     1  METSDXO            metsdxo          Mets at DX-Other (2016+)
145   394     1  RADIATNR           radiatn          Radiation Recode
146   395     1  RAD_BRNR           radbrnr          Radiation to Brain or CNS Recode (1988-1997)
147   396     1  RAD_SURG           radsurg          Radiation sequence with surgery
148   397     1  CHEMO_RX_REC       chemo            Chemotherapy recode (yes, no/unk)

(rdf=pickFields(df))#picks a subset of SEER fields and defines their types
        start width  sasnames      names    desc                                type
casenum     1     8  PUBCSNUM      casenum  Patient ID                          integer
reg         9    10  REG           reg      SEER registry                       integer
1          19     1                                                             string
race       20     2  RACE1V        race     Race/ethnicity                      integer
11         22     2                                                             string
sex        24     1  SEX           sex      Sex                                 integer
agedx      25     3  AGE_DX        agedx    Age at diagnosis                    integer
yrbrth     28     4  YR_BRTH       yrbrth   Year of birth                       integer
12         32     3                                                             string
seqnum     35     2  SEQ_NUM       seqnum   Sequence number                     integer
modx       37     2  MDXRECMP      modx     Month of diagnosis                  integer
yrdx       39     4  YEAR_DX       yrdx     Year of diagnosis                   integer
13         43    10                                                             string
histo3     53     4  HISTO3V       histo3   Histologic Type ICD-O-3             integer
14         57   147                                                             string
ICD9      204     4  ICDOTO9V      ICD9     Recode ICD-O-2 to 9                 integer
15        208    47                                                             string
COD       255     5  CODPUB        COD      Cause of death to SEER site recode  integer
16        260    41                                                             string
surv      301     4  SRV_TIME_MON  surv     Survival months                     integer
17        305    89                                                             string
radiatn   394     1  RADIATNR      radiatn  Radiation Recode                    integer
18        395     2                                                             string
chemo     397     1  CHEMO_RX_REC  chemo    Chemotherapy recode (yes, no/unk)   integer

mkSEER(rdf) #makes merged data file ~/data/SEER/mrgd/cancDef.Rdata
Making population file data.tables
Removing SEER 9 person years from: /Users/radivot/data/SEER/populations/expanded.race.by.hispanic/yr1992_2016.seer9.plus.sj_lx_rg_ak before pooling into one file.
The population files of SEER were processed in 11.968 seconds.
Cancer Data:
The following fields will be written:
 [1] "casenum" "reg" "race" "sex" "agedx" "yrbrth" "seqnum" "modx" "yrdx" "histo3" "ICD9"
[12] "COD" "surv" "radiatn" "chemo"
[1] "/Users/radivot/data/SEER/incidence/yr1975_2016.seer9/BREAST.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1975_2016.seer9/DIGOTHR.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1975_2016.seer9/MALEGEN.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1975_2016.seer9/FEMGEN.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1975_2016.seer9/OTHER.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1975_2016.seer9/RESPIR.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1975_2016.seer9/COLRECT.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1975_2016.seer9/LYMYLEUK.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1975_2016.seer9/URINARY.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1992_2016.sj_lx_rg_ak/BREAST.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1992_2016.sj_lx_rg_ak/DIGOTHR.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1992_2016.sj_lx_rg_ak/MALEGEN.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1992_2016.sj_lx_rg_ak/FEMGEN.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1992_2016.sj_lx_rg_ak/OTHER.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1992_2016.sj_lx_rg_ak/RESPIR.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1992_2016.sj_lx_rg_ak/COLRECT.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1992_2016.sj_lx_rg_ak/LYMYLEUK.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1992_2016.sj_lx_rg_ak/URINARY.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr2000_2016.gc_ky_la_nj_gg/BREAST.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr2000_2016.gc_ky_la_nj_gg/DIGOTHR.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr2000_2016.gc_ky_la_nj_gg/MALEGEN.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr2000_2016.gc_ky_la_nj_gg/FEMGEN.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr2000_2016.gc_ky_la_nj_gg/OTHER.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr2000_2016.gc_ky_la_nj_gg/RESPIR.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr2000_2016.gc_ky_la_nj_gg/COLRECT.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr2000_2016.gc_ky_la_nj_gg/LYMYLEUK.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr2000_2016.gc_ky_la_nj_gg/URINARY.TXT"
[1] "using bind_rows() to make DF canc"
Cancer files were processed in 50.72 seconds.
[1] "save()-ing DF canc to disk"
Cancer data has been written to: /Users/radivot/data/SEER/mrgd/cancDef.RData
                                               size  mtime
/Users/radivot/data/SEER/mrgd/cancDef.RData   133 M  2019-12-07 09:12:38
/Users/radivot/data/SEER/mrgd/cancRS.RData    134 M  2019-10-12 21:03:08
/Users/radivot/data/SEER/mrgd/cancStgs.RData  175 M  2019-07-08 19:28:30
/Users/radivot/data/SEER/mrgd/cancSurg.RData  138 M  2019-04-16 19:54:28
/Users/radivot/data/SEER/mrgd/cancTNBC.RData  133 M  2019-10-27 09:11:44
/Users/radivot/data/SEER/mrgd/popsa.RData       1 M  2019-12-07 09:12:38
/Users/radivot/data/SEER/mrgd/popsae.RData      1 M  2019-12-07 09:12:39

aqdaisat commented 4 years ago

Good morning. I just figured out what the problem was. It turned out to be caused by the home directory: it should be, as you recommended, "/Users/username/Documents". If it is anything else, mkSEER does not work and gives that error. Now everything is working smoothly. Thank you so much for your help with this.
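Since the fix here was about where R's home directory points, a quick way to check this before calling mkSEER is to print what `~` expands to and confirm that the SEER folders exist beneath it. The sketch below is base R only and is not part of SEERaBomb. The helper name `check_seer_dirs` is hypothetical, and the `data/SEER/incidence` and `data/SEER/populations` layout is assumed from the paths printed in the output earlier in this thread.

```r
# Hypothetical helper: report whether the SEER input folders exist under `home`.
# The data/SEER/{incidence,populations} layout is assumed from the log above.
check_seer_dirs <- function(home = path.expand("~")) {
  seer <- file.path(home, "data", "SEER")
  dirs <- c(incidence   = file.path(seer, "incidence"),
            populations = file.path(seer, "populations"))
  ok <- dir.exists(dirs)
  names(ok) <- names(dirs)  # set names explicitly so the result is labelled
  ok
}

print(path.expand("~"))   # the directory R treats as home
print(check_seer_dirs())  # TRUE/FALSE for each expected folder
```

If either entry is FALSE, the data files are not where a `~/data/SEER` path would resolve on that machine, which matches the symptom reported in this issue.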

radivot commented 4 years ago

Great, thanks for figuring it out.