Error using mkSEER function #4

aqdaisat closed this issue 4 years ago

aqdaisat commented 4 years ago

When I run the mkSEER function, I get:

Error in cut.default(race, labels = c("White", "Black", "Other"), breaks = c(1, :
  'x' must be numeric
In addition: Warning message:
Unknown or uninitialised column: 'reg'.

I'm using the newest 1975-2016 SEER Research Data (November 2018 Submission). Could that be the problem?
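
For readers who hit the same message: the error only says that `race` was not numeric when `cut()` was called. The sketch below reproduces the same message in base R under one assumption, which is a guess and not something confirmed in this thread: that the cancer data frame came back without the expected columns. The data frame `canc` and the break points are hypothetical, since the real breaks are truncated in the error above.

```r
# Hypothetical stand-in for a cancer data frame that was read in with no columns
canc <- data.frame()

# A missing column comes back as NULL rather than as a numeric vector
race <- canc$race

# cut() refuses non-numeric input; capture the message instead of stopping.
# The breaks here are illustrative only (the real ones are truncated above).
msg <- tryCatch(
  cut(race, breaks = c(0, 1, 2, 99), labels = c("White", "Black", "Other")),
  error = function(e) conditionMessage(e)
)
print(msg)
```

If this is what happened, the problem would lie in locating or reading the SEER files rather than in the race coding itself.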

radivot commented 4 years ago

Do you have the latest SEERaBomb code from GitHub? Does your data include chemo and radiation?

aqdaisat commented 4 years ago

Thank you for your prompt response. Yes, I got the latest SEERaBomb from GitHub, and the data I have includes the newest treatment data (chemo and radiation).

radivot commented 4 years ago

If you used something other than this code chunk, please share it and the output with me

library(SEERaBomb)    # loads installed package SEERaBomb into memory
(df=getFields())      # gets SEER fields into a data frame
(rdf=pickFields(df))  # picks a subset of SEER fields and defines their types
mkSEER(rdf)           # makes merged data file ~/data/SEER/mrgd/cancDef.RData

On my machine this chunk works fine, giving me

fitted.fracdiff    fracdiff
residuals.fracdiff fracdiff
This is demography 1.22

(df=getFields()) #gets SEER fields into a data frame
    start width sasnames          names          desc
  1     1     8 PUBCSNUM          casenum        Patient ID
  2     9    10 REG               reg            SEER registry
[... additional fields truncated for brevity ...] path extension
 24    68     1 EOD10_ND          eod10nd        EOD 10 - lymph node
 25    69     2 EOD10_PN          eod10pn        EOD 10 - positive lymph nodes examined
 26    71     2 EOD10_NE          eod10ne        EOD 10 - number of lymph nodes examined
 27    73    13 EOD13             eod13          EOD--old 13 digit
 28    86     2 EOD2              eod2           EOD--old 2 digit
 29    88     4 EOD4              eod4           EOD--old 4 digit
 30    92     1 EOD_CODE          eodcode        Coding system for EOD
 31    93     1 TUMOR_1V          tumor1         Tumor marker 1
 32    94     1 TUMOR_2V          tumor2         Tumor marker 2
 33    95     1 TUMOR_3V          tumor3         Tumor marker 3
 34    96     3 CSTUMSIZ          cstumsiz       Collaborative Stage (CS) Tumor Size
 35    99     3 CSEXTEN           csexten        CS Extension
 36   102     3 CSLYMPHN          cslymphn       CS Lymph Nodes
 37   105     2 CSMETSDX          csmetsdx       CS Mets at DX
 38   107     3 CS1SITE           cs1site        CS Site-Specific Factor 1
 39   110     3 CS2SITE           cs2site        CS Site-Specific Factor 2
 40   113     3 CS3SITE           cs3site        CS Site-Specific Factor 3
 41   116     3 CS4SITE           cs4site        CS Site-Specific Factor 4
 42   119     3 CS5SITE           cs5site        CS Site-Specific Factor 5
 43   122     3 CS6SITE           cs6site        CS Site-Specific Factor 6
 44   125     3 CS25SITE          cs25site       CS Site-Specific Factor 25
 45   128     2 DAJCCT            dajcct         Derived AJCC T
 46   130     2 DAJCCN            dajccn         Derived AJCC N
 47   132     2 DAJCCM            dajccm         Derived AJCC M
 48   134     2 DAJCCSTG          dajccstg       Derived AJCC Stage Group
 49   136     1 DSS1977S          dss1977s       Derived SS1977
 50   137     1 SCSSM2KO          scssm2ko       SEER Combined Summary Stage 2000 (2004+)
 51   138     1 DAJCCFL           dajccfl        Derived AJCC - flag
 52   141     6 CSVFIRST          csvfirst       CS Version Input Original
 53   147     6 CSVLATES          csvlates       CS Version Derived
 54   153     6 CSVCURRENT        csvcurrent     CS Version Input Current
 55   159     2 SURGPRIF          surgprif       RX Summ--surg prim site
 56   161     1 SURGSCOF          surgscof       RX Summ--scope reg LN sur 2003+
 57   162     1 SURGSITF          surgsitf       RX Summ--surg oth reg/dis
 58   163     2 NUMNODES          numnodes       Number of lymph nodes
 59   166     1 NO_SURG           nosurg         Reason no cancer-directed surgery
 60   170     2 SS_SURG           sssurg         Site specific surgery (1983-1997)
 61   174     1 SURGSCOP          surgscop       Scope of lymph node surgery 98-02
 62   175     1 SURGSITE          surgsite       Surgery to other sites
 63   176     2 REC_NO            recno          Record number
 64   191     1 TYPE_FU           typefu         Type of followup expected
 65   192     2 AGE_1REC          agerec         Age recode <1 year olds
 66   199     5 SITERWHO          siterwho       Site recode ICD-O-3/WHO 2008
 67   204     4 ICDOTO9V          ICD9           Recode ICD-O-2 to 9
 68   208     4 ICDOT10V          ICD10          Recode ICD-O-2 to 10
 69   218     3 ICCC3WHO          iccc3who       ICCC site recode ICD-O-3/WHO 2008
 70   221     3 ICCC3XWHO         iccc3xwho      ICCC site rec extended ICD-O-3/ WHO 2008
 71   224     1 BEHTREND          behtrend       Behavior recode for analysis
 72   226     2 HISTREC           histrec        Broad Histology recode
 73   228     2 HISTRECB          histrecb       Brain recode
 74   230     3 CS0204SCHEMA      cs0204schema   CS Schema v0204
 75   233     1 RAC_RECA          racreca        Race recode A
 76   234     1 RAC_RECY          racrecy        Race recode Y
 77   235     1 ORIGRECB          origrecb       Origin Recode NHIA
 78   236     1 HST_STGA          hststga        SEER historic stage A
 79   237     2 AJCC_STG          ajccstg        AJCC stage 3rd edition (1988+)
 80   239     2 AJ_3SEER          aj3seer        SEER modified AJCC stage 3rd ed (1988+)
 81   241     1 SSS77VZ           sss77vz        SEER Summary Stage 1977 (1995-2000)
 82   242     1 SSSM2KPZ          sssm2kpz       SEER Summary Stage 2000 2000 (2001-2003)
 83   245     1 FIRSTPRM          firstprm       First malignant primary indicator
 84   246     5 ST_CNTY           stcnty         State-county recode
 85   255     5 CODPUB            COD            Cause of death to SEER site recode
 86   260     5 CODPUBKM          codpubkm       COD to site rec KM
 87   265     1 STAT_REC          statrec        Vital status recode (study cutoff used)
 88   266     1 IHSLINK           ihslink        IHS link
 89   267     1 SUMM2K            summ2k         Historic SSG 2000 Stage
 90   268     2 AYASITERWHO       ayasiterwho    AYA site recode/WHO 2008
 91   270     2 LYMSUBRWHO        lymsubrwho     Lymphoma subtype recode/WHO 2008
 92   272     1 VSRTSADX          vsrtsadx       SEER cause of death classification
 93   273     1 ODTHCLASS         odthclass      SEER other cause of death classification
 94   274     1 CSTSEVAL          cstseval       CS EXT/Size Eval
 95   275     1 CSRGEVAL          csrgeval       CS Nodes Eval
 96   276     1 CSMTEVAL          csmteval       CS Mets Eval
 97   277     1 INTPRIM           intprim        Primary by International Rules
 98   278     1 ERSTATUS          erstatus       ER Status Recode Breast Cancer (1990+)
 99   279     1 PRSTATUS          prstatus       PR Status Recode Breast Cancer (1990+)
100   280     2 CSSCHEMA          csschema       CS Schema - AJCC 6th Edition
101   282     3 CS8SITE           cs8site        Cs Site-specific Factor 8
102   285     3 CS10SITE          cs10site       CS Site-Specific Factor 10
103   288     3 CS11SITE          cs11site       CS Site-Specific Factor 11
104   291     3 CS13SITE          cs13site       CS Site-Specific Factor 13
105   294     3 CS15SITE          cs15site       CS Site-Specific Factor 15
106   297     3 CS16SITE          cs16site       CS Site-Specific Factor 16
107   300     1 VASINV            vasin          Lymph-vascular Invasion (2004+)
108   301     4 SRV_TIME_MON      surv           Survival months
109   305     1 SRV_TIME_MON_FLAG srvtimemonflag Survival months flag
110   311     1 INSREC_PUB        insrecpub      Insurance Recode (2007+)
111   312     3 DAJCC7T           dajcc7t        Derived AJCC T 7th ed
112   315     3 DAJCC7N           dajcc7n        Derived AJCC N 7th ed
113   318     3 DAJCC7M           dajcc7m        Derived AJCC M 7th ed
114   321     3 DAJCC7STG         dajcc7stg      Derived AJCC 7 Stage Group
115   324     2 ADJTM_6VALUE      adjtm6value    Adjusted AJCC 6th T (1988+)
116   326     2 ADJNM_6VALUE      adjnm6value    Adjusted AJCC 6th N (1988+)
117   328     2 ADJM_6VALUE       adjm6value     Adjusted AJCC 6th M (1988+)
118   330     2 ADJAJCCSTG        adjajccstg     Adjusted AJCC 6th Stage (1988+)
119   332     3 CS7SITE           cs7site        CS Site-Specific Factor 7
120   335     3 CS9SITE           cs9site        CS Site-specific Factor 9
121   338     3 CS12SITE          cs12site       CS Site-Specific Factor 12
122   341     1 HER2              her2           Derived HER2 Recode (2010+)
123   342     1 BRST_SUB          brstsub        Breast Subtype (2010+)
124   348     1 ANNARBOR          annarbor       Lymphoma -

(rdf=pickFields(df)) #picks a subset of SEER fields and defines their types
     start width   sasnames        names                                          desc   type
casenum      1     8   PUBCSNUM      casenum                                    Patient ID integer
reg          9    10        REG          reg                                 SEER registry integer
[... additional fields shown ...]
radiatn    394     1   RADIATNR      radiatn                             Radiation Recode integer
chemo      397     1 CHEMO_RX_REC        chemo           Chemotherapy recode (yes, no/unk) integer

mkSEER(rdf) #makes merged data file ~/data/SEER/mrgd/cancDef.RData
Making population file data.tables
Removing SEER 9 person years from: /Users/radivot/data/SEER/populations/expanded.race.by.hispanic/yr1992_2016.seer9.plus.sj_lx_rg_ak before pooling into one file.
The population files of SEER were processed in 11.968 seconds.
Cancer Data: The following fields will be written:
 [1] "casenum" "reg" "race" "sex" "agedx" "yrbrth" "seqnum" "modx" "yrdx" "histo3" "ICD9"
[12] "COD" "surv" "radiatn" "chemo"
[1] "/Users/radivot/data/SEER/incidence/yr1975_2016.seer9/BREAST.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1975_2016.seer9/DIGOTHR.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1975_2016.seer9/MALEGEN.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1975_2016.seer9/FEMGEN.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1975_2016.seer9/OTHER.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1975_2016.seer9/RESPIR.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1975_2016.seer9/COLRECT.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1975_2016.seer9/LYMYLEUK.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1975_2016.seer9/URINARY.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1992_2016.sj_lx_rg_ak/BREAST.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1992_2016.sj_lx_rg_ak/DIGOTHR.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1992_2016.sj_lx_rg_ak/MALEGEN.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1992_2016.sj_lx_rg_ak/FEMGEN.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1992_2016.sj_lx_rg_ak/OTHER.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1992_2016.sj_lx_rg_ak/RESPIR.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1992_2016.sj_lx_rg_ak/COLRECT.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1992_2016.sj_lx_rg_ak/LYMYLEUK.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr1992_2016.sj_lx_rg_ak/URINARY.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr2000_2016.gc_ky_la_nj_gg/BREAST.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr2000_2016.gc_ky_la_nj_gg/DIGOTHR.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr2000_2016.gc_ky_la_nj_gg/MALEGEN.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr2000_2016.gc_ky_la_nj_gg/FEMGEN.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr2000_2016.gc_ky_la_nj_gg/OTHER.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr2000_2016.gc_ky_la_nj_gg/RESPIR.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr2000_2016.gc_ky_la_nj_gg/COLRECT.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr2000_2016.gc_ky_la_nj_gg/LYMYLEUK.TXT"
[1] "/Users/radivot/data/SEER/incidence/yr2000_2016.gc_ky_la_nj_gg/URINARY.TXT"
[1] "using bind_rows() to make DF canc"
Cancer files were processed in 50.72 seconds.
[1] "save()-ing DF canc to disk"
Cancer data has been written to: /Users/radivot/data/SEER/mrgd/cancDef.RData
                                              size mtime
/Users/radivot/data/SEER/mrgd/cancDef.RData  133 M 2019-12-07 09:12:38
/Users/radivot/data/SEER/mrgd/cancRS.RData   134 M 2019-10-12 21:03:08
/Users/radivot/data/SEER/mrgd/cancStgs.RData 175 M 2019-07-08 19:28:30
/Users/radivot/data/SEER/mrgd/cancSurg.RData 138 M 2019-04-16 19:54:28
/Users/radivot/data/SEER/mrgd/cancTNBC.RData 133 M 2019-10-27 09:11:44
/Users/radivot/data/SEER/mrgd/popsa.RData      1 M 2019-12-07 09:12:38
/Users/radivot/data/SEER/mrgd/popsae.RData     1 M 2019-12-07 09:12:39

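Once mkSEER() has finished, one way to confirm the merged file is usable is to load it and look at its columns. This is only a sketch: `load_canc()` is a hypothetical helper, not a SEERaBomb function, and it assumes the saved object is named `canc`, as the "save()-ing DF canc to disk" message in the output suggests.

```r
# Hypothetical helper (not part of SEERaBomb): load the merged cancer file
# and return the data frame, or NULL with a message if the file is missing.
load_canc <- function(path = "~/data/SEER/mrgd/cancDef.RData") {
  path <- path.expand(path)
  if (!file.exists(path)) {
    message("No merged file found at: ", path)
    return(invisible(NULL))
  }
  env <- new.env()
  load(path, envir = env)  # restores the saved objects into env
  # Assumption: the saved object is called 'canc' (taken from the log output)
  if (!exists("canc", envir = env, inherits = FALSE)) {
    stop("File does not contain an object named 'canc'")
  }
  get("canc", envir = env)
}

canc <- load_canc()
if (!is.null(canc)) print(names(canc))  # e.g. check that 'reg' and 'race' are present
```
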
aqdaisat commented 4 years ago

Good morning. I just figured out what the problem was. It turned out to be caused by the home directory: it should be, as you recommended, "/Users/username/Documents", and mkSEER fails with that error if it is not. Now everything is working smoothly. Thank you so much for your help.
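
A quick way to see which home directory R resolves `~` to, and whether the SEER folders mentioned in this thread exist under it, is sketched below. The helper `check_seer_layout()` is hypothetical and not part of SEERaBomb; the folder names (`data/SEER`, `incidence`, `populations`) are taken from the paths in the output above.

```r
# Hypothetical helper (not part of SEERaBomb): report which of the SEER
# folders seen in the paths above exist under a given home directory.
check_seer_layout <- function(home = path.expand("~")) {
  seer <- file.path(home, "data", "SEER")
  dirs <- c(
    seer        = seer,
    incidence   = file.path(seer, "incidence"),
    populations = file.path(seer, "populations")
  )
  # Named logical vector: TRUE where the folder exists
  vapply(dirs, dir.exists, logical(1))
}

print(path.expand("~"))      # the home directory R is actually using
print(check_seer_layout())   # any FALSE entry points at a misplaced folder
```

If any entry is FALSE, the data are probably sitting under a different home directory than the one R expands `~` to.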

radivot commented 4 years ago

Great, thanks for figuring it out.