Closed by toliwaga 6 years ago
I created this issue because I didn't want to forget about it. I am not sure it makes sense to assign it to anyone at this time.
I'll create an activitysim issue for this - https://github.com/ActivitySim/activitysim/issues/220
Currently, all NAICS columns are stored as strings. With very large datasets, performance (and perhaps memory use) suffers because string columns such as NAICS could instead be factors.
For performance, they should be factors. This requires some thought, because the pipeline store currently doesn't handle factors (only numeric and string types). This is really an activitysim defect, but afreight is currently hurting most from the missing factor support.
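To make the trade-off concrete, here is a minimal sketch in pandas, where a factor corresponds to the `category` dtype. The codes and the series are made-up illustration data, not taken from afreight, and the round trip through strings is only one possible workaround for a store that accepts only numeric and string columns, not a description of how the pipeline store works.

```python
import pandas as pd

# Hypothetical data: a handful of NAICS-like codes repeated many times,
# standing in for a large table with a low-cardinality string column.
codes = ["111110", "236115", "311111", "484110"]
naics_str = pd.Series(codes * 100_000, name="NAICS")

# Convert the string column to a categorical (the pandas analogue of an R factor).
naics_cat = naics_str.astype("category")

# Compare the memory footprint of the two representations.
str_bytes = naics_str.memory_usage(deep=True)
cat_bytes = naics_cat.memory_usage(deep=True)
print(f"string: {str_bytes:,} bytes, category: {cat_bytes:,} bytes")

# Possible workaround (an assumption, not the activitysim API): write the
# column out as plain strings, then rebuild the categorical after loading.
as_stored = naics_cat.astype(str)
restored = as_stored.astype("category")
```

The categorical stores each distinct code once plus a small integer per row, which is where the savings come from when a column has few distinct values.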
There are some places in the R code where factors need to be rebuilt from strings because (I think) of differences in NAICS flavors (e.g. NAICSio vs NAICS2007). Perhaps we could create an omnibus factor at startup that includes the intersection of all flavors, so we don't need to convert?
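A rough sketch of the omnibus idea in pandas follows. It takes the union of the codes from each flavor, on the assumption that "omnibus" means every code from every flavor must be representable. The two code lists are hypothetical placeholders, and `omnibus` is just an illustrative name.

```python
import pandas as pd

# Hypothetical code lists for two NAICS flavors; real lists would come
# from the model's input tables.
naics_io = pd.Series(["111", "23", "311"], name="NAICSio")
naics_2007 = pd.Series(["111110", "236115", "311", "23"], name="NAICS2007")

# Build one shared dtype at startup that covers every code in every flavor.
all_codes = sorted(set(naics_io) | set(naics_2007))
omnibus = pd.CategoricalDtype(categories=all_codes, ordered=False)

# Cast each column to the shared dtype once; afterwards the same code maps
# to the same integer in every column, so no string rebuild is needed.
io_cat = naics_io.astype(omnibus)
n2007_cat = naics_2007.astype(omnibus)

print(io_cat.dtype == n2007_cat.dtype)
```

Because both columns share one dtype, comparing or concatenating them should not force a fallback to strings, at the cost of carrying unused categories in each column.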