Doesn't need much description. At one bit per cell, one byte can hold 8 cell states, which would make the whole world array smaller by a factor of 8 compared with storing one cell per byte. A further improvement on this might be "regions" fixed at 8x8, so that a block being simulated would fit in a handful of CPU registers. If these regions could be laid out so that each row is contiguous in memory, this would further improve speed by taking advantage of spatial locality.
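As a rough illustration of the packing idea, here is a minimal C sketch. It assumes two-state cells (one bit each), and all of the names (packed_get, packed_set, region8x8, and so on) are hypothetical, made up for this example. The 8x8 region is stored as a single 64-bit word with row y in byte y, which is one possible way to keep each row contiguous; it is not necessarily the best layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Flat world: cell `index` lives in bit (index % 8) of byte (index / 8). */
static inline int packed_get(const uint8_t *world, size_t index) {
    return (world[index >> 3] >> (index & 7)) & 1;
}

static inline void packed_set(uint8_t *world, size_t index, int alive) {
    uint8_t mask = (uint8_t)(1u << (index & 7));
    if (alive)
        world[index >> 3] |= mask;
    else
        world[index >> 3] &= (uint8_t)~mask;
}

/* Hypothetical 8x8 region: 64 cells in one 64-bit word.
 * Row y occupies byte y, so each row is 8 contiguous bits. */
typedef uint64_t region8x8;

static inline int region_get(region8x8 r, unsigned x, unsigned y) {
    return (int)((r >> (y * 8 + x)) & 1u);
}

/* Returns a copy of the region with cell (x, y) set or cleared. */
static inline region8x8 region_set(region8x8 r, unsigned x, unsigned y, int alive) {
    uint64_t mask = (uint64_t)1 << (y * 8 + x);
    return alive ? (r | mask) : (r & ~mask);
}

/* Extracts a whole row as one byte (bit x of the result is cell x). */
static inline uint8_t region_row(region8x8 r, unsigned y) {
    return (uint8_t)(r >> (y * 8));
}
```

With this layout a whole region can be loaded with a single 64-bit read on a 64-bit machine, and pulling out a row is just a shift and a truncation. x and y are assumed to be in the range 0 to 7; the sketch does no bounds checking.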