Kevin-Jin / mmap

Forked from https://r-forge.r-project.org/scm/?group_id=648
1 stars 1 forks source link

Refactor `mmap_extract()` and `mmap_replace()` #15

Closed Kevin-Jin closed 7 years ago

Kevin-Jin commented 7 years ago

The two monolithic functions are way too long and are indented way too much. The logic for atomic vectors within lists (i.e. data.frame columns) is not consistent with the logic for atomic vectors on the top level (e.g. raw, logical, integer, double, complex, character). For example, (1) some out of bounds checks are just missing when handling the atomic vector inside lists, and (2) LGLSXP apparently is handled at the top level but not within VECSXP. Add some helper functions for each atomic type and call them when encountering atomic vectors on the top level and inside lists.

Within each helper function, figure out a way to reduce the amount of repeated code that handles the different permutations of fieldSigned/isSigned and Cbytes. Macros may be the best bet here, e.g. we can pass in the type (int16_t, uint16_t, int32_t, or uint32_t ), the handle to the appropriate swap macro (swapb16() or swapb32()), and the appropriate buffer (sbuf or intbuf) as macro parameters.

Remove C code that handles int24() so that we don't need to write the swapb24() loop ourselves.

Kevin-Jin commented 7 years ago

Same goes for the mmap_compare() function.

Kevin-Jin commented 7 years ago

CPLXSXP needs to be supported inside VECSXP in mmap_replace().

Kevin-Jin commented 7 years ago

The type of STRSXP columns in a VECSXP should also be checked in mmap_replace() and coerced if necessary.

Kevin-Jin commented 7 years ago

Consider making use int32_t, uint8_t, etc. from stdint.h in the C code rather than long and char and consider renaming the Ctypes accordingly.