static-frame / arraykit

Python C Extensions for StaticFrame

Optimize `CodePointLine` by storing unicode size with offsets #103

Closed flexatone closed 1 year ago

flexatone commented 1 year ago

While tracking the unicode size per record is certainly possible, reading from a variable-sized buffer will be tricky. Further, many functions take a `Py_UCS4` pointer: these would need to be made generic somehow to handle all three sizes (1, 2, and 4 bytes per code point). This seems too difficult in C.
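To illustrate what "made generic" would involve, here is a minimal sketch, not arraykit code. The names `cp_kind`, `cp_read`, and `cp_count` are hypothetical, and the approach (dispatching on a per-record width tag at every read) is an assumption modeled loosely on how CPython's `PyUnicode_READ(kind, data, index)` macro works. Every function that currently takes a `Py_UCS4` pointer would need to accept a width tag plus an untyped pointer in this style.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tag: bytes per code point stored for one record. */
typedef enum { CP_1BYTE = 1, CP_2BYTE = 2, CP_4BYTE = 4 } cp_kind;

/* Read code point `i` from `buf`, widening it to 32 bits.
 * The cast depends on the width tag, so every read pays for a branch. */
static uint32_t cp_read(cp_kind kind, const void *buf, size_t i) {
    switch (kind) {
        case CP_1BYTE: return ((const uint8_t *)buf)[i];
        case CP_2BYTE: return ((const uint16_t *)buf)[i];
        case CP_4BYTE: return ((const uint32_t *)buf)[i];
    }
    assert(0 && "unknown code point width");
    return 0;
}

/* Example consumer: count occurrences of `target` in a record.
 * A fixed-width version would take a 4-byte pointer directly;
 * the generic version must thread `kind` through every call. */
static size_t cp_count(cp_kind kind, const void *buf, size_t len,
                       uint32_t target) {
    size_t n = 0;
    for (size_t i = 0; i < len; ++i) {
        if (cp_read(kind, buf, i) == target) {
            ++n;
        }
    }
    return n;
}
```

Even in this small case the width tag leaks into every signature and every inner loop, which is the cost being weighed above against the memory saved by narrower storage.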