Closed DocOtak closed 3 months ago
Initial testing shows a 20x to 40x speedup in the extract_numeric_precisions function when converted to use the new numpy.strings module (80 ms -> 3 ms, or 1.2 s -> 30 ms for an even larger input of 1.5 million strings).
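For anyone following along, here is a rough sketch of the kind of vectorized string work this involves. It is not the actual extract_numeric_precisions implementation: the helper name `count_decimal_places` and the "digits after the decimal point" definition of precision are assumptions made for illustration. The fallback to numpy.char is also only there so the snippet runs on older installs.

```python
import numpy as np

# Prefer numpy.strings (NumPy >= 2.0); fall back to numpy.char on older releases.
# The fallback is an illustration choice, not something the library itself does.
_strings = getattr(np, "strings", np.char)


def count_decimal_places(values):
    """Hypothetical helper: number of characters after the '.' in each string.

    Strings without a decimal point are reported as precision 0.
    """
    arr = np.asarray(values, dtype=np.str_)
    dot = _strings.find(arr, ".")    # index of the first '.', or -1 if absent
    length = _strings.str_len(arr)   # length of each string
    return np.where(dot >= 0, length - dot - 1, 0)


precisions = count_decimal_places(["1.250", "34.5", "7", "-0.0010"])
print(precisions.tolist())  # [3, 1, 0, 4]
```

Both `find` and `str_len` operate on the whole array at once, which is where the vectorized module can help on large inputs.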
Overall, parsing a large (s04p) CTD dataset showed a 3x speedup (15 s -> 5 s).
This was on an M1 machine; I'm not sure how well x86 would do, but I suspect there would be similar speedups.
The string processing speedups have been implemented; numpy >=2 is now required.
Looks like the numpy 2.0 release has occurred (or is in progress). This library does a bunch of string manipulation using functions that have been moved to the new numpy.strings module (https://numpy.org/devdocs/reference/routines.strings.html#module-numpy.strings). While not explicitly deprecated yet, the existing numpy.char module will not be getting updates.
Bonus: performance testing of both methods to see if there are any changes there.
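A minimal way to compare the two modules might look like the sketch below. The input data, array size, and the choice of `find` as the operation under test are all assumptions for illustration; real numbers will depend on the machine and on the actual functions the library calls.

```python
import timeit

import numpy as np

# numpy.strings only exists in NumPy >= 2.0; fall back to numpy.char so the
# script still runs (and then simply times the same module twice).
new_api = getattr(np, "strings", np.char)

# Synthetic input: 10,000 fixed-precision numeric strings (made-up data).
rng = np.random.default_rng(0)
data = np.array([f"{v:.3f}" for v in rng.uniform(0, 40, size=10_000)])


def time_find(module, arr, repeat=5):
    """Best-of-`repeat` time for 10 calls to module.find(arr, '.')."""
    return min(timeit.repeat(lambda: module.find(arr, "."), number=10, repeat=repeat))


old_t = time_find(np.char, data)
new_t = time_find(new_api, data)
print(f"numpy.char: {old_t:.4f} s, numpy.strings: {new_t:.4f} s")
```

Taking the minimum over several repeats reduces noise from other processes; checking that both modules return identical results before timing is a sensible sanity step.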