Open mppf opened 5 years ago
As an amusement, I tried a prototype to see if it sped up some benchmarks.
https://github.com/mppf/chapel/tree/prototype-short-string-opts
It includes two benchmarks it helps with. More work would be required to study further (e.g. pass full local testing).
The first approach, using a C union, did not help so much with the sorting strings benchmark.
The later approach, just leaving the fields there but making a pointer to one or more of them, went OK. But that approach would require us to label that "bufferType" to be possibly overlapping in the LLVM back end (the way char*
can alias anything in C).
Another interesting benchmark would be a push_back test where many strings are added to an array.
Many string API routines currently return a string to represent one character.
For example:
It is nice for API design reasons to keep these methods within the string type. However, that currently comes at a significant performance penalty vs. returning an integer to represent the character. One significant element of this performance penalty is the necessity to allocate and free these strings.
However, there is an optimization strategy used in other languages that can be applied to this case. It's called the short string optimization. In Chapel terms, the short string optimization would be to make the
string
record able to store short strings directly in the record instead of in a pointed-to buffer. This is normally done by re-using storage for other record fields that aren't used when the string is stored directly (e.g. with a C union).The Chapel string record is currently 32 bytes long. That means that it would be possible to store 1 byte of tag, that indicates if the format of the 32 bytes is "string included" or "points to buffer" and, in the event of "string included", stores the length of the string. Even if short strings are always null terminated, this allows for 30 bytes of string data to be stored in the string.
This optimization additionally makes it more likely that a string can be communicated in 1 GET or PUT when Chapel programs are referring to remote strings. (We do RVF short strings already, but RVF is not always possible).
I havn't done a study of string lengths used in Chapel programs in practice but I'm fairly confident that a large fraction of these would be < 30 bytes.
If I were implementing the short string optimization today, what would it look like?
getter
methods could even be used to access these fields by calling the appropriate C functions in the event that they are not changedThis strategy would allow most of the string methods to work with modest updates. The string initialization and append code would need more logic to decide between representing the string as "string included" or "points to buffer". This adds some complexity to the implementation.