Open quadrixstefanovic opened 1 year ago
It is how it was implemented
So I came up with a reasonable fix for all these methods that allocate a fixed buffer of size 10 thousand characters. I also posted about this in the community forum, BTW.
There are a lot of methods (29 by my count) in the ScintillaGateway.cs
that look something like this:
public unsafe string DescriptionOfStyle(int style)
{
byte[] descriptionBuffer = new byte[10000];
fixed (byte* descriptionPtr = descriptionBuffer)
{
Win32.SendMessage(scintilla, SciMsg.SCI_DESCRIPTIONOFSTYLE, (IntPtr) style, (IntPtr) descriptionPtr);
return Encoding.UTF8.GetString(descriptionBuffer).TrimEnd('\0');
}
}
This is not the right way to do these things because 10000 might be too small (leading to buffer overflow) or too large (leading to waste).
Regarding methods where the user passes in a buffer to be filled (char *
), the Scintilla documentation says:
Arguments point at text buffers that Scintilla will fill with text.
In some cases, another argument will tell Scintilla the buffer size.
In others, you must make sure that the buffer is big enough to hold the requested text.
If a NULL pointer (0) is passed then, for SCI_* calls,
the length that should be allocated, not including any terminating NUL, is returned.
Some calls (marked "NUL-terminated") add a NUL character to the result but other calls do not:
to generically handle both types, allocate one more byte than indicated and set it to NUL.
ScintillaGateway.cs
SciMsg.SCI_DESCRIPTIONOFSTYLE
as the Scintilla message and int length
as a parameter to the C# method, replace the entire body of the method with
return GetNullStrippedStringFromMessageThatReturnsLength(SciMsg.SCI_DESCRIPTIONOFSTYLE, (IntPtr)style);
public unsafe string GetWhitespaceChars()
, I would replace the whole body of the function with
return GetNullStrippedStringFromMessageThatReturnsLength(SciMsg.SCI_GETWHITESPACECHARS);
Note that in this case the optional wParam
argument is not supplied.
In any case, here is the GetNullStrippedStringFromMessageThatReturnsLength
method in my proposed fix.
/// <summary>
/// returns bytes decoded from UTF-8 as a string, with all trailing NULL bytes stripped off.
/// </summary>
public static string Utf8BytesToNullStrippedString(byte[] bytes)
{
int lastNullCharPos = bytes.Length - 1;
// this only bypasses NULL chars because no char
// other than NULL can have any 0-valued bytes in UTF-8.
// See https://en.wikipedia.org/wiki/UTF-8#Encoding
for (; lastNullCharPos >= 0 && bytes[lastNullCharPos] == '\x00'; lastNullCharPos--) { }
return Encoding.UTF8.GetString(bytes, 0, lastNullCharPos + 1);
}
/// <summary>
/// Recall that all Scintilla methods have the signature
/// (scintilla* scin, SciMsg msg, void* wParam, void* lParam) -> void*<br></br>
/// Many of these scintilla methods are bimodal in the following way<br></br>
/// * if lParam is 0, return the length of the buffer to be filled and have no side effects. The wParam may be involved in telling Scintilla how big the buffer needs to be.<br></br>
/// * if lParam is greater than 0, it is assumed to be a pointer to a buffer. Now the wParam indicates what the text will need to be.<br></br><br></br>
/// This sets lParam to 0 to get the length, allocates a buffer of that length,<br></br>
/// uses the second mode to fill a buffer,<br></br>
/// and returns a string of the UTF8-decoded buffer with all trailing '\x00' chars stripped off.
/// </summary>
/// <param name="msg">message to send</param>
/// <param name="wParam">another parameter for defining what the buffer should contain</param>
/// <returns></returns>
private unsafe string GetNullStrippedStringFromMessageThatReturnsLength(SciMsg msg, IntPtr wParam=default)
{
int length = Win32.SendMessage(scintilla, msg, wParam, (IntPtr)Unused).ToInt32();
byte[] textBuffer = new byte[length];
fixed (byte* textPtr = textBuffer)
{
Win32.SendMessage(scintilla, msg, wParam, (IntPtr)textPtr);
return Utf8BytesToNullStrippedString(textBuffer);
}
}
I may also submit a PR, but I wanted to post this here first.
Why are you shouting?
Why are you shouting?
Fair question, fixed
When using the GetText(int length) method I only get a maximum of exactly 10'000 characters back, independent of the length value.