maelh / hxd-plugin-framework

Plugin framework for HxD's data inspector
Mozilla Public License 2.0
166 stars 22 forks source link

BytesToStr should report selection details #3

Open sredna opened 2 years ago

sredna commented 2 years ago

It would be nice if the BytesToStr function had a selection length parameter. This way the plug-in could choose to display only exact matches in the inspector.

... BytesToStr(
    void* ThisPtr,
    uint8_t* Bytes,
    int ByteCount,
    TFormattingOptions FormattingOptions,
    int* ConvertedByteCount,
    const wchar_t** ConvertedStr,
    int ByteSelectionCount
);

ByteSelectionCount would be 0 if there is no selection, just a cursor position change. This is the way the API works today. To only display exact matches a plug-in could choose to do something like:

if (ByteCount >= 4 && (ByteSelectionCount == 0 || ByteSelectionCount == 4))
{
  const wchar_t* match = LookupUINT32Constants(Bytes);
  if (match) ...
} 
if (ByteCount >= 2 && (ByteSelectionCount == 0 || ByteSelectionCount == 2))
{
  ...
}
maelh commented 2 years ago

Currently HxD's datainspector will either use the selected amount of bytes, or if there is no selection, the maximum possible byte count (which is obtained by querying each type converter about the maximum amount of bytes a type's instance can occupy).

This kind of separation of concerns is intentional, the datatype converter is just supposed to get an array of bytes, and interpret only the amount of bytes it needs, not care about how they are obtained.

For example if you had a string, preceded by a length specifier, you would ignore any extraneous bytes. Or for an Int32, only process the first 4 bytes, or return an error if there are less than 4 bytes.

Could you elaborate on why you want to do it differently?

Edit: When ByteSelectionCount = 0 or ByteSelectionCount = 4 an Int32 would return a string, otherwise an error. That means also an error for ByteSelectionCount > 4. Is that what you want to do?

In that case it should be a global option similar to FormattingOptions, because all datatype converters should behave the same. The idea currently is to interpret the sequence of bytes as best as you can (following type rules), and return an error if there are too few of them. There is a white rectangle overlayed on the blueish selection rectangle for each type, so you know how many bytes were actually converted, no matter how long the selection is.

What's the use case? A more complete picture will allow a better design, especially explaining why the white rectangle is not enough.

sredna commented 2 years ago

When ByteSelectionCount = 0 or ByteSelectionCount = 4 an Int32 would return a string, otherwise an error. That means also an error for ByteSelectionCount > 4. Is that what you want to do?

Yes.

There is a white rectangle overlayed on the blueish selection rectangle for each type, so you know how many bytes were actually converted, no matter how long the selection is.

I don't actually see this rectangle? (v2.5.0.0 (x86-64)) Edit: Ah, you have to select the item in the inspector. Still, very hard to see this rectangle.

Just playing around with this API I made a plug-in that displays friendly names of PE fields and the "problem" is that when there is a selection that is longer than the actual type, it still displays the name even though what you have selected is not really that thing.

HxD*3

  1. No selection, I only care about the cursor, not the length (as long as it is long enough).
  2. Selection is two bytes and I found a perfect match.
  3. Selection is too long, clearly not a IMAGEFILE* value.

I can get around this by checking the size directly but then I lose the "no selection" mode being able to identify types. I currently only care about 16 and 32-bit values, nothing else.

maelh commented 2 years ago

In this case I think a global setting as suggested earlier would make the most sense.

sredna commented 2 years ago

Would you able to declare that you want 2, 4 and 8 bytes in such a global setting?

I currently just have a static list of constants that I match against.

static const LPCWSTR g_Constants[] = { 
    MKCONST4(/*IMAGE_*/NT_SIGNATURE, L"\x00", L"\x00", L"\x45", L"\x50"),
    MKCONST2(/*IMAGE_FILE_*/MACHINE_I386, L"\x01", L"\x4c"),
    MKCONST2(/*IMAGE_FILE_*/MACHINE_AMD64, L"\x86", L"\x64"),
    ...
maelh commented 2 years ago

Each datatype converter specifies MaxTypeSize in bytes when CreateConverter is called.

If this potential future option would be activated, the selection would have to match this value or nothing would be shown (or rather "Invalid" is displayed in grey, instead of a value). That means, MaxTypeSize would be interpreted as ExactTypeSize.

I feel what you really want will be better supported by the planned structure viewer/editor.

maelh commented 2 years ago

Now that I know better what you try to achieve, I think what you want is a bit similar to how UTF8Char (more correctly: UTF-8 code point) is handled. It can have a varying byte width, some characters take up 1 byte, while others will occupy 2, 3 or 4.

If you enabled this potential future option, you would be given exactly the selected amount of bytes (or all the bytes up to the buffer size, which is about 128 to 256 bytes when selection length = 0), nothing more or less.

So when there is a selection already, it implies you know something about the type instance already, i.e., its size/length. When there is no selection you just know the start of the type instance.

You want it to also imply to mean the exact length/size of the type, while my current implementation assumes it means the maximum size (to be consistent with the no selection case).

So what I would suggest is to interpret selection length as exact size, with a global option, so you know that in this case ByteCount matches selection length. Edit: See solution suggested in comment below instead.

if (ByteCount >= 4 && (ByteSelectionCount == 0 || ByteSelectionCount == 4)) would become something like this

if (SelectionLengthAndByteCountMatch) // SelectionLengthAndByteCountMatch would be a parameter of BytesToStr()
  Match2_4_8ByteSequence(Bytes, ByteCount, IgnoreExtraneousBytes = False);
else if (ByteCount > 0)
  Match2_4_8ByteSequence(Bytes, ByteCount, IgnoreExtraneousBytes = True); // just process as many as necessary, ignore the extraneous bytes
maelh commented 2 years ago
  • No selection, I only care about the cursor, not the length (as long as it is long enough).

  • Selection is two bytes and I found a perfect match.

  • Selection is too long, clearly not a IMAGEFILE* value.

What you are looking at is IMAGE_FILE_HEADER.Machine which is of the type WORD, which is 2 bytes wide. With padding/alignment it will be 4 bytes wide.

This suggests you want to specify the fixed size of the type using the selection.

That's why it would make most sense to tell the datainspector: limit the amount of bytes you look at to the selection, which already happens. It should also ignore anything that goes beyond that, which also happens.

With a fixed size data type you don't really need to go beyond that, since it's already clear that additional bytes of the selection will be ignored. So what's missing is really visual feedback, that makes it more obvious.

So what really matters is ByteCount vs. ConvertedByteCount. If ByteCount > ConvertedByteCount you would want the datainspector to return nothing, or the grey "Invalid".

So a global option that will check for this condition would be all that is needed. No change to any datatype converter or function prototype.

sredna commented 2 years ago

I feel what you really want will be better supported by the planned structure viewer/editor.

Yes, I think that is what a lot of people want. The inspector is still pretty handy though when the thing you are looking is not a real PE header but you still know what some of the fields are.

What you are looking at is IMAGE_FILE_HEADER.Machine which is of the type WORD, which is 2 bytes wide. With padding/alignment it will be 4 bytes wide.

IMAGE_FILE_HEADER has no padding nor alignment of its members. Even if there was padding it would not be relevant, I just want a quick way to look up fields. This plug-in could easily be modified to grab a list of constants related to whatever you are working on from a .ini to provide useful mappings to human readable strings when looking at hex dumps of custom formats etc.

So what really matters is ByteCount vs. ConvertedByteCount. If ByteCount > ConvertedByteCount you would want the datainspector to return nothing, or the grey "Invalid".

Yes.