abbaye / WpfHexEditorControl

Wpf Hexeditor is a powerful and fully customisable user control for editing a file or stream as hexadecimal, decimal and binary. It can be used in WPF or WinForms applications.
https://www.nuget.org/packages/WPFHexaEditor/
Apache License 2.0
812 stars, 135 forks

Relative Search #45

Open vector-man opened 5 years ago

vector-man commented 5 years ago

Add a relative search feature.

abbaye commented 5 years ago

As soon as possible, my friend.

abbaye commented 5 years ago

@vector-man can you send me a link to a sample of what you are requesting? The last time I worked on relative search was around 2003. I just need to see how it works again so I can implement it in the C# control.

vector-man commented 5 years ago

Here's an existing tool: https://www.romhacking.net/utilities/513/

It's used to figure out the encoding of text in files (e.g. ROMs)

Here's how the person I requested this for described it:

"1- Open game and play until the text you want to figure the encoding for shows up. Say it has the word "World". Make mental note of that.

2- Open relative search tool. I recommend monkeymoore from rhdn. Search for that word "World".

3- It gives a few proposed encodings. Check the one that autocompletes more ingame text you saw in the game, that's the correct one.

4- Save table file. It should have just the A-Z a-z set, I think. (that applies to Japanese, of course, and characters like space and numerals) in that case, you're supposed to look up the font graphics and write the set manually, and feed that to monkeymoore's "define character set" option."

So, it's basically a tool to help figure out an encoding by searching with a string of text. I have not personally done this since around 2003 myself, so I'm sorry I can't personally add anything else to that. They wanted it added to your editor.
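To illustrate step 4 of the quoted workflow, here is a minimal sketch of how a lowercase table could be derived from a single match. It rests on two assumptions that are not stated in this thread: that the ROM stores a-z contiguously and in alphabetical order, and that the table file uses the common `HH=c` line format (hex byte value, equals sign, character). `TableSketch` and `BuildLowercaseTable` are hypothetical names, not part of WpfHexEditorControl or MonkeyMoore.

```csharp
using System;
using System.Collections.Generic;

public static class TableSketch
{
    // Builds "HH=c" lines for a-z from one known pairing of a ROM byte
    // and the lowercase letter it is believed to encode.
    public static List<string> BuildLowercaseTable(byte foundByte, char knownLetter)
    {
        // Byte value that would encode 'a' under the contiguous-alphabet assumption.
        int baseValue = foundByte - (knownLetter - 'a');

        var lines = new List<string>();
        for (int k = 0; k < 26; k++)
        {
            int value = baseValue + k;
            if (value < 0 || value > 0xFF) continue; // skip letters that do not fit in one byte
            lines.Add($"{value:X2}={(char)('a' + k)}");
        }
        return lines;
    }
}
```

For example, if the search finds that the byte `0x36` stands for `w`, calling `TableSketch.BuildLowercaseTable(0x36, 'w')` yields 26 lines running from `20=a` to `39=z`. Uppercase letters, spaces, and digits would still have to be added by hand, as the quoted description says.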

abbaye commented 5 years ago

Thank you

abbaye commented 5 years ago

I think I can use the TBL loaded in the editor to perform a relative search.

htdag commented 3 years ago

Relative Search, string search without knowing the charset in advance. The Theory of Relative Searching: https://www.romhacking.net/documents/742/
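As a rough illustration of the idea, a relative search can compare the differences between neighbouring characters instead of the characters themselves, so the actual byte value assigned to each letter does not matter. The sketch below is not taken from the linked document or from this control; `RelativeSearchSketch` and `FindRelative` are hypothetical names, and it assumes a one-byte-per-character encoding in which the letters of the search text keep their alphabetical spacing.

```csharp
using System;
using System.Linq;

public static class RelativeSearchSketch
{
    // Returns the offset of the first position in data whose byte-to-byte
    // differences equal the character-to-character differences of text,
    // or -1 if there is no such position.
    public static int FindRelative(byte[] data, string text)
    {
        // At least two characters are needed to form a difference.
        if (text.Length < 2 || data.Length < text.Length) return -1;

        // Differences between consecutive characters, e.g. "abd" -> [1, 2].
        int[] deltas = Enumerable.Range(1, text.Length - 1)
                                 .Select(k => text[k] - text[k - 1])
                                 .ToArray();

        for (int i = 0; i + text.Length <= data.Length; i++)
        {
            bool match = true;
            for (int k = 0; k < deltas.Length && match; k++)
            {
                // Bytes are widened to int, so negative differences compare correctly.
                match = data[i + k + 1] - data[i + k] == deltas[k];
            }
            if (match) return i;
        }
        return -1;
    }
}
```

With `data = { 0x00, 0x36, 0x2E, 0x31, 0x2B, 0x23, 0xFF }`, where "world" is stored with `a` at `0x20`, `FindRelative(data, "world")` returns 1. The sketch does not handle encodings that wrap around past `0xFF`, and mixed-case text only matches if the ROM keeps the same upper/lowercase spacing as ASCII.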

abbaye commented 3 years ago

Thank you @htdag

ulissesemuman commented 4 months ago

I'm creating a visual tool for translating ROMs and I will attach its Hex Editor to my project. I have already implemented Relative Search in my tool. I will make my tool public on GitHub soon.

Feel free to use or adapt my code for yourself.

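    // Requires: using System; using System.Linq; using System.Text;
    // binaryFile (the ROM contents, used here as a byte array) and headerSize
    // are members of the surrounding class.
    public int RelativeSearch(string textToFind)
    {
        int romLength = binaryFile.Length;
        int textToFindLength = textToFind.Length;
        int chunkSize = 0x10000;
        int index = -1;

        // Express each letter as its offset from 'a' (a = 0, b = 1, ...).
        byte[] binaryText = Encoding.UTF8.GetBytes(textToFind)
                                    .Select(x => (byte)(x - (byte)'a'))
                                    .ToArray();

        // Each pass tries the next candidate byte value for 'a'.
        for (int i = 0; i < 255 - (byte)'a'; i++)
        {
            // Consecutive chunks overlap by textToFindLength bytes so that
            // a match spanning a chunk boundary is not missed.
            for (int j = headerSize; j < romLength; j += chunkSize - textToFindLength)
            {
                byte[] binaryFileChunk = binaryFile.Skip(j).Take(chunkSize).ToArray();

                index = IndexOfArray(binaryFileChunk, binaryText);

                // Offset 0 within a chunk is a valid match, so test for >= 0.
                if (index >= 0)
                {
                    index += j;
                    break;
                }
            }

            if (index >= 0) break;

            // Shift the whole pattern up by one for the next candidate.
            binaryText = binaryText.Select(x => (byte)(x + 1)).ToArray();
        }

        // Absolute offset of the first match, or -1 if nothing was found.
        return index;
    }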

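    public static int IndexOfArray<T>(T[] source, T[] search)
    {
        // The last valid start position is source.Length - search.Length, hence the + 1.
        // Math.Max avoids a negative count when search is longer than source.
        int candidates = Math.Max(0, source.Length - search.Length + 1);

        var result = Enumerable.Range(0, candidates)
                               .Select(i => new
                               {
                                   Index = i,
                                   Found = source.Skip(i)
                                      .Take(search.Length)
                                      .SequenceEqual(search)
                               })
                               .FirstOrDefault(e => e.Found);
        return result == null ? -1 : result.Index;
    }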

I only use lowercase letters, because the distance between lowercase and uppercase letters in the ROM table may not be the same as the distance between them in the ASCII table. I am assuming that whoever is translating does not yet have a "relative table". I will implement improvements to allow the use of "relative tables". In my view, a relative table is created from the image of the character sprites.