hasherezade / pe-sieve

Scans a given process. Recognizes and dumps a variety of potentially malicious implants (replaced/injected PEs, shellcodes, hooks, in-memory patches).
https://hshrzd.wordpress.com/pe-sieve/
BSD 2-Clause "Simplified" License
2.97k stars 420 forks source link

Incremented buffer size for ignored files #99

Closed ladislav-zezula closed 2 years ago

ladislav-zezula commented 2 years ago

Hi, we've come to the point where the amount of files that we need to ignore has reached its limit. As a temporary fix, I incremented the size of the buffer to 2048, doubled from previous 1024.

It may be worth it to replace it with some dynamic sized variable, however.

ladislav-zezula commented 2 years ago

Please, hold on this pull request. When I have time, I'll update it once more to have dynamic-allocated buffer.

hasherezade commented 2 years ago

@ladislav-zezula - yes, you are right, dynamic buffer will be much better. I used the static one temporarily, because I thought the cases where it will exceed that limit are going to be rare enough. but it seems it reached to this point when I should implement it in a proper way. I am going to take care of this soon. but if you come up with the working solution before, of course you are welcome.

hasherezade commented 2 years ago

I can also add an option of supplying the ignore list as a file. If there are so many entries, I think it will come more handy.

hasherezade commented 2 years ago

@ladislav-zezula - check this: https://github.com/hasherezade/pe-sieve/tree/mignore_refact

added a new parameter /mignorel that allows to supply a file of unlimited length.

ladislav-zezula commented 2 years ago

but if you come up with the working solution before, of course you are welcome.

I'll do it and then you can add your intended /mignorel on top of that.

I can also add an option of supplying the ignore list as a file. If there are so many entries, I think it will come more handy.

That would indeed be cool. Maximum length of the command like in Windows is 0x7FFF characters, so it will eventually run out :-)

check this: https://github.com/hasherezade/pe-sieve/tree/mignore_refact

I don't see any recent commit there. I can use that branch as target for my merge request tho.

ladislav-zezula commented 2 years ago

Ok I turned both fixed-size buffers (output_dir and modules_ignored) into std::string.

I also fixed all MSVC warnings, except one, see below. MS Visual studio really doesn't like that one and spits like a page full of description about how is this bad. It may be worth a separate pull request, tho:

    virtual bool parse(const wchar_t *arg)
    {
        std::wstring value = arg;
        std::string str(value.begin(), value.end());         <<<=== Disliked by MSVC
        return parse(str.c_str());
    }

You may add your new parameter on top of this.

hasherezade commented 2 years ago

Ok I turned both fixed-size buffers (output_dir and modules_ignored) into std::string.

@ladislav-zezula - the point is, neither output_dir nor modules_ignored are allowed to be std::string. PE-sieve can be also compiled as a DLL, and used for the projects written in other languages. Some people use the PE-sieve DLL via Python. It needs to keep an interface that is compatibile with pure C. Everything that is in pe_sieve_types.h must contain simple types only, so only cstrings are accepted.

Those limitations are not present in HollowsHunter, that's why it do have parameters defined as strings.

ladislav-zezula commented 2 years ago

Oh, I see. I didn't know that. Please, tell me - are buffers allocated by new[] OK? I mean, will there always be destructor who will free the buffers?

hasherezade commented 2 years ago

@ladislav-zezula - no, we cannot rely on new and delete - they are from C++ and I want to keep it to pure C, because of the reasons that I mentioned. If we really have to use dynamic buffers, I would rather use a structure such as UNICODE_STRING - but it will require some more refactoring from my side. Also it will be a bigger change in the API, and this may upset some users. So I don't want to do it in a rush.

ladislav-zezula commented 2 years ago

Well it doesn't need to be new/delete, but malloc/free (or even HeapAlloc/HeapFree) would be perfectly fine. But if you want an UNICODE_STRING, that would be fine too, except that UNICODE_STRING has limit of 32767 characters.

hasherezade commented 2 years ago

Well it doesn't need to be new/delete, but malloc/free (or even HeapAlloc/HeapFree) would be perfectly fine.

I get it, you just want that allocation and deallocation of the string will be fully on the user. We can do this, but then it will require additional verification of the buffer. I am really not keen on passing the pointer to the string of unknown length, that's why I would rather go for wrapping it into some structure. I proposed UNICODE_STRING just because it is already defined and well-known, but it may as well be something custom, such as:

typedef struct _PARAM_STRING {
  size_t Length;
  char*  Buffer;
} PARAM_STRING;

I will experiment with it a bit, and propose some code changes today.

hasherezade commented 2 years ago

@ladislav-zezula - check this out: https://github.com/hasherezade/pe-sieve/tree/mignore_refact2 listing of the changes: https://github.com/hasherezade/pe-sieve/compare/mignore_refact2

ladislav-zezula commented 2 years ago

Looks good.

One more note - you may also wanna consider making the output directory UNICODE and also dynamic - in that case, potential length of path is up to 32767. However, currently you use ANSI strings, so unless you want to support Asian languages, it looks good to me.

hasherezade commented 2 years ago

@ladislav-zezula - ok, I merged my changes to the master. they are available in both pe-sieve and hollowshunter. In the future I am planning to change all the used strings to unicode, but it will require some refactoring, and for now I have other priorities. I think the feature is good to go, so I am closing this issue. Please let me know after checking it, if everything works as you expected.

ladislav-zezula commented 2 years ago

The feature looks good. Good job, thank you!