merces / libpe

The PE library used by @merces/pev
http://pev.sf.net
GNU Lesser General Public License v3.0
115 stars 40 forks source link

Extending libpe to support hashes #14

Closed boddumanohar closed 7 years ago

boddumanohar commented 7 years ago

PS: This is my first pull request on C language. So Please be gentle ;-)

boddumanohar commented 7 years ago

partially solves issue #13

boddumanohar commented 7 years ago

Installation instructions:

$ make
$ sudo make install
$ gcc exe-check.c -o exe-check -std=c99 -L. -lpe -lssl -lcrypto -lm
jweyrich commented 7 years ago

@boddumanohar, thank you for your contribution!

We agree that most of the intelligence that's present in pev should be moved to libpe so others can benefit more easily, but that is a task that we haven't had the time to plan+execute, yet. IMO, the toughest part is the API design, but that's doable given the time and efforts.

I have a couple of suggestions for your PR(s):

  1. Those fuzzy.[ch] files are from a 3rd party library that pev depends on to calculate ssdeep hashes. Being so, we try not to change them unless we have fixes for their code. If we come across a bug or fix something there, we'd love to report/contribute back to its official repository (http://ssdeep.sourceforge.net). Likewise, if there's a fix in the official repository, we can simply pull +merge it without worrying about local conflicts. My suggestion then is to create a new source file and move all your code there - maybe extended_api.h and extended_api.c.

  2. Some functions you introduced do not return any value(s) (example, https://github.com/merces/libpe/pull/14/commits/2b48383c388a532520fd9ace449494ad5174dbb9#diff-5aea84a29d889dc6ab538465fc5814f9R698). I believe that these functions can do better than just printing to stdout (e.g.: via printf). They can return the information in a structured way so people's programs can use the information the way they see fit. The solution I see here is to define public struct types to be returned. This task will require a fair amount of thinking and work because a library cannot (or should not) break APIs too often - luckily we haven't reached 1.0 yet :-) Still, that's the main point that kept us from doing it until now - but we are open to receive contributions and hopefully your contribution will help us define a path and patterns to follow.

Sorry for showing up here so late - I was covering the vacation period of a coworker and couldn't find the time to review and comment on your contribution. We hope that your GSoC succeeds and that pev/libpe can help the project you're working on.

Again, thank you for contributing!

boddumanohar commented 7 years ago

I've made some changes. Currently, the most important problem that we are seeing is "how we should return an error?" and in fact, this the most important block that we need to cross.

My final submissions are getting near. So can you please make quick comments on the naming convention of headers file or any thing related to names. So that I can finalize my code in Golang and next we can do under the hood optimizations.

boddumanohar commented 7 years ago

Hi, I am just back at the University after a great summer :). From our discussions, I've modified the structs so that they return an error (also type of error) and also I've implemented a method to free the allocated memory allocated.

Can you please have a look the above 2 latest commits and check if we are in right direction.

boddumanohar commented 7 years ago

The code compiles in when I ran on ubuntu server 16.04. I used Valgrind to find memory leaks. As for exports there are no memory leaks :). But in PERES I see a memory leak in discoveryNodesPeres at malloc But we already have a function to free all the nodes. Even then why does it shows memory leaks? In the afternoon, I was working to find the reason for this behavior but could not get anything :(. How do we solve this problem? (this is my first time check time checking for memory leaks.)

Now I am working adding err to imports and hashes. Also, working on creating delloc function for both.

Let's fight until we have 0 memory leaks :D

boddumanohar commented 7 years ago

Currently, I've implemented pe_err_e for all the headers and also added dealloc functions. We have memory leaks in discoveryNodesPeres and crypto library (i.e in hashes.c)

Also can you please suggest me the tools that I should run this code against to find bugs related to memory and CPU.

Can you please add more comments to the code.

boddumanohar commented 7 years ago

Should we also add other stuff like Header guards? Also, any other things that we need to put into header files to polish the code.

jweyrich commented 7 years ago

Should we also add other stuff like Header guards?

Definitely.

Also, any other things that we need to put into header files to polish the code.

  1. Make sure indentation is uniform (we use tabs);
  2. Put spaces before and after = signs
  3. We normally put } else { all in the same line.

Well, there's not much to it. I believe we could consider adopting a linter (and have its config versioned) to alert about inconsistent code formatting.

One thing that I plan to improve later is the header dependency tree. Other headers shouldn't need to include pe.h IMO. It should be the other way around - but don't worry about it right now.

jweyrich commented 7 years ago

Oh, remembered another important thing: we should prefix all public structs and functions (those in the headers) using pe_. For example, get_tls_callback should be named pe_get_tls_callback, hash_t becomes pe_hash_t, and so forth. I only noticed this today. My bad.

boddumanohar commented 7 years ago

Okay fixed the naming conventions according to our secret conventions :) Also added converted to spaces to tabs, and indented everything properly.

I’ve analyzed the code with clang static code analyzer. I fix some of the bugs. But I could not fix some of the bugs

1.

hashes.c:504:3: warning: Access to field 'next' results in a dereference of a null pointer (loaded from variable 'head')

2.

peres.c:206:8: warning: Assigned value is garbage or undefined node = node->nextNode;
                                                                    ^ ~~~~~~~~~~~~~~

3.

hashes.c:429:26: warning: Function call argument is an uninitialized value
for (unsigned i=0; i < strlen(el->function_name); i++)
                       ^~~~~~~~~~~~~~~~~~~~~~~~~

Should we actually care about these warnings?

Please note that August 21 is the last date for submission of my GSoC final work product. My mentor has accepted to show this pull request as one the product that I contributed to: throughout the summer. I am going to show: LIBPE contribution + contributions that I made at HolmesProcessing.

I completed my GSoC assignments and waiting for LIBPE to get merged.

So, nothing to hurry. we still have a lot of time before Aug 21. I would like to test vigorously LIBPE against various bugs that we commonly have. I would like to test it with various tools. Do you have any names of the tools that you want me to test the code on? Just names are enough ;).

boddumanohar commented 7 years ago

As a proof of Concept, I've implemented everything that I coded until now in a C file. The file can be found at https://github.com/boddumanohar/poc-libpe/blob/master/libpe.c

Hope this helps testing library faster.

boddumanohar commented 7 years ago

I've implemented everything that you have told me. so I will be waiting for your side to test it and merge it if everything is fine or let me know if there are any errors.

Thanks

jweyrich commented 7 years ago

Thanks! I managed to talk to @merces and we're going to merge your PR into master. This will break the pev build, but I already fixed it in my local repository.

jweyrich commented 7 years ago

I also took the liberty to updated your demo to compile with some changes I made after the merge. Feel free to copy it and commit to your demo repository if you wish.

webstergd commented 7 years ago

Hey guys! Thank you for your amazing support during GSoC. Boddumanohar passed with flying colors and we are really proud of the work he did. Your contributions were critical to his success...so seriously thank you.

If you are curious on what he did: https://www.holmesprocessing.com/gsoc/#portfolioModal5