ntop / nDPI

Open Source Deep Packet Inspection Software Toolkit
http://www.ntop.org
GNU Lesser General Public License v3.0
3.86k stars 902 forks source link

Optimize performance of ndpi_strnstr() and possible bugfix #2494

Closed pasabanov closed 4 months ago

pasabanov commented 4 months ago

Please sign (check) the below before submitting the Pull Request:

2433 #2439 #2447

Describe changes:

I made some minor changes to the current optimized substring search.

The main changes are:

  1. Removed len == 0 check from the first if statement. Now the function doesn't immediately return NULL if len is zero.
    I believe that the previous behavior was a bug, because if needle_len is zero, the function should return haystack, because the empty substring does reside at the start of the empty string.
  2. Changed the order of hs_real_len computation and if (needle_len == 0) check.
  3. Removed the if (needle_len == 1) check because it only provides a performance benefit in specific cases. If you need to frequently search for single-character substrings, you might want to consider reverting this change.
  4. Replaced variable haystack_end with variable end_of_search (the byte after the last interesting byte).
    Using end_of_search inside the memchr function instead of haystack_end eliminates the need for the current + needle_len <= haystack_end check, as it is always positive if current is not NULL.

I also added the new function to the performance tests. In practice ndpi_strnstr_opt2 is slightly faster than ndpi_strnstr_opt in most runs, though sometimes ndpi_strnstr_opt appears to be faster.

pasabanov commented 4 months ago

Actually,

  if (needle_len > hs_real_len)
  {
    return NULL;
  }

can be removed too because, in that case, the loop won't perform any iterations and the function will return NULL anyways.

But I'm nut sure if this will increase or decrease performance in your case.

IvanNardi commented 4 months ago
1. Removed `len == 0` check from the first `if` statement. Now the function doesn't immediately return `NULL` if `len` is zero.
   I believe that the previous behavior was a bug, because if `needle_len` is zero, the function should return `haystack`, because the  empty substring does reside at the start of the empty string.

Could you add a unit test about that case in strnstrUnitTest(), please? @pasabanov, could you rebase?

pasabanov commented 4 months ago

@IvanNardi, I've added unit test cases for zero length in strnstrUnitTest() and rebased.

pasabanov commented 4 months ago

@IvanNardi, what do you think about removing the checks if (needle_len == 1), which I've already removed, and if (needle_len > hs_real_len), which I suggest removing? They're part of the general case, so if you don't search for one-character needles or needles that are longer than haystacks often, then there might be no point in optimizing these cases.

0xA50C1A1 commented 4 months ago

@IvanNardi, what do you think about removing the checks if (needle_len == 1), which I've already removed, and if (needle_len > hs_real_len), which I suggest removing? They're part of the general case, so if you don't search for one-character needles or needles that are longer than haystacks often, then there might be no point in optimizing these cases.

Actually there are some cases where ndpi_strnstr is used to search for a single character, that's why I added this optimization. Well, thanks anyway, great work!

mgcp.c

    endpoint = ndpi_strnstr((char const *)packet->payload + 5, " ", packet->payload_packet_len - 5);
    if (endpoint == NULL)
    {
      break;
    }
    endpoint++;

    mgcp = ndpi_strnstr(endpoint, " ", packet->payload_packet_len - ((u_int8_t const *)endpoint - packet->payload));
    if (mgcp == NULL)
    {
      break;
    }
    mgcp++;

ethereum.c

    } else if(ndpi_strnstr((const char *)packet->payload, "{", packet->payload_packet_len)

xiaomi.c

        ptr = ndpi_strnstr((const char *)&payload[offset], ":", len);

tivoconnect.c

 for (newline = ndpi_strnstr(payload, "\n", payload_len);
       newline != NULL;
       key = ++newline,
       newline = ndpi_strnstr(newline, "\n", payload_len - (newline - payload)))
  {
pasabanov commented 4 months ago

Actually there are some cases where ndpi_strnstr is used to search for a single character, that's why I added this optimization. Well, thanks anyway, great work!

Thanks!

In that case, I think I should add that check back.

What do you think about the second check if (needle_len > hs_real_len)? Is it needed too?
It saves a fairly small number of instructions when the needle is longer than the haystack.

0xA50C1A1 commented 4 months ago

Actually there are some cases where ndpi_strnstr is used to search for a single character, that's why I added this optimization. Well, thanks anyway, great work!

Thanks!

In that case, I think I should add that check back.

What do you think about the second check if (needle_len > hs_real_len)? Is it needed too? It saves a fairly small number of instructions when the needle is longer than the haystack.

I dunno, I guess you could keep it. Usually the compiler manages skipping redundant checks.

pasabanov commented 4 months ago

I dunno, I guess you could keep it. Usually the compiler manages skipping redundant checks.

Ok. I've returned the removed check if (needle_len == 1), but I switched its order with the if (needle_len > hs_real_len) check because I think that would be more efficient.

By the way, the compiler doesn't remove the check if (needle_len > hs_real_len) at any level of optimization. I checked that with latest versions of gcc and clang on godbolt.org.

0xA50C1A1 commented 4 months ago

I dunno, I guess you could keep it. Usually the compiler manages skipping redundant checks.

Ok. I've returned the removed check if (needle_len == 1), but I switched its order with the if (needle_len > hs_real_len) check because I think that would be more efficient.

By the way, the compiler doesn't remove the check if (needle_len > hs_real_len) at any level of optimization. I checked that with latest versions of gcc and clang on godbolt.org.

Alright, it looks fine. You can take a look at ndpi_memmem if you want, because ndpi_strnstr is basically a copypaste of that function.

pasabanov commented 4 months ago

Alright, it looks fine. You can take a look at ndpi_memmem if you want, because ndpi_strnstr is basically a copypaste of that function.

I took a look at ndpi_memmem and saw that it has the same issues. I've created pull request #2499.

IvanNardi commented 4 months ago

@pasabanov, thank you for this contribution @0xA50C1A1, thanks for the review