SpamExperts / pyzor

Pyzor is a Python implementation of a spam-blocking networked system that uses spam signatures to identify spam.
GNU General Public License v2.0

Local whitelist file with pyzor hash not working #73

Open Schroeffu opened 6 years ago

Schroeffu commented 6 years ago

Version information

1.0.0

Steps to replicate

  1. Enable Pyzor in the SpamAssassin config (in my case, used with MailScanner):
# Enable PYZOR
ifplugin Mail::SpamAssassin::Plugin::Pyzor
        pyzor_path /usr/bin/pyzor
        pyzor_options --homedir /etc/MailScanner/pyzor/
endif
  2. Add the file /etc/MailScanner/pyzor/config with the default content from https://github.com/SpamExperts/pyzor/blob/master/config/config.sample

  3. Edit line 22, which sets the local whitelist file, like this:

## The `LocaWhitelist` file containing skipped digests.
LocalWhitelist = /etc/MailScanner/pyzor/pyzor.whitelist
  4. Put the hash da39a3ee5e6b4b0d3255bfef95601890afd80709 in this file as its only line.

  5. Restart SpamAssassin (or MailScanner, which restarts SpamAssassin anyway).

  6. Test Pyzor with echo "test" | spamassassin -D pyzor 2>&1 | less. It fails with an error like this:

Oct 15 15:41:26.794 [53608] dbg: pyzor: network tests on, attempting Pyzor
Oct 15 15:41:28.297 [53608] dbg: pyzor: pyzor is available: /usr/bin/pyzor
Oct 15 15:41:28.298 [53608] dbg: pyzor: opening pipe: /usr/bin/pyzor --homedir /etc/MailScanner/pyzor/ check < /tmp/.spamassassin53608X6CaQttmp
Oct 15 15:41:28.417 [53608] dbg: pyzor: [53613] finished: exit 1
Oct 15 15:41:28.418 [53608] dbg: pyzor: got response: Traceback (most recent call last):\n File "/usr/bin/pyzor", line 408, in <module>\n main()\n File "/usr/bin/pyzor", line 152, in main\n if not dispatch(client, servers, config):\n File "/usr/bin/pyzor", line 239, in check\n send_digest(digested, mock_runner, servers)\n File "/usr/bin/pyzor", line 262, in send_digest\n _send_digest(runner, servers[0], digested)\n File "/usr/bin/pyzor", line 253, in _send_digest\n runner.run(server, (digested, server))\n File "/usr/lib/python3/dist-packages/pyzor/client.py", line 258, in run\n response = self.routine(*args, **kwargs)\n File "/usr/lib/python3/dist-packages/pyzor/client.py", line 122, in _mock_check\n pyzor.proto_version))\nTypeError: %b requires a bytes-like object, or an object that implements __bytes__, not 'int'

Actual result

Same debug output as in the last step above: pyzor check exits with status 1 and the traceback ends in TypeError: %b requires a bytes-like object, or an object that implements __bytes__, not 'int'

Expected result

The hash da39a3ee5e6b4b0d3255bfef95601890afd80709 should be whitelisted. It represents emails that have no body content, only attachments and a subject (bills, PDFs from suppliers, etc.), and these cause a huge number of false positives.
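For context, da39a3ee5e6b4b0d3255bfef95601890afd80709 is the SHA-1 digest of empty input, which fits the observation that every message without body text ends up with this same digest. A minimal check in Python, assuming only that such messages leave Pyzor with nothing to hash (this is inferred from the digest value, not verified against Pyzor's source):

```python
import hashlib

# The digest from the whitelist file in this report.
EMPTY_DIGEST = "da39a3ee5e6b4b0d3255bfef95601890afd80709"

# SHA-1 over zero bytes of input.
empty_sha1 = hashlib.sha1(b"").hexdigest()

print(empty_sha1 == EMPTY_DIGEST)
```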

Other notes

What am I doing wrong? Shouldn't the whitelist file simply contain one hash per line?

When I add another number at the end of the line, like this: da39a3ee5e6b4b0d3255bfef95601890afd80709 0, Pyzor works again but still does not whitelist this hash.

Schroeffu commented 5 years ago

Would really love a fix :-)

Currently, I can only assign 0.01 points to PYZOR, because empty mails (subject/attachment only, no text) create a lot of false positives. :-(

xrat commented 3 years ago

I'd like to clarify that this bug affects all whitelisted messages, not just "empty" ones with digest da39a3ee5e6b4b0d3255bfef95601890afd80709.

I am currently testing the following changes:

--- /usr/lib/python3/dist-packages/pyzor/client.py  2017-09-05 16:41:23.000000000 +0200
+++ client.73.py    2021-07-13 18:41:21.448102172 +0200
@@ -119,6 +119,6 @@
     def _mock_check(self, digests, address=None):
-        msg = (b"Code: %s\nDiag: OK\nPV: %s\nThread: 1024\nCount: 0\n"
-               b"WL-Count: 0" % (pyzor.message.Response.ok_code,
+        msg = ("Code: %s\nDiag: OK\nPV: %s\nThread: 1024\nCount: 0\n"
+               "WL-Count: 0" % (pyzor.message.Response.ok_code,
                                  pyzor.proto_version))
-        return email.message_from_bytes(msg, _class=pyzor.message.Response)
+        return email.message_from_bytes(msg.encode(), _class=pyzor.message.Response)

I.e., remove the two b prefixes from the string literals in msg = and change msg to msg.encode() in the return statement.

Disclaimer: I do not know Python. The above works for me for now.
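The failure and the workaround can be reproduced outside Pyzor. On Python 3, %s inside a bytes literal only accepts bytes-like values, so formatting an int raises the TypeError seen in the traceback. Formatting a str and encoding it afterwards avoids that. The sketch below is standalone: OK_CODE and PROTO_VERSION are hypothetical stand-ins for pyzor.message.Response.ok_code and pyzor.proto_version, their values are assumed, and the plain email.message_from_bytes call omits the _class argument that the real code passes.

```python
import email

# Hypothetical stand-ins; the real values live in the pyzor package.
OK_CODE = 200
PROTO_VERSION = 2.1


def mock_response_bytes_template():
    # Mirrors the original code: %s in a bytes template with non-bytes values.
    template = (b"Code: %s\nDiag: OK\nPV: %s\nThread: 1024\nCount: 0\n"
                b"WL-Count: 0")
    return template % (OK_CODE, PROTO_VERSION)


def mock_response_str_template():
    # Mirrors the patch: format as str, then encode to bytes once.
    template = ("Code: %s\nDiag: OK\nPV: %s\nThread: 1024\nCount: 0\n"
                "WL-Count: 0")
    return (template % (OK_CODE, PROTO_VERSION)).encode()


try:
    mock_response_bytes_template()
    bytes_error = None
except TypeError as exc:
    bytes_error = str(exc)

# The patched variant parses into a message with the expected headers.
parsed = email.message_from_bytes(mock_response_str_template())

print(bytes_error)
print(parsed["Code"], parsed["WL-Count"])
```

Either way the bytes passed to email.message_from_bytes are the same, so the parsing side is unaffected; only the formatting step changes.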

mnalis commented 2 years ago

Thanks, it works great!

% echo | pyzor check
public.pyzor.org:24441  (200, 'OK')     24308494        243371
% echo | pyzor local_whitelist
% echo | pyzor check
public.pyzor.org:24441  (200, 'OK')     0       0

Could somebody please make a release that includes this fix?