gnanet / mailzu

MailZu-ng compatible PHP 7.2+, PHP-PDO | Based on zedzedtop/mailzu and SF.net/projects/mailzu/ | MailZu is a simple and intuitive web interface to manage Amavisd-new quarantine. Users can view their own quarantine, release/delete messages or request the release of messages. This fork of MailZu is written in PHP and requires Amavisd-new version greater than 2.7.0
GNU General Public License v2.0
14 stars 7 forks source link

[Needs testing] - [PostgreSQL] bytea typed mail_id and recip_email presented as xNNN... hexa #7

Closed xndr closed 3 years ago

xndr commented 5 years ago

Hi,

In the Site Quarantine, the "mail_id" and "recip_email" variables are coming up as gibberish such as: x3955376a4c7a44702d7a6453, as well when clicking the subject, you can see in the URL the values for mail_id and recip_email are the same gibberish values.

If I manually enter the real mail_id & recipient email address as shown in the Amavis PG DB in the quarantine, the message shows up correctly. Same thing if, in the Site Quarantine, I search by Mail ID and enter the real Mail ID, it only returns the expected message in the search results, although it still displays the wrong/gibberish values in the To and Mail ID columns.

This results in the error message "Message Unavailable" being shown.

using PHP 7.2.24-0ubuntu0.18.04.1 on Ubuntu 18.04, and using the included mimeDecode.php from this repo.

gnanet commented 5 years ago

For me,this seems to be a blob/binary field casted wrongly into a HEXA representation, in short: the problem lies somwhere around the SQL query and the SQL-Server.

Your sample data starts with an "x" what quickly led me to try cyberChef to "unhex" your data:

https://gchq.github.io/CyberChef/#recipe=From_Hex('Auto')&input=eDM5NTUzNzZhNGM3YTQ0NzAyZDdhNjQ1Mwmy suggestion is 

or click here to see the cyberChef recipe that returns a mail_id looking string 9U7jLzDp-zdS

Can you check what collation and field-types are in use where you experience X-values ?

gnanet commented 5 years ago

One thing, that you should check what you have in your config.php set for binquar.

I use this:

$conf['db']['binquar'] = true;
xndr commented 5 years ago

For the hex string, you are correct, that decodes to the specific mail_id I was testing with.

However, switching $conf['db']['binquar'] to true didn't seem to have any effect on the situation.

xndr commented 5 years ago

Can you check what collation and field-types are in use where you experience X-values ?

I am not sure about the collation, but I can include a pg_dump of the schema if that helps? The database was created using the instructions in README.sql-pg that came along with this version of Amavisd-new (2.11.1).

-- Dumped by pg_dump version 10.10 (Ubuntu 10.10-0ubuntu0.18.04.1)

SET statement_timeout = 0; SET lock_timeout = 0; SET idle_in_transaction_session_timeout = 0; SET client_encoding = 'UTF8'; SET standard_conforming_strings = on; SELECT pg_catalog.set_config('search_path', '', false); SET check_function_bodies = false; SET xmloption = content; SET client_min_messages = warning; SET row_security = off;

SET default_tablespace = '';

SET default_with_oids = false;

-- -- Name: quarantine; Type: TABLE; Schema: public; Owner: postgres

CREATE TABLE public.quarantine ( partition_tag integer DEFAULT 0 NOT NULL, mail_id bytea NOT NULL, chunk_ind integer NOT NULL, mail_text bytea NOT NULL, CONSTRAINT quarantine_chunk_ind_check CHECK ((chunk_ind >= 0)) );

-- Name: quarantine quarantine_pkey; Type: CONSTRAINT; Schema: public;

ALTER TABLE ONLY public.quarantine ADD CONSTRAINT quarantine_pkey PRIMARY KEY (partition_tag, mail_id, chunk_ind);

--

xndr commented 5 years ago
Screen Shot 2019-11-21 at 8 23 03 AM
gnanet commented 5 years ago

First of all, i have no Postgres running, honestly i dont use it, but in general the queries should behave similar.

AND yes, there it is:

mail_id bytea NOT NULL,
mail_text bytea NOT NULL,

Every field that you would expect to return some kind of text-content, but was specified with bytea holds the binary representation of your textcontent.

Found some relevant info here: http://oguridbtech.blogspot.com/2017/12/convert-bytea-to-text-in-postgresql.html

An example query that should return the value as text looks like below:

select encode(mail_id,'escape')  from  public.quarantine;
select encode(mail_text,'escape')  from  public.quarantine;

So for the solution, you could look for database queries in the source,and change the postgresql related queries to reflect the hex to human-readable way those fields have to be "select"-ed.

Also, if you have done that,i would be happy to merge a PR. In case you are not sure, leave thisissue open,and i take care of the changes in the sourcecode

xndr commented 5 years ago

I appreciate your help! Looking at the code in DBEngine.class.php, in get_raw_mail() for example, it seems like it is already doing a "select encode(mail_id,'escape')" type of query if $conf['db']['binquar'] = true; for any DB type other than mysql or mysqli.

Slowly going through the code to see if perhaps a function doesn't do this check and just gets the raw output from the DB.

I also did a test from the CLI in psql and if I just select mail_id,mail_text from quarantine, it outputs the same time of x545465465 hex values, but if I do a select(mail_id, 'escape') from quarantine; it shows the output in human readable format.

*** correction to previous statement, I see now it's only mail_text that's being decoded, not the mail_id and recip_to. I know what needs to be done to correct this, not sure if I will have time to do this today, but I will let you know when I'm done.

gnanet commented 5 years ago

Following a second search, specific to Pgsql, I looked into amavis sql-pg readme

If a data type 'bytea' is chosen for these three fields, the setting $sql_allow_8bit_address MUST be set to true to let the amavisd program use the appropriate data type in SQL commands: $sql_allow_8bit_address = 1; # maddr.email: VARCHAR (0), VARBINARY/BYTEA (1)

I think the changes need to be similar to the already considered binquar for mail_text

$mail_text_column = " encode(mail_text,'escape') AS mail_text";

selecting these

should be done either with

encode(THEFIELDNAME,'escape') AS THEFIELDNAME

or

convert_from(THEFIELDNAME, 'UTF8') AS THEFIELDNAME
xndr commented 5 years ago

My Amavis config file contains:

$sql_allow_8bit_address = 1;

Thank you for the clarification as well.

xndr commented 5 years ago

I got MailZu to work correctly. I will test it further as the day goes by to see if there's other areas that need to have corrected queries, but so far the user and site quarantine summary shows up correctly, allowing me to click on the subject and get see the email.

I only edited/harcoded the existing queries where they were. I haven't made yet the same validations for dbtype = mysql/mysqli vs pg, but I will, time permitting.

For reference, in DBEngine.class.php, in function get_user_messages(), I changed the SQL query as below:

    $query = "SELECT
                    msgs.time_num,
                    msgs.from_addr,
                    encode(msgs.mail_id,'escape') AS mail_id,
                    msgs.subject,
                    msgs.spam_level,
                    msgs.content,
                    msgrcpt.rs,
                    msgs.quar_type,
                    encode(recip.email,'escape') AS email
                    FROM msgs
                    INNER JOIN msgrcpt              ON msgs.mail_id = msgrcpt.mail_id
                    $join_type maddr AS sender      ON msgs.sid = sender.id
                    $join_type maddr AS recip       ON msgrcpt.rid = recip.id
                    WHERE $type_clause"
        // Only check against the email address when not admin
        . ($msgs_all ? ' ' : $emailaddr_clause)
        . " $rs_clause
                    $search_clause
                    AND msgs.quar_type <> ''
                    ORDER BY $order $vert ";
gnanet commented 5 years ago

To enable configurability,and error-free usage for both mysql and pgsql, i will make changes that try to be aware of the DB used, and maybe i'll try to prevent further bytea issues by examining the tables,and proactively switch queries to encode() where i find bytea fields in show create table

gnanet commented 5 years ago

@xndr could you run this SQL Query on your PostgreSQL server, and paste the result back here?

IMPORTANT: replace <the amavis/mailzu database name> with the database name of your amavis setup

USE information_schema;
SELECT TABLE_SCHEMA,TABLE_NAME,COLUMN_NAME,DATA_TYPE FROM `COLUMNS` WHERE (DATA_TYPE LIKE '%binary%' OR DATA_TYPE LIKE '%byte%' OR DATA_TYPE LIKE '%blob%')
AND TABLE_SCHEMA = '<the amavis/mailzu database name>';

If i am right, then i will use this general SQL query to identify fields that need attention.

gnanet commented 5 years ago

I have done this on my system (amavisd-new-2.7.1 ; MySQL 5.6.39)

TABLE_NAME        COLUMN_NAME        DATA_TYPE
quarantine        mail_text          blob
maddr             email              varbinary
mailaddr          email              varbinary
users             email              varbinary
msgrcpt           mail_id            varbinary
msgs              mail_id            varbinary
quarantine        mail_id            varbinary
msgs              quar_loc           varbinary
msgs              secret_id          varbinary
gnanet commented 5 years ago

I just realize, that everytime i looked at the database, i used phpmyadmin, and it was configured to show varbinary as human readable string.

I searched the code of PMA, and found 2 ways to accomplish the conversion within PHP, without changing the SQL query:

<?php
echo "the human-readable mail_id: 08YyXJIvnj2I\n";
echo "to allow working with the binary content, i did a TO_BASE64() in mysql:\n";
echo "SELECT TO_BASE64(CAST(0x30385979584a49766e6a3249 AS BINARY));\n";
echo "the resulting string is: MDhZeVhKSXZuajJJ\n";
echo "every occurence of OUR_BINARY_CONTENT is produced by base64_decode('MDhZeVhKSXZuajJJ')\n";
echo "\n";
echo "NOTE: the CAST-ed hex value above was generated by prepending  \"0x\" to bin2hex(OUR_BINARY_CONTENT):\n";
echo '0x' . bin2hex(base64_decode('MDhZeVhKSXZuajJJ'));
echo "\n";
echo "\n";
echo "\n";
echo "Now let us see the solutions:\n";
echo "\n";
echo "Solution #1\n";
echo "pack(\"H*\",bin2hex(OUR_BINARY_CONTENT))\n";
echo pack("H*", bin2hex(base64_decode('MDhZeVhKSXZuajJJ')));
echo "\n";
echo "\n";
echo "Solution #2\n";
echo "surprisingly simply calling htmlspecialchars(OUR_BINARY_CONTENT) works:\n";
echo htmlspecialchars(base64_decode('MDhZeVhKSXZuajJJ'));
echo "\n";
echo "\n";

And executing the above script:

$ php a-binary2string.php 
the human-readable mail_id: 08YyXJIvnj2I
to allow working with the binary content, i did a TO_BASE64() in mysql:
SELECT TO_BASE64(CAST(0x30385979584a49766e6a3249 AS BINARY));
the resulting string is: MDhZeVhKSXZuajJJ
every occurence of OUR_BINARY_CONTENT is produced by base64_decode('MDhZeVhKSXZuajJJ')

NOTE: the CAST-ed hex value above was generated by prepending  "0x" to bin2hex(OUR_BINARY_CONTENT):
0x30385979584a49766e6a3249

Now let us see the solutions:

Solution #1
pack("H*",bin2hex(OUR_BINARY_CONTENT))
08YyXJIvnj2I

Solution #2
surprisingly simply calling htmlspecialchars(OUR_BINARY_CONTENT) works:
08YyXJIvnj2I
xndr commented 5 years ago

That query you wanted me to run on Postgres didn't work. It failed at the "USE information_schema;" command.

I don't have remote access to this system, but if you need me to do further testing, I can look into it on Monday.

gnanet commented 5 years ago

The motivation behind my question is to have an example output of amavis on pgsql to see how the binary columns look. But i can quickliy spin up a pgsql instance at DigitalOcean, and try the query there.

The use information_schema is where i am not sure how it should be on pgsql, i only hoped, that it is like on mysql.

But after my recent development, i could simply check for the value format, and could create a dynamic workaround purely in php

xndr commented 4 years ago

Hi Gergely, I found another instance where it seems MailZu appears to get the raw value as x12345abcd format rather than the human readable form.

When releasing an email from the quarantine, MailZu fails with a blank error code (no error text in the "Status" column), however, I see this in the amavis.log:

(!!)policy_server FAILED: Secret_id x676a70374d6c556b4d757175 does not match mail_id W42k0xVkLdkn at (eval 106) line 308.

That x676a.... value decodes to the correct secret_id when using CyberChef, gjp7MlUkMuqu.

I haven't dug into the code yet as I'm working on a different application right now, but I have a feeling MailZu is getting the value in the wrong format from the Amavis "msgs" table, and then passing on that wrong variable to Amavis via AM.PDP when trying to release the message.

gnanet commented 4 years ago

Thank you, if it is possible, i would create a changed set of files, that try to solve such issue by detecting the unwanted hex response. If it is possible, i would ask you to try them on that instance, also the exact form of the hex representation would be useful, as i see the amavis release error log is one such good example

xndr commented 4 years ago

For reference, to fix the quarantine release issue between MailZu and Amavis 2.11 w/ a Postgres 10 DB, I modified lib/DBEngine.class.php, under the function get_message($emailaddress, $mail_id) the $query statement at line 534/535:

$query = 'SELECT msgs.time_num, encode(msgs.secret_id,\'escape\') AS secret_id, msgs.subject, msgs.from_addr, msgs.spam_level,' . ' msgrcpt.rs, encode(recip.email,\'escape\') AS email, msgs.host, msgs.content, msgs.quar_type, msgs.quar_loc'

gnanet commented 4 years ago

@xndr thanks,

I created a branch of the code, that should detect if one of the binary-suspicious fields returns a hex content. It should write it into the mailzu log . If you could set debug on, for some tests, next time you can access the mailzu instances:

$conf['app']['debug'] = 1;

Then the relevant part of mailzu.log on these tests would help me, to decide on the final solution.

You can grab the branch as ZIP here: https://github.com/gnanet/mailzu/archive/mailzu-binary-fields.zip

gnanet commented 4 years ago

@xndr The only problem with your fix is, that it is hard-coding a Postgres-only function, that will cause problems for users, wo run MySQL / MariaDB.

This is why i tried to create a DB-Server independent solution, but without having access to a similar setup where you found the problem, i am unable to test, nor confirm the solution.

For reference, to fix the quarantine release issue between MailZu and Amavis 2.11 w/ a Postgres 10 DB, I modified lib/DBEngine.class.php, under the function get_message($emailaddress, $mail_id) the $query statement at line 534/535:

$query = 'SELECT msgs.time_num, encode(msgs.secret_id,'escape') AS secret_id, msgs.subject, msgs.from_addr, msgs.spam_level,' . ' msgrcpt.rs, encode(recip.email,'escape') AS email, msgs.host, msgs.content, msgs.quar_type, msgs.quar_loc'

xndr commented 4 years ago

I agree. I only made a hardcoded edit because I needed something to work very quickly to replace a Barracuda that died. I will try to clean up my edits to be DB agnostic (test if the DB is Postgres or not, and go with the modified queries only if the backend is Postgres).

I should have time to do that over the holidays.

gnanet commented 4 years ago

Do you see a chance to try my db agnostic solution?

gnanet commented 3 years ago

Got no new feedback in a year, closing for now.