gnanet / mailzu

MailZu-ng compatible PHP 7.2+, PHP-PDO | Based on zedzedtop/mailzu and SF.net/projects/mailzu/ | MailZu is a simple and intuitive web interface to manage Amavisd-new quarantine. Users can view their own quarantine, release/delete messages or request the release of messages. This fork of MailZu is written in PHP and requires Amavisd-new version greater than 2.7.0
GNU General Public License v2.0
14 stars 7 forks source link

Rewrite database to use PHP PDO #11

Closed SoniXtrA closed 3 years ago

SoniXtrA commented 3 years ago

Hi, Could it be possible to make Mailzu PHP PDO compliant ? With PHP7.3 (Debian 10) and an up to date MySQL release, some functions are deprecated (PEAR, use of mysqli, etc ... ) :( it seems there is no alternative to manage Amavis Quarantine .... So could it be possible to make the update ? If you need testing, let me know Regards

gnanet commented 3 years ago

Actually i agree, there is almost no solution, (maybe some deeply intgrated software-packages) in this field, or like amacube for roundcube, abandoned.

I am not a programmer, most stuff i added were fixes, some minor changes, so a full rewrite to be PHP7.3+ or now better 8.0 compatible, would be out of my possibilities.

But give me some time, i need to have a closer look what amount needs to be changed to be compatible

gnanet commented 3 years ago

Done a bit research, some coderefactoring could be done using rector-php, but replacing the pear/db part with pdo cannot be done in the same/similar automatized way.

But actually most of the calls are almost similar:

function db_connect() is mainly this with PEAR/DB:

$dsn = $this->dbType . '://' . $this->dbUser . ':' . $this->dbPass. '@' . $this->dbHost . '/' . $this->dbName;
$db = DB::connect($dsn, true);
if (DB::isError($db)) {
    echo $db->getMessage();
}
$db->connection->set_charset('utf8');
$db->setFetchMode(DB_FETCHMODE_ASSOC);
return $this->db = $db;

which would be written with PDO:

try {
    $dsn = $this->dbType . ':host=' . $this->dbHost . ';dbname=' . $this->dbName;
    $db = new PDO($dsn, $this->dbUser, $this->dbPass);
} catch (PDOException $e) {
    echo 'Connection failed: ' . $e->getMessage();
}
$db->exec("set names utf8");
$db->setAttribute(PDO::ATTR_DEFAULT_FETCH_MODE,PDO::FETCH_ASSOC);
return $this->db = $db;

Above structure shows, it would be possible to rewrite the database layer, the most important calls would be:

$this->db->prepare($query);
$this->db->execute($q);
$this->db->execute($q, $values);

lets see if i can manage this

gnanet commented 3 years ago

Continued to examine the migration of PEAR/DB -> PDO, and collected some samples.

NOTE: these are only sample codes, not tested, and may need to be implemented in different parts of the mailzu-code, not as a single function (like the check_for_error() which needs its own rewrite )

Current read-type DB function is roughly the following:

function get_some_database_data()
{
    global $conf;
    $rval = array();
    $query = "-- SOME SELECT QUERY";
    // Prepare query
    $q = $this->db->prepare($query);
    // Execute query
    $result = $this->db->execute($q);
    // Check if error
    $this->check_for_error($result, $query);

    // In case the resulting count of rows matters
    $this->numRows = $result->numRows();

    // Process resulting rows
    while ($rs = $result->fetchRow())
    {
        // $rval_array gets constructed by processing per row resultset $rs 
    }

    // Free up the result
    $result->free();

    // return data
    return $rval;
}

IMPORTANT: NEVER USE fetchAll() for a rowcount // In case the resulting count of rows matters $this->numRows = (int) count($q->fetchAll());

The PDO variant would be then:

function get_some_database_data_pdo()
{
    global $conf;
    $rval = array();
    $query = "-- SOME SELECT QUERY";
    // Prepare query
    $q = $this->db->prepare($query);
    // Execute query
    $result = $q->execute();
    // Check if error
    if ( $result === false ) {
        echo "Error: ". implode(' ', $q->errorInfo() );
        return false;
    }

   // REMOVED rowcount example, its a memory-hog

    // Process resulting rows
    while ( $rs = $q->fetch(PDO::FETCH_ASSOC) )
    {
        // $rval_array gets constructed by processing per row resultset $rs 
    }

    // return data
    return $rval;
}

An example of write-type DB function is:

function update_msgrcpt_rs($mail_id, $mail_rcpt, $flag)
{
    // If its a pending message, do not set the rs flag to 'v'
    $cur_msg_array = $this->get_message($mail_rcpt, $mail_id);
    $msg_status = $cur_msg_array[0];
    if ($msg_status['rs'] == 'p' && $flag == 'v') return true;

    $query = 'UPDATE msgrcpt SET rs=?'
        . ' WHERE mail_id=?'
        . ' AND rid=(SELECT id FROM maddr WHERE email=?)';

    $values = array($flag, $mail_id, $mail_rcpt);

    // Prepare query
    $q = $this->db->prepare($query);
    // Execute query
    $result = $this->db->execute($q, $values);
    // Check if error
    $this->check_for_error($result, $query);

    return true;
}

This function would be then rewritten as PDO like this:

function update_msgrcpt_rs_pdo($mail_id, $mail_rcpt, $flag)
{
    // If its a pending message, do not set the rs flag to 'v'
    $cur_msg_array = $this->get_message($mail_rcpt, $mail_id);
    $msg_status = $cur_msg_array[0];
    if ($msg_status['rs'] == 'p' && $flag == 'v') return true;

    $query = 'UPDATE msgrcpt SET rs=?'
        . ' WHERE mail_id=?'
        . ' AND rid=(SELECT id FROM maddr WHERE email=?)';

    $values = array($flag, $mail_id, $mail_rcpt);

    // Prepare query
    $q = $this->db->prepare($query);
    // Execute query
    $result = $q->execute($values);
    // Check if error 
    if ( $result === false ) {
        echo "Error: ". implode(' ', $q->errorInfo() );
        return false;
    }

    return true;
}
Corfiot commented 3 years ago

I'm keeping an eye on this, too. I can't currently do it myself due to being swamped with other things but I'll contribute fixes if I see anything on the way. It seems like a more or less straightforward task.

E.g. in the write segment I'd recommend leaving error checking as it is and replacing the $this->check_for_error call with return $this->check_for_error($result,$query); and setting

function check_for_error($res,$q) {
  if ( $res === false) {        
    echo "Error: ". implode(' ', $q->errorInfo() );
  }
  return $res; //just in case a count or something is returned
}

Makes more sense to centralize error management, should be a logger there instead of "echo" in your face.

Corfiot commented 3 years ago

Looking at get_some_database_data_pdo() it is so memory hungry, loading everything twice, doesn't amavis have some huge queries? Does this currently actually work?

gnanet commented 3 years ago

Looking at get_some_database_data_pdo() it is so memory hungry, loading everything twice, doesn't amavis have some huge queries? Does this currently actually work?

Oh, NO! Do Not copy-paste these codes please.

The PDO variants were written only to demonstrate the similarities and possible code-migrations, but i never intended to use them unconditionally or without any prior performance-checks.

What you experienced was warned about in the php-docs, using fetchAll() is a memory-hog, so beter skip that.

gnanet commented 3 years ago

@Corfiot i edited my comment to warn others about the codes being not 1:1 usable, or even tested.

Did not expect that someone will see this thread so quick, just planned to use these notes to implement a database-class myself in the coming weeks.

gnanet commented 3 years ago

The rewrite of the database related classes raise the issue that lot of classes have each other "cross include_once()-ed", so a reimplementation how all the classes are loaded, and used is also on plan

SoniXtrA commented 3 years ago

Done a bit research, some coderefactoring could be done using rector-php, but replacing the pear/db part with pdo cannot be done in the same/similar automatized way.

But actually most of the calls are almost similar:

function db_connect() is mainly this with PEAR/DB:

$dsn = $this->dbType . '://' . $this->dbUser . ':' . $this->dbPass. '@' . $this->dbHost . '/' . $this->dbName;
$db = DB::connect($dsn, true);
if (DB::isError($db)) {
    echo $db->getMessage();
}
$db->connection->set_charset('utf8');
$db->setFetchMode(DB_FETCHMODE_ASSOC);
return $this->db = $db;

which would be written with PDO:

try {
    $dsn = $this->dbType . ':host=' . $this->dbHost . ';dbname=' . $this->dbName;
    $db = new PDO($dsn, $this->dbUser, $this->dbPass);
} catch (PDOException $e) {
    echo 'Connection failed: ' . $e->getMessage();
}
$db->exec("set names utf8");
$db->setAttribute(PDO::ATTR_DEFAULT_FETCH_MODE,PDO::FETCH_ASSOC);
return $this->db = $db;

Above structure shows, it would be possible to rewrite the database layer, the most important calls would be:

$this->db->prepare($query);
$this->db->execute($q);
$this->db->execute($q, $values);

lets see if i can manage this

I started yesterday to rewrite database access in the exactly same way as written with PEAR. => DBAuth seems to be fully functional (except logs in files through "check_for_error" function) => DBEngine is ok for parsing quarantined mails or to read a mail to check headers ... But, ATM, I can't delete or release a mail. Thus, the "check_for_error function" is not fully functional to correctly check what's going wrong in file log ... And i'm not a developer :), I just want to help if I can ...

Regards, Sxa

gnanet commented 3 years ago

@SoniXtrA check_for_error is not more than a check on the boolean for example $result = $q->execute(); then you have $result true/false, and in case the $result is FALSE, the errorInfo() is printed/logged etc.

"Delete" is mainly an SQL UPDATE, to set the RS status "Release" needs connection to Amavis over TCP, and an SQL UPDATE to set the RS status

gnanet commented 3 years ago

@Corfiot - my fault, i did not read the docs correctly, PDOStatement class has one property $queryString so i did not need to pass more than 2 parameters. Fixed function below

I'm keeping an eye on this, too. I can't currently do it myself due to being swamped with other things but I'll contribute fixes if I see anything on the way. It seems like a more or less straightforward task.

E.g. in the write segment I'd recommend leaving error checking as it is and replacing the $this->check_for_error call with return $this->check_for_error($result,$query); and setting

function check_for_error($res,$q) {
  if ( $res === false) {        
    echo "Error: ". implode(' ', $q->errorInfo() );
  }
  return $res; //just in case a count or something is returned
}

Makes more sense to centralize error management, should be a logger there instead of "echo" in your face.

I started to write the DBEngine.class.php, and i realized, the check_for_error() needs indeed a better rewrite, so here is my approach:

    /**
     * Checks to see if there was a database error, log in file and die if there was
     * @param bool $result result boolean
     * @param object $q statement object
     */
    function check_for_error($result, $q)
    {
        global $conf;
        if ( $result === false ) {
            $err_array = $q->errorInfo();
            $this->err_msg = 'Error['.$err_array[1].']: '.$err_array[2].' SQLSTATE='.$err_array[0];
            CmnFns::write_log($this->err_msg, $_SESSION['sessionID']);
            CmnFns::write_log('There was an error executing your query' . ' '
                . $q->queryString, $_SESSION['sessionID']);
            CmnFns::do_error_box(translate('There was an error executing your query') . '<br />'
                . $this->err_msg
                . '<br />' . '<a href="javascript: history.back();">' . translate('Back') . '</a>');
        } else {
            if ($conf['app']['debug']) {
                CmnFns::write_log("[DEBUG SQL QUERY]: ".$q->queryString);
            }

        }
        return false;
    }
gnanet commented 3 years ago

Did Started my rewrite for PDO, and done a bit rector upgrade to PHP 7.4 in fe8366c754d75c028ccc4c07d5664439c96de8ec

Please be careful this is untested code! have a look at the branch mailzu-ng-rector Please be careful this is untested code!

But i could test only on a system with mailzu on it where i have only PHP7.2, so i have some backward changes, but also some coding fixes in DBEngine soon.

gnanet commented 3 years ago

There are lot of fixes, and i was able to get behind the issue of not Deleting/Releasing reported by @SoniXtrA some PEAR/DB stuff lett in lib/Quarantine.lib.php, and error reporting was switched off in messagesProcessing.php thats why there was no error-log. im on it tofix those bugs

SoniXtrA commented 3 years ago

There are lot of fixes, and i was able to get behind the issue of not Deleting/Releasing reported by @SoniXtrA some PEAR/DB stuff lett in lib/Quarantine.lib.php, and error reporting was switched off in messagesProcessing.php thats why there was no error-log. im on it tofix those bugs

Wow :) :) :) lots of thanks ! => I will test rector branch tomorrow on my debian. Tell me if you prefer I wait for a more stable release or what you want I could test ... really thanks for everything !

gnanet commented 3 years ago

@SoniXtrA current state of branch mailzu-ng-rector contains hopefully all important changes for switching to PDO. The rector-php changes to switch PHP7.4 contain non-backwards-compatible class variable typings, and utilizes the is_countable() function, for my tests i had to revert the var ... declarations, and the is_countable changes.

Lucky me: The test was done on a PHP7.2 that had no pear initially installed, so i can also tell whats still required from PEAR to function properly. Also keep in mind to have configured safeMode to 0 that avoids the use of the bundled pear stuff.

$conf['app']['safeMode'] = 0; 

To install required PEAR modules as root:

pear install Mail_mimeDecode
pear install Net_Socket

As a result you will get these pear modules: Mail_Mime ( includes Mail_mimePart ) Mail_mimeDecode Net_Socket

SoniXtrA commented 3 years ago

Hi,

So I tried to use "mailzu-ng-rector"... => new config file modified with my DB infos and pear modules already installed. Accessing mailzu web page give me many errors concerning some PHP variables in "Pager.class.php", declared as followed ; ` // Application set variables var ?int $cur_page = null; var $query_string; var $tot_pages; var ?string $page_var = null; var ?string $limit_var = null;

// Application variables with user modify option
var ?int $limit = null;
var ?int $tot_records = null;
var bool $print_limit_select = true;

// User modifiable variables
var string $prev_link = '&laquo;';
var string $next_link = '&raquo;';
var $limits = array(10, 25, 50, 100);
var int $view_pages = 3;
var string $table_width = '100%';
var string $table_align = 'center';
var ?string $link_class = null;
var ?string $tb_class = null;
var ?string $tb_style = null;
var ?string $text_class = null;
var ?string $text_style = null;

` So removed every "?" and reloaded the page ... and it's seems to be ok

I tried to access a quarantined mail and tried to delete it. I got the following error : There was an error executing your query: Error[1242]: Subquery returns more than 1 row SQLSTATE=21000

I tried to release a message and I got the same error ...

For deleting a message, in mailzu.log, I got the following trace : [DEBUG SQL QUERY]: SELECT msgs.time_num, msgs.secret_id, msgs.subject, msgs.from_addr, msgs.spam_level, msgrcpt.rs, recip.email, msgs.host, msgs.content, msgs.quar_type, msgs.quar_loc FROM msgs INNER JOIN msgrcpt ON msgs.mail_id=msgrcpt.mail_id LEFT JOIN maddr AS sender ON msgs.sid=sender.id LEFT JOIN maddr AS recip ON msgrcpt.rid=recip.id WHERE recip.email=? AND msgs.mail_id=? [DEBUG SQL QUERY]: SELECT msgs.time_num, msgs.secret_id, msgs.subject, msgs.from_addr, msgs.spam_level, msgrcpt.rs, recip.email, msgs.host, msgs.content, msgs.quar_type, msgs.quar_loc FROM msgs INNER JOIN msgrcpt ON msgs.mail_id=msgrcpt.mail_id LEFT JOIN maddr AS sender ON msgs.sid=sender.id LEFT JOIN maddr AS recip ON msgrcpt.rid=recip.id WHERE recip.email=? AND msgs.mail_id=? Error[1242]: Subquery returns more than 1 row SQLSTATE=21000 There was an error executing your query UPDATE msgrcpt SET rs=? WHERE mail_id=? AND rid=(SELECT id FROM maddr WHERE email=?)

Hope this will help you

EDIT : image

All the best SxA

gnanet commented 3 years ago

@SoniXtrA these errors i got with PHP7.2 so i will push a branch tonight, which i used to test.

Reading your eadit, i just said: " i knew i it, oh yes" :D

Corfiot commented 3 years ago

Guys, https://www.php.net/supported-versions.php We should be doing 7.4+. If not 8.0

gnanet commented 3 years ago

If you check the recent changelog, you will see, the goal is at least 7.4, but as it contains some small not backward compatible changes, and i can tell you from experience, that there are still many mailservers which are not that up-to-date, and an upgrade is not easily possible, for those systems the most recent PHP is 7.2 or 7.3.

This project is not a success because of being bleedingedge, but because it still works. I have no problems having a pre7.4 and a 7.4+ 8+; in parallel.

But i would ask you to test this branch and provide feesback, patxhes that were done on 7.4 +

gnanet commented 3 years ago

@SoniXtrA i have to apologize. I am organizing the issues, and have read the starter description of your issue again, and you clearly stated your php version was 7.3. I was wrong to suggest you the 7.4 branch. I will upload my 7.3 compatible branch as soon as i get home.

gnanet commented 3 years ago

@SoniXtrA did some tests to start over with rector to make changes to php 7.3 instead and compared to the mostly working code that rector proposed changeset would possibly cause problems. So i really will need to get home and do my refactoring by hand

SoniXtrA commented 3 years ago

@SoniXtrA i have to apologize. I am organizing the issues, and have read the starter description of your issue again, and you clearly stated your php version was 7.3. I was wrong to suggest you the 7.4 branch. I will upload my 7.3 compatible branch as soon as i get home.

No worry, really 😊👍

gnanet commented 3 years ago

@SoniXtrA please grab the branch mailzu-ng-php72 and try to test it

SoniXtrA commented 3 years ago

@SoniXtrA please grab the branch mailzu-ng-php72 and try to test it

Sorry I made a mistake, I closed the issue but it wasn't what I want to do :( So I tested the new branch...

with the following logs : [DEBUG SQL QUERY]: [DEBUG SQL QUERY]: SELECT msgs.time_num, msgs.secret_id, msgs.subject, msgs.from_addr, msgs.spam_level, msgrcpt.rs, recip.email, msgs.host, msgs.content, msgs.quar_type, msgs.quar_loc FROM msgs INNER JOIN msgrcpt ON msgs.mail_id=msgrcpt.mail_id LEFT JOIN maddr AS sender ON msgs.sid=sender.id LEFT JOIN maddr AS recip ON msgrcpt.rid=recip.id WHERE recip.email=? AND msgs.mail_id=? [DEBUG SQL QUERY]: SELECT msgs.time_num, msgs.secret_id, msgs.subject, msgs.from_addr, msgs.spam_level, msgrcpt.rs, recip.email, msgs.host, msgs.content, msgs.quar_type, msgs.quar_loc FROM msgs INNER JOIN msgrcpt ON msgs.mail_id=msgrcpt.mail_id LEFT JOIN maddr AS sender ON msgs.sid=sender.id LEFT JOIN maddr AS recip ON msgrcpt.rid=recip.id WHERE recip.email=? AND msgs.mail_id=? Error[1242]: Subquery returns more than 1 row SQLSTATE=21000 There was an error executing your query UPDATE msgrcpt SET rs=? WHERE mail_id=? AND rid=(SELECT id FROM maddr WHERE email=?)

If I want to release it, I get the same error : image

with the following logs : `[DEBUG SQL QUERY]: SELECT msgs.time_num, msgs.from_addr, msgs.mail_id, msgs.subject, msgs.spam_level, msgs.content, msgrcpt.rs, msgs.quar_type, recip.email FROM msgs INNER JOIN msgrcpt ON msgs.mail_id = msgrcpt.mail_id LEFT JOIN maddr AS sender ON msgs.sid = sender.id LEFT JOIN maddr AS recip ON msgrcpt.rid = recip.id WHERE msgs.content in ('S', 'B', 'V', 'H') AND msgrcpt.rs in ('', 'v')

                    AND msgs.quar_type <> ''
                    ORDER BY msgs.time_num DESC

[DEBUG SQL QUERY]: SELECT CONVERT(mail_text USING utf8) AS mail_text FROM quarantine WHERE mail_id=? [DEBUG SQL QUERY]: [DEBUG SQL QUERY]: SELECT msgs.time_num, msgs.secret_id, msgs.subject, msgs.from_addr, msgs.spam_level, msgrcpt.rs, recip.email, msgs.host, msgs.content, msgs.quar_type, msgs.quar_loc FROM msgs INNER JOIN msgrcpt ON msgs.mail_id=msgrcpt.mail_id LEFT JOIN maddr AS sender ON msgs.sid=sender.id LEFT JOIN maddr AS recip ON msgrcpt.rid=recip.id WHERE recip.email=? AND msgs.mail_id=? [DEBUG SQL QUERY]: SELECT msgs.time_num, msgs.secret_id, msgs.subject, msgs.from_addr, msgs.spam_level, msgrcpt.rs, recip.email, msgs.host, msgs.content, msgs.quar_type, msgs.quar_loc FROM msgs INNER JOIN msgrcpt ON msgs.mail_id=msgrcpt.mail_id LEFT JOIN maddr AS sender ON msgs.sid=sender.id LEFT JOIN maddr AS recip ON msgrcpt.rid=recip.id WHERE recip.email=? AND msgs.mail_id=? Error[1242]: Subquery returns more than 1 row SQLSTATE=21000 There was an error executing your query UPDATE msgrcpt SET rs=? WHERE mail_id=? AND rid=(SELECT id FROM maddr WHERE email=?) `

Edit : When I tried to release a quarantined email, I got the "SQLSTATE=21000" error... but, and I haven't seen it, the mail has been released anyway. => It's not true for deleting a quarantined email where the error occurs and the email is not deleted.

gnanet commented 3 years ago

I will check the sql query closer, but my first ide would be to look into the maddr table, and check if you may have more that one entry for the email address, which is an unexpected state of the maddr table. I will try to find the detailed description why there should be only 1 row per mailaddress

gnanet commented 3 years ago

Yes i remember that there should only be one result for one e-mail adress:

https://www.ijs.si/software/amavisd/README.sql-mysql.txt

provide unique id for each e-mail address, avoids storing copies

CREATE TABLE maddr (
  partition_tag integer      DEFAULT 0, -- see $partition_tag
  id         bigint unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY,
  email      varbinary(255)  NOT NULL,  -- full mail address
  domain     varchar(255)    NOT NULL,  -- only domain part of the email address
                                        -- with subdomain fields in reverse
  CONSTRAINT part_email UNIQUE (partition_tag,email)
) ENGINE=InnoDB;
SoniXtrA commented 3 years ago

Yes i remember that there should only be one result for one e-mail adress:

https://www.ijs.si/software/amavisd/README.sql-mysql.txt

provide unique id for each e-mail address, avoids storing copies

CREATE TABLE maddr (
  partition_tag integer      DEFAULT 0, -- see $partition_tag
  id         bigint unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY,
  email      varbinary(255)  NOT NULL,  -- full mail address
  domain     varchar(255)    NOT NULL,  -- only domain part of the email address
                                        -- with subdomain fields in reverse
  CONSTRAINT part_email UNIQUE (partition_tag,email)
) ENGINE=InnoDB;

I think you put the finger on the problem ... I have duplicates email in "email" field of my "maddr" table Here is the result of the SHOW CREATE TABLE maddr; for my "maddr" table ... CREATE TABLEmaddr( partition_tagint(11) DEFAULT 0, idbigint(20) unsigned NOT NULL AUTO_INCREMENT, emailvarbinary(255) NOT NULL, domainvarchar(255) NOT NULL, PRIMARY KEY (id), UNIQUE KEYpart_email(partition_tag,email) ) ENGINE=InnoDB AUTO_INCREMENT=221911 DEFAULT CHARSET=latin1

gnanet commented 3 years ago

Interesting is, that your mysql does not yell at you because you have UNIQUE KEY part_email (partition_tag,email) which means the combination of the two fields have to be unique in your table: 0admin@YOURDOMAIN.TLD

to be honest i cant imagine how that was possible to insert, but remove the duplicates one-by-one with

DELETE FROM maddr WHERE id=<duplicatedid>

also if you had inconsistencies there, you might have also such hiccups on other tables too.

I have some maintenance SQL queries, which clean out garbage from the amavis tables, but that was also not constructed for clean up inconsistent data.

gnanet commented 3 years ago

Before you try any of the below queries, MAKE BACKUP OF THE AMAVIS DB

Cleaning up deleted and not referenced items:

DELETE msgs FROM msgs LEFT JOIN msgrcpt ON msgrcpt.mail_id = msgs.mail_id WHERE msgrcpt.rs = 'D';
DELETE FROM quarantine WHERE mail_id IN (SELECT mail_id FROM msgrcpt WHERE rs = 'D');
DELETE FROM msgrcpt WHERE rs = 'D';
DELETE FROM maddr WHERE id NOT IN (SELECT sid FROM msgs) AND id NOT IN (SELECT rid FROM msgrcpt);
SoniXtrA commented 3 years ago

Ok Backup done ! I deleted every duplicates in "maddr" table I run your 4 SQL Delete => no error After that, I removed an index named "part_email" to be able to run the following SQL request to add constraint ALTER TABLE maddr ADD CONSTRAINT part_email UNIQUE (partition_tag,email); => no error.

The "SHOW CREATE TABLE maddr" give me : CREATE TABLEmaddr( partition_tagint(11) DEFAULT 0, idbigint(20) unsigned NOT NULL AUTO_INCREMENT, emailvarbinary(255) NOT NULL, domainvarchar(255) NOT NULL, PRIMARY KEY (id), UNIQUE KEYpart_email(partition_tag,email) ) ENGINE=InnoDB AUTO_INCREMENT=221920 DEFAULT CHARSET=latin1

So now, I try use "mailzu-mailzu-ng-php72" or "mailzu-mailzu-ng-rector" and the good news is there is no more SQL error when trying to delete or release a quarantined email. But... when I delete a quarantined email, there is no error but there is no action :) :) :) because the quarantined email is still in the list ... I tried to delete all the quarantined email and I still got the SQL Error "Error[1242]: Subquery returns more than 1 row SQLSTATE=21000" So I tried to find duplicates in "maddr" to check ... and.... there are new duplicates :( So instead of the constraint, duplicates continue to be written in the table :(

I don't understand ... :/

SoniXtrA commented 3 years ago

I stopped amavis, I exported datas form maddr to sql file. I created a new table maddr_2 from "https://www.ijs.si/software/amavisd/README.sql-mysql.txt" I imported datas from file to maddr_2 I renamed maddr to maddr_old and maddr_2 to maddr I started amavis ... And new duplicates can still be inserted in table :( Really I don't know what's happen ..

gnanet commented 3 years ago

Do you see any repetition behind the duplicated email-s in the madr table? I wonder if there is some minor misconfiguration in the amavis sql storage setup, and maybe the duplicates are written by amavis itself. Check if you have configured some $sql_partition_tag = in your amavis configs

also you can check for yourself if you did some unwanted partitioning somewhere

SELECT id,partition_tag,email FROM maddr ORDER BY email ASC
gnanet commented 3 years ago

Reading on, i found this: https://github.com/srault95/amavisd-new/blob/master/RELEASE_NOTES#L3538

Where the code suggest to hardcode the partition_tag into the query:

the old query

(SELECT id FROM maddr WHERE email=?)

would become

(SELECT id FROM maddr WHERE partition_tag=0 AND email=?)

everywhere.

gnanet commented 3 years ago

Try to use latest mailzu-ng-php72 branch

gnanet commented 3 years ago

also ensure you have set this to zero in amavis config

$sql_store_info_for_all_msgs = 0;
SoniXtrA commented 3 years ago

Do you see any repetition behind the duplicated email-s in the madr table? I wonder if there is some minor misconfiguration in the amavis sql storage setup, and maybe the duplicates are written by amavis itself. Check if you have configured some $sql_partition_tag = in your amavis configs

also you can check for yourself if you did some unwanted partitioning somewhere

SELECT id,partition_tag,email FROM maddr ORDER BY email ASC

In my "amavisd.conf" I have : $sql_partitiontag = sub { my($msginfo)=@; iso8601_week($msginfo->rx_time) };

in "maddr" : partition_tag goes from "2" to "44" id goes from "207005" to "221941" and email field contains email addresses from 12 characters to 93 ... but they seemed to be normally

When I check for duplicates : SELECT COUNT(*) AS nbr_doublon, partition_tag, id, email, domain FROM maddr GROUP BY email HAVING COUNT(*) > 1

=> everything seems to be correctly formated image

I use "mailzu-ng-php72" (previously I tried rector version just to check if the behavior was the same)

And "$sql_store_info_for_all_msgs = 0;" was no defined in my conf ... now it's done and amavisd restarted.

SoniXtrA commented 3 years ago

FYI : column "To" (Mailzu-ng-php72) only contains "(none)"... image

gnanet commented 3 years ago

Your problem is caused by this:

$sql_partition_tag = sub { my($msginfo)=@_; iso8601_week($msginfo->rx_time) };

Read: https://github.com/srault95/amavisd-new/blob/master/RELEASE_NOTES#L3498

Amavisd process itself does not use the 'partition_tag' field for its own purposes, all records regardless of their 'partition_tag' value are available for example to pen pals lookups, as before.

gnanet commented 3 years ago

As you do not have the same email address once, but multiple times, to determine destination, we would need to retrieve the partition_tag for the messages tobe listed, and from there, we would have to provide the same partition_tag for release and delete of messages too.

SoniXtrA commented 3 years ago

Your problem is caused by this:

$sql_partition_tag = sub { my($msginfo)=@_; iso8601_week($msginfo->rx_time) };

Read: https://github.com/srault95/amavisd-new/blob/master/RELEASE_NOTES#L3498

Amavisd process itself does not use the 'partition_tag' field for its own purposes, all records regardless of their 'partition_tag' value are available for example to pen pals lookups, as before.

So, as mentioned, I set the "sql_partition_tag" to 0 and I have to run the following : ALTER TABLE maddr ADD partition_tag integer DEFAULT 0; ALTER TABLE msgs ADD partition_tag integer DEFAULT 0; ALTER TABLE msgrcpt ADD partition_tag integer DEFAULT 0; ALTER TABLE quarantine ADD partition_tag integer DEFAULT 0;

and

=> MySQL: ALTER TABLE maddr DROP KEY email, ADD UNIQUE KEY part_email (partition_tag,email);

?

gnanet commented 3 years ago

OK, give me some time, i will have to rewrite multimple parts of the code, because if i am right, msgs contains the partition_tag, and as such, i can retrieve it for use in the other queries, and in the release request.

SoniXtrA commented 3 years ago

OK, give me some time, i will have to rewrite multimple parts of the code, because if i am right, msgs contains the partition_tag, and as such, i can retrieve it for use in the other queries, and in the release request.

yes, of course... But do I have to replace every values of "partition_tag" by a "0" in every tables maddr, msgs, msgrcpt and quarantine ? And delete duplicates ...

gnanet commented 3 years ago

Your problem is caused by this:

$sql_partition_tag = sub { my($msginfo)=@_; iso8601_week($msginfo->rx_time) };

Read: https://github.com/srault95/amavisd-new/blob/master/RELEASE_NOTES#L3498

Amavisd process itself does not use the 'partition_tag' field for its own purposes, all records regardless of their 'partition_tag' value are available for example to pen pals lookups, as before.

NO: because you already have the column for partition_Tag, is its not defined to be default 0 maybe you just have to change that..

So, as mentioned, I set the "sql_partition_tag" to 0 and I have to run the following : ALTER TABLE maddr ADD partition_tag integer DEFAULT 0; ALTER TABLE msgs ADD partition_tag integer DEFAULT 0; ALTER TABLE msgrcpt ADD partition_tag integer DEFAULT 0; ALTER TABLE quarantine ADD partition_tag integer DEFAULT 0;

and

NO: This part is already in effect for you exaclty because you have different partitioning for the same email.

=> MySQL: ALTER TABLE maddr DROP KEY email, ADD UNIQUE KEY part_email (partition_tag,email);

?

As i said before, i have looked on the code, and if my assumption is right, you have the same partition_tag in msgs and maddr for a particular message.

You could help me if you check is msgs table defines the same partition_tag as it is stored for the connected email

gnanet commented 3 years ago

OK, give me some time, i will have to rewrite multimple parts of the code, because if i am right, msgs contains the partition_tag, and as such, i can retrieve it for use in the other queries, and in the release request.

yes, of course... But do I have to replace every values of "partition_tag" by a "0" in every tables maddr, msgs, msgrcpt and quarantine ? And delete duplicates ...

NO :D Just help me to ensure the msgs part_id refers to the same as in maddr for a message and its connected email-addresses

gnanet commented 3 years ago

The bits and pieces of implementing partition_tag aware message / address / release handling

not the complete list, but enough to get a picture about the task

https://www.ijs.si/software/amavisd/README.protocol.txt

request=release (or request=requeue, available since 2.5.1) mail_id=xxxxxxxxxxxx secret_id=xxxxxxxxxxxx (authorizes a release) partition_tag=xx (optional, but recommended if info is available)

https://github.com/gnanet/mailzu/blob/mailzu-ng-php72/lib/DBEngine.class.php#L420

    $query = "SELECT
      msgs.time_num,
      msgs.from_addr,
      msgs.mail_id,
      msgs.partition_tag,

https://github.com/gnanet/mailzu/blob/mailzu-ng-php72/lib/DBEngine.class.php#L510

$query = 'SELECT msgs.time_num, msgs.secret_id, msgs.partition_tag,

https://github.com/gnanet/mailzu/blob/mailzu-ng-php72/lib/DBEngine.class.php#L557

$partition_tag = isset($msg_status['partition_tag']) ? $msg_status['partition_tag'] : 0;

$query = 'UPDATE msgrcpt SET rs=?' . ' WHERE mail_id=?' . ' AND rid=(SELECT id FROM maddr WHERE partition_tag=? AND email=?)';

$values = array($flag, $mail_id, $partition_tag, $mail_rcpt);

gnanet commented 3 years ago

@SoniXtrA its time for a new test with the mailzu-ng-php72 branch. last commit was: 1222268

SoniXtrA commented 3 years ago

Your problem is caused by this:

$sql_partition_tag = sub { my($msginfo)=@_; iso8601_week($msginfo->rx_time) };

Read: https://github.com/srault95/amavisd-new/blob/master/RELEASE_NOTES#L3498

Amavisd process itself does not use the 'partition_tag' field for its own purposes, all records regardless of their 'partition_tag' value are available for example to pen pals lookups, as before.

NO: because you already have the column for partition_Tag, is its not defined to be default 0 maybe you just have to change that..

So, as mentioned, I set the "sql_partition_tag" to 0 and I have to run the following : ALTER TABLE maddr ADD partition_tag integer DEFAULT 0; ALTER TABLE msgs ADD partition_tag integer DEFAULT 0; ALTER TABLE msgrcpt ADD partition_tag integer DEFAULT 0; ALTER TABLE quarantine ADD partition_tag integer DEFAULT 0; and

NO: This part is already in effect for you exaclty because you have different partitioning for the same email.

=> MySQL: ALTER TABLE maddr DROP KEY email, ADD UNIQUE KEY part_email (partition_tag,email); ?

As i said before, i have looked on the code, and if my assumption is right, you have the same partition_tag in msgs and maddr for a particular message.

You could help me if you check is msgs table defines the same partition_tag as it is stored for the connected email

Hi for this morning here :) I'm not a dude in SQL ... :( but I wrote : SELECT msgs.mail_id, msgs.partition_tag, maddr.partition_tag from msgs, maddr where maddr.id = msgs.sid and msgs.partition_tag != maddr.partition_tag

and the result is an empty result :) (0 rows)

So, on my side, the only thing I did on my Amvisd conf is to replace the $sql_partition_tag = sub { my($msginfo)=@_; iso8601_week($msginfo->rx_time) }; by $sql_partition_tag = 0; and, I've added the following $sql_store_info_for_all_msgs = 0;

now I will try the release of the last "mailzu-ng-php72"

gnanet commented 3 years ago

You can even keep your patition_tag setting if i did the right changes. Just test it

SoniXtrA commented 3 years ago

So... Now when I enter in "Site Quarantine", the column "TO" show "(none)" for old rows and my email for the recent rows. I can delete those with my email in "TO" column and I'm unable to delete those with "(none")...

If I run the SQL I saw in mailzu debug log, SELECT msgs.time_num, msgs.from_addr, msgs.mail_id, msgs.partition_tag, msgs.subject, msgs.spam_level, msgs.content, msgrcpt.rs, msgs.quar_type, recip.email FROM msgs INNER JOIN msgrcpt ON msgs.mail_id = msgrcpt.mail_id LEFT JOIN maddr AS sender ON msgs.sid = sender.id LEFT JOIN maddr AS recip ON msgrcpt.rid = recip.id WHERE msgs.content in ('S', 'B', 'V', 'H') AND msgrcpt.rs in ('', 'v') AND msgs.quar_type <> '' ORDER BY msgs.time_num DESC

I get a result with on the last column, recent rows with my email and old with NULL

image

gnanet commented 3 years ago

Actually striking trough your partition_tag settings was a bad idea