matidau / Z-Push

PHP8 support and other bug fixes for Z-Push in mailinabox
http://z-push.org
GNU Affero General Public License v3.0

20K+ Inbox disappears and needs to be resynced #3

Closed matidau closed 1 year ago

matidau commented 1 year ago

I still find that my Inbox (which has 20k+ emails) disappears and needs to be resynced with the few different Z-Push PHP 8 branches I've tried, this branch included.

Originally posted by @jvolkenant in https://github.com/mail-in-a-box/mailinabox/issues/2236#issuecomment-1438942194

jvolkenant commented 1 year ago

Hey, I appreciate you furthering work on Z-Push to make it work again on Mail-in-a-Box.

Over at https://github.com/jvolkenant/ I've tried a few combinations of patches for PHP 8.0, but none of them seemed to let syncing work reliably. I even did some testing with a Docker container of 22.04 running Z-Push 2.6.4 + PHP 7.4 (focal has PHP 7.4 and LTS support for a few more years, which would buy time until a permanent fix like the one you are working on is found). Unfortunately the same sync issues arise.

From 14.04 through 18.04 I used Nine on Android with Z-Push.

I've done the following:

Removed state files and logs

rm -rf /var/lib/z-push/*
rm -rf /var/log/z-push/*
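
The two rm commands above discard state and logs outright. If you want to be able to compare the state before and after a reset, a sketch along these lines archives each directory first and then empties it. The function name reset_zpush_dirs and the backup location are hypothetical and not part of Z-Push; it assumes a find that supports -mindepth and -delete (GNU and BSD find do).

```shell
#!/bin/sh
# Hypothetical helper: archive each directory's contents, then empty it.
# Usage: reset_zpush_dirs BACKUP_DIR DIR [DIR...]
reset_zpush_dirs() {
    backup_dir=$1
    shift
    stamp=$(date +%Y%m%d-%H%M%S)
    mkdir -p "$backup_dir" || return 1
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "skipping $dir: not a directory" >&2
            continue
        fi
        # Both Z-Push directories end in "z-push", so build the archive
        # name from the full path to avoid overwriting one with the other.
        name=$(printf '%s' "$dir" | tr '/' '_')
        tar -czf "$backup_dir/$name-$stamp.tar.gz" -C "$dir" . || return 1
        # Remove everything below $dir but keep the directory itself.
        find "$dir" -mindepth 1 -delete
    done
}
```

Run as root, this might be called as `reset_zpush_dirs /root/zpush-backup /var/lib/z-push /var/log/z-push`.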

Ran Nine for Android over Wi-Fi, chose Exchange ActiveSync for the account type, and provided username/password. Chose to sync Email, Contacts, Calendar. Things seem to work initially, in that it appears to sync the Inbox emails. Calendar and contacts don't always auto-sync from the start; I usually have to go back into the settings, re-enable sync for calendar and contacts, and refresh, but it's hit and miss whether they sync or stay synced. I think it has to do with the emails failing to stay synced.

However, usually after some time, the list of emails disappears from the Inbox in Nine, and it shows emails syncing again with a progress bar. I'll usually get a message saying that the Inbox is empty or that Nine is waiting on a sync.

Ran fixstates

root@m:/usr/local/lib/z-push# php8.0 /usr/local/lib/z-push/z-push-admin.php -a fixstates

Validating and fixing states (this can take some time):
        21:03:58 Checking username casings: Processed: 1 - Converted: 0 - Removed: 0
        21:03:58 Checking available devicedata & user linking: Processed: 1 - Fixed: 0
        21:03:58 Checking for unreferenced (obsolete) state files: Processed: 16 - Deleted: 1
        21:03:58 Checking for hierarchy folder data state: Devices: 1 - Processed: 1 - Fixed: 0 - Device+User without hierarchy: 0
        21:03:58 Checking flags of shared folders: Devices: 1 - Devices with additional folders: 0 - Fixed: 0

Swiped down to sync the Inbox; the Inbox now shows up. Ran sync for calendar; the calendar shows up. Ran sync for contacts; the contacts didn't sync, and now the Inbox and calendars are empty again.

Ran fixstates and clearloop

root@m:/usr/local/lib/z-push# php8.0 /usr/local/lib/z-push/z-push-admin.php -a fixstates

Validating and fixing states (this can take some time):
        21:09:12 Checking username casings: Processed: 1 - Converted: 0 - Removed: 0
        21:09:13 Checking available devicedata & user linking: Processed: 1 - Fixed: 0
        21:09:13 Checking for unreferenced (obsolete) state files: Processed: 15 - Deleted: 1
        21:09:13 Checking for hierarchy folder data state: Devices: 1 - Processed: 1 - Fixed: 0 - Device+User without hierarchy: 0
        21:09:13 Checking flags of shared folders: Devices: 1 - Devices with additional folders: 0 - Fixed: 0

root@m:/usr/local/lib/z-push# php8.0 /usr/local/lib/z-push/z-push-admin.php -a clearloop

System wide loop detection data removed: OK

Retried sync, same sort of issue as above, Inbox content on Nine disappears, and it tries re-syncing.

Logs of the above can be found here: https://m.scentoflime.com/zpush-logs-2236-02212023-0912pm.tar.gz

Likely I'll need to do the same process but with LOGLEVEL_WBXML https://github.com/matidau/Z-Push/blob/php8.x/src/config.php#L115 for some more useful analysis.

I haven't tried other Activesync clients for a while, and I don't remember the results, but I don't think it made a difference (or I would probably be using the client that did work)

I run MIAB on a cloud VM that is not very fast (1vcpu 2gb ram, SSD, but I only get ~150MB/s). When I did some troubleshooting back when v60 came out, running with LOGLEVEL_WBXML showed what seemed like maybe an IMAP lookup for all mails https://github.com/matidau/Z-Push/blob/php8.x/src/backend/imap/imap.php#L1040-L1047

And that was taking longer than some timeout, maybe https://github.com/matidau/Z-Push/blob/php8.x/src/config.php#L256 but honestly I don't know.

I tried moving out a bunch of mails and only leaving like 1000 in the Inbox, but it still had the same issues as the above.

matidau commented 1 year ago

Thank you!

I did test your repo along with others before settling on cbren's, which was mostly error-free in my tests (it just needed a policy change).

Your branch https://github.com/jvolkenant/Z-Push/tree/php8.x-testing was the second most promising, hope this is the one that you also felt should be looked at. Before seeing your comment I also didn't realise you were a MIAB contributor or otherwise I would have reached out.

On your specs, I don't think this is an issue, I'm running on an AWS t2.micro, 1 vcpu and 1GB ram, this can burst a bit but would likely be the same.

Your info has given me some food for thought. I believe that since 2.6.x there are tolerance levels coded in so that, when things are out of sync, it launches a resync itself, but I have no idea how this would look to an end user or whether it is occurring as intended. I'll start by replicating with the large inbox and using Nine.

Will let you know how I go.

jvolkenant commented 1 year ago

I've been able to keep email and calendar synced overnight, when I tried syncing the contacts I started getting the sync issues I described above.

I'm in the process of setting up another domain and making a copy of my emails, calendar and contacts there so I have a dedicated testing setup.

I'd like to try with a fresh Nextcloud and import an ics/vcard of my data just to rule that out.

matidau commented 1 year ago

That's great to hear.

I've set myself up with a large inbox 8K+ emails, and some contacts in a test account on my production box. I've tried to test with Nine for this but it is producing some odd behaviour.

24/02/2023 12:17:08 [193797] [WARN] [zpushtest@mat.id.au] BackendCalDAV->Logon(): User 'zpushtest@mat.id.au' is not authenticated on CalDAV 'https://127.0.0.1:443/caldav/calendars/zpushtest@mat.id.au/'
24/02/2023 12:17:08 [193797] [FATAL] [zpushtest@mat.id.au] Exception: (AuthenticationRequiredException) - Access denied. Username or password incorrect
24/02/2023 12:17:08 [193316] [WARN] [zpushtest@mat.id.au] BackendCalDAV->Logon(): User 'zpushtest@mat.id.au' is not authenticated on CalDAV 'https://127.0.0.1:443/caldav/calendars/zpushtest@mat.id.au/'
24/02/2023 12:17:08 [193316] [FATAL] [zpushtest@mat.id.au] Exception: (AuthenticationRequiredException) - Access denied. Username or password incorrect
24/02/2023 12:17:08 [193313] [WARN] [zpushtest@mat.id.au] BackendCalDAV->Logon(): User 'zpushtest@mat.id.au' is not authenticated on CalDAV 'https://127.0.0.1:443/caldav/calendars/zpushtest@mat.id.au/'
24/02/2023 12:17:08 [193313] [FATAL] [zpushtest@mat.id.au] Exception: (AuthenticationRequiredException) - Access denied. Username or password incorrect

It looks like ownCloud has banned the attempts to retrieve the calendars (and cards) because of the repeated sync requests made after each failure.

I'm going to let this settle down, remove my test account from Nine, and try it in Gmail/Google Contacts/Google Calendar to see if I get the same problems.
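
To see at a glance how often each backend is failing, log lines like the ones above can be tallied by level and component. This is only a sketch: summarise_zpush_log is a hypothetical helper, and it assumes every line follows the "date time [pid] [LEVEL] [user] message" layout shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper: count WARN/ERROR/FATAL lines per component.
# Reads the named files, or standard input when no file is given.
summarise_zpush_log() {
    awk '
        $4 ~ /^\[(WARN|ERROR|FATAL)\]$/ {
            # Strip the surrounding brackets from the level field.
            level = substr($4, 2, length($4) - 2)
            # Field 6 is the first word of the message, for example
            # "BackendCalDAV->Logon():" or "Exception:"; drop any
            # trailing "()" or ":" decoration.
            component = $6
            sub(/[(:].*$/, "", component)
            count[level " " component]++
        }
        END {
            for (key in count) print count[key], key
        }
    ' "$@" | sort -k2,3
}
```

Pointing it at a log file under /var/log/z-push/ prints one line per level and component pair, prefixed with the number of occurrences.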

Edit: I've found some errors in https://github.com/matidau/Z-Push/blob/php8.x/src/lib/request/sync.php that I think are more PHP8 related ones. I think I pushed my live z-push enough with Nine that I reached the tolerances for out of sync-ness. Will update when I've done more.

matidau commented 1 year ago

I've made an update that should help z-push self recover when it becomes out of sync.

Could you test this and see how it goes for you? Note I've made a separate zpushtesting.sh script for testing against v60+:

curl -s https://mat.id.au/zpushtesting.sh | sudo bash

Also keen to see how you went otherwise.

jvolkenant commented 1 year ago

Thanks, giving it a spin now. Seemed more reliable during setup. Letting it bake for a few days.

cbren commented 1 year ago

We don't use Mail-in-a-Box but do have several large (50k+) inboxes and use Nine on some Android clients. We haven't had any issues with mailboxes disappearing or losing sync.

Not sure if you've solved the BackendCalDAV login issue, but in our case a similar or identical issue came up after PHP 8.2, and the cause was sabre/dav (or one of its dependencies), not Z-Push. Updating sabre/dav to master, along with all of the Composer dependencies, resolved the failed logins. If Mail-in-a-Box uses ownCloud or similar to provide DAV, that software did (and still does) use sabre/dav to provide those services. It may be somewhere to look if you're still having that issue.

matidau commented 1 year ago

Thanks for the info; it confirms what I've tracked this back to: Nextcloud, the ownCloud fork that Mail-in-a-Box uses.

Going to do a php-0.3 tag and then close this one off. I think we have solved the big issue with https://github.com/matidau/Z-Push/pull/4

🙂

jvolkenant commented 1 year ago

I've been testing https://github.com/matidau/Z-Push/releases/tag/2.6.4-php8-testing-0.3 for a few days now on two phones running Nine to the same account.

When I add a new account, mail and contacts typically sync OK but calendar does not. I have to go back to the mail settings, choose to sync just the Personal calendar, and sync. Sometimes setup works fine, with mail, calendar and contacts all working as expected. Other times, when I add the calendar like that, the mail, calendar and contacts disappear and syncing starts all over again.

Taking a cue from cbren's post in https://github.com/matidau/Z-Push/issues/7, I pondered updating Nextcloud and its plugins to see whether newer dependencies would help (I believe Nextcloud uses sabre/dav). I didn't want to upgrade to Nextcloud 24 ahead of upstream yet (although I don't see why we can't, other than needing to get a PR submitted upstream). But I did update the Nextcloud apps to their latest versions, contacts to 4.2.5 and calendar to 3.5.5, hoping that would help. Maybe it helps a little. It couldn't hurt.

It wasn't until I dealt with these log entries that I think things have been working better.

03/03/2023 10:39:36 [168900] [ERROR] [name@example.com] BackendCardDAV->GetMessageList - Error getting the vcards in 'https://127.0.0.1:443/cloud/remote.php/carddav/addressbooks/name@example.com/z-app-generated--contactsinteraction--recent/': Woops, something's gone wrong! The CardDAV server returned the http status code 501.
03/03/2023 10:39:36 [168900] [ERROR] [name@example.com] BackendCardDAV->GetMessageList - Error getting the vcards
03/03/2023 10:39:40 [168556] [ERROR] [name@example.com] BackendCardDAV->ChangesSinkInitialize - Error doing the initial sync for 'https://127.0.0.1:443/cloud/remote.php/carddav/addressbooks/name@example.com/z-app-generated--contactsinteraction--recent/': Woops, something's gone wrong! The CardDAV server returned the http status code 501.

I came across this post suggesting deleting the contactsinteraction app (in my case I just disabled it). It looks to be a Nextcloud-generated address book that is read-only and hidden from the GUI. Since I don't use it and don't sync it, I just went ahead and disabled it.

sudo -u www-data php8.0 occ app:disable contactsinteraction
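
Before disabling an app, it can help to confirm which address book is actually failing. The sketch below pulls the quoted URL and the HTTP status out of BackendCardDAV error lines like the ones above. failing_carddav_urls is a hypothetical helper, and the pattern only assumes the message wording seen in these logs.

```shell
#!/bin/sh
# Hypothetical helper: list "STATUS URL" pairs from BackendCardDAV errors.
# Reads the named files, or standard input when no file is given.
failing_carddav_urls() {
    # Capture the single-quoted http(s) URL and the numeric status code,
    # print them as "status url", and collapse repeated entries.
    sed -n "s/.*BackendCardDAV->.* '\(http[^']*\)'.*http status code \([0-9][0-9]*\)\..*/\2 \1/p" "$@" |
        sort -u
}
```

Lines without a quoted URL, such as the bare "Error getting the vcards" entry, are ignored.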

I think things have been better, at least for my setup. I've only had these messages posted 2 days ago when I tested re-adding accounts and no log messages since then

05/03/2023 06:32:52 [400519] [ERROR] [name@example.com] BackendCardDAV->GetGALSearchResults : Error in search Woops, something's gone wrong! The CardDAV server returned the http status code 405.
05/03/2023 06:32:52 [400519] [ERROR] [name@example.com] BackendCardDAV->GetGALSearchResults : Error in search query. Search aborted

Thank you for your work on https://github.com/matidau/Z-Push/releases/tag/2.6.4-php8-testing-0.3 I think there is still some things to be fixed in Z-Push, but it's the best branch that's working for me so far.

matidau commented 1 year ago

I think I saw this error when nextcloud's brute force prevention kicked in.

The CardDAV server returned the http status code 405.

I think this occurred after the initial login rejections errors for me. It would be interesting what your nextcloud log has in it for the same time.

For the status 501 problem we could perhaps change Z-Push to handle it the same way as 405 (or 403 Forbidden). This is only a thought; it would take me a little while to review what effect this might have in other, non-Nextcloud scenarios.

jvolkenant commented 1 year ago

I wonder if brute-force protection needs to be disabled; it should be handled already by the fail2ban blocking. Alternatively, whitelisting 127.0.0.1 might be a good idea. I see an app to do it, https://github.com/nextcloud/bruteforcesettings, but it would be nice to just add it to a config or a sqlite insert instead.

jvolkenant commented 1 year ago

It might be enough to set 'trusted_proxies' => ['127.0.0.1'] in the Nextcloud config.php. See https://docs.nextcloud.com/server/latest/admin_manual/configuration_server/bruteforce_configuration.html and https://docs.nextcloud.com/server/latest/admin_manual/configuration_server/config_sample_php_parameters.html

Some requests are served directly by fastcgi on /cloud, the /(caldav|carddav|webdav) are proxied; I'm not sure if it's adding X-Forwarded-For headers there or not
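
The trusted_proxies idea could be applied without hand-editing config.php by using Nextcloud's occ tool, in the same style as the app:disable command earlier in the thread. This is an untested sketch: trust_local_proxy is a hypothetical wrapper, index 0 is assumed to be unused in any existing trusted_proxies array, and whether this setting alone stops 127.0.0.1 from being throttled is exactly the open question here.

```shell
#!/bin/sh
# How occ is invoked. Run from the Nextcloud directory, or override OCC
# to suit your install (assumption: same invocation as app:disable above).
OCC=${OCC:-"sudo -u www-data php8.0 occ"}

# Hypothetical wrapper: add 127.0.0.1 as trusted proxy entry 0, then
# print the resulting list so the change can be checked by eye.
trust_local_proxy() {
    # $OCC is deliberately unquoted so that it splits into separate words.
    $OCC config:system:set trusted_proxies 0 --value=127.0.0.1 &&
        $OCC config:system:get trusted_proxies
}
```

Setting OCC=echo before calling trust_local_proxy prints the two occ commands instead of running them, which is a cheap way to review them first.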

matidau commented 1 year ago

Maybe that's something to raise in the Mail-in-a-Box issues. I was aiming for minimal changes to make it easier to review, i.e. just pointing to this repo in the setup scripts.

But it might be worthwhile to give a better upgrade experience for ActiveSync admins and users.