etesync / etesync-dav

This is a CalDAV and CardDAV adapter for EteSync
https://www.etesync.com
GNU General Public License v3.0
285 stars 45 forks source link

10 second lag making calendar changes in Thunderbird #237

Open BlueDrink9 opened 2 years ago

BlueDrink9 commented 2 years ago

I am using Thunderbird, with TBSync syncing to etesync, to manage my calendars. Making any change to an event, or creating a new one, has a lag of between 5-10 seconds (it is huge!) before the change is visible in the calendar.

Forgive me if this is not the right repo for this issue. I am tearing my hair out trying to fix or diagnose this, and I would appreciate any help. Am I understanding correctly that TBSync should sync locally to the dav, and immediately show changes in thunderbird?

This has been happening since I started using etesync around March, so for many versions

Happens under both Windows and Linux (arch and manjaro).

Current setup:

Thunderbird 91.3.0 TBSync 3.0.1 Caldav provider 2.0.2 Etesync 0.30.8

I have tried using the logs, but it outputs a lot every time I make a change, and I don't want to post it publicly. The last line after changing an event reads

207 Multi-Status [2021-11-25 22:42:08 +1300] [19372/Thread-58] [INFO] HEAD response status for '/user/y_y/x.ics' in 6.197 seconds: 200 OK

BlueDrink9 commented 2 years ago

If it makes a difference, I am in New Zealand (presumably very far away from Etesync's servers)

nickbearman commented 2 years ago

I have the same issue, @BlueDrink9 you are not alone! I thought it was maybe a Thunderbird issue - but it could be EteSync. Following the Issue!

tasn commented 2 years ago

Interesting. We used to have this issue for people in NZ/Australia, but I thought we fixed it. The sync should now happen in the background so it shouldn't be blocking. I think it's also a Thunderbird issue (as in, they should just add the item locally and worst case change it later), but EteSync should be instantaneous. Returning after 6.197 seconds is not normal. :|

Now that I think about it, I wonder whether we broke it at some point for writes (definitely fixed it for reads).

BlueDrink9 commented 2 years ago

Yeah, I read the blog posts about the local sync change, and thought "Oh great, can't wait for this to be released". Then I realised the post was from over a year ago

BlueDrink9 commented 2 years ago

Let me know what steps I can take to help you diagnose

nickbearman commented 2 years ago

If it makes a difference, I am in New Zealand (presumably very far away from Etesync's servers)

For ref, I am in the UK, so presumably nearer to EteSync's servers?!

nickbearman commented 2 years ago

I am using Windows Current setup: Thunderbird 91.4.0 TBSync 3.0.1 Caldav provider (Provider for CalDAV & CardDAV) 2.0.2

Let me know if you need any more help for troubleshooting

nickbearman commented 2 years ago

Logging, I get this: [HTTP/1.0 207 Multi-Status 10895ms] So about 10 seconds.

BlueDrink9 commented 2 years ago

@tasn anything we can do to help get this sorted soon? It is truly maddening

Peter4487 commented 2 years ago

I also am experiencing this problem

Peter4487 commented 2 years ago

Attached are some logs generated using linux-amd64-etesync-dav -D |& tee linux-amd64-etesync-dav_logs5.txt (just in case anyone else is wondering how to turn on logging & send to file).

What I am doing at [2022-03-09 14:50:46 -0500] is creating an event in Thunderbird using the right-click on the calendar-->New Event method. Same delays if dragging on the calendar itself.

EteSync DAV version: 0.31.2 Radicale version: 3.0.3

I have etesync-dav hooked up to Thunderbird directly (now that it has native caldav / carddav support). I saw this same behavior, though, when using TbSync. I have been experiencing this behavior for many versions of EteSync, so it's not 0.31.2.

There is a long pause at END:VCALENDAR (line 90) and also a long pause at [2022-03-09 14:50:56 -0500] [29178/Thread-4] [DEBUG] Client provided sync token: '' (line 252)

as of

[2022-03-09 14:51:04 -0500] [29178/Thread-4] [INFO] REPORT response status for '/Some_Account_Name/il2_lR3-randomalphaalEIFE92d2owjk/' with depth '1' in 7.895 seconds: 207 Multi-Status (line 9619) the event has appeared.

linux-amd64-etesync-dav_logs5_sanitized.txt

BlueDrink9 commented 2 years ago

@tasn please can you update us with some idea of your dev capacity atm/intentions about looking into this / prioritisation compared to other bugs. Any customer communication on the issue would be welcome

tasn commented 2 years ago

Hey sorry, for missing this. Been focusing more on the server recently.

As for the issue, I still stand behind what I said before:

Interesting. We used to have this issue for people in NZ/Australia, but I thought we fixed it. The sync should now happen in the background so it shouldn't be blocking.

As for this:

For ref, I am in the UK, so presumably nearer to EteSync's servers?! Servers are in Europe, so yeah.

As for figuring out what's wrong: There is a sleep in the code:

        # At most wait for 5 seconds before returning stale data
        etesync.sync_thread.wait_for_sync(5)

So I wonder if this has anything to do with things.

I think the way to get it sorted is to add A LOT of debug prints to the EteSyncItem and Storage classes in etesync_dav/radicale/storage.py
And hope it gets us the information we need.

BlueDrink9 commented 2 years ago

Thanks for the update. I assumed you mean "I thought we fixed it (but may have been wrong and will revisit the fix)".

Regarding the debug prints, is that something you want one of us to do?

tasn commented 2 years ago

Thanks for the update. I assumed you mean "I thought we fixed it (but may have been wrong and will revisit the fix)".

It was 100% definitely fixed as confirmed by customers at the time. It looks like something else may have caused it to resurface. I'm just not sure what exactly, and without knowing where it hangs it'd be hard to figure it out.

Regarding the debug prints, is that something you want one of us to do?

If you can! Just litter the code with print("XXX", datetime.now()) (or there are tools that can maybe help with that). Essentially, we are looking for the line(s?) that block.

BlueDrink9 commented 2 years ago

Using the cProfiler class on etesync-dav, this is the pstats file I get out etesync_profile.zip .

I'll try and analyse it now but might be easier for you. (I tried using spyder to profile but it isn't working on my machine)

BlueDrink9 commented 2 years ago

profile.csv Converted to csv

BlueDrink9 commented 2 years ago

All I can see at the moment is that all of the delay is happening in select calls. Can't figure out where they're occurring

BlueDrink9 commented 2 years ago

Looks like almost all of the time can be attributed here https://github.com/etesync/etesync-dav/blob/edfe07daa1f1e973823e886d1b70312c78614bf5/etesync_dav/radicale_main/server.py#L282:

BlueDrink9 commented 2 years ago

Not sure how much else I can add from here. Let me know if you need more specific info and please suggest explicitly how to obtain it

tasn commented 2 years ago

This code is taken verbatim from Radicale, though maybe they changed something in a newer version. That would explain the break. I'll try to take a look at that.

Though I also think that this is just the top-most place in the code so it encompasses whatever is slow, so it could be that too.

tasn commented 2 years ago

I wonder, can you try now? We just deployed a fix that fixed some slow queries in the etesync server, I wonder if you were one of the unlucky affected ones.

BlueDrink9 commented 2 years ago

I expected it to be an encompassing issue, I was expecting you might direct me somewhere else to profile.

I tried again, still experiencing slow event creation and updates in Thunderbird. No change 😞

Peter4487 commented 2 years ago

I figured I'd try updating radicale to see what happened with this issue.

I installed radicale 3.1.7 from bullseye-backports, and when I run radicale --version I see 3.1.7.

When I run linux-amd64-etesyc-dav --version I see Radicale version: 3.0.3 ... is it in its own venv or something similar?

And, @tasn in the latest release notes (0.31.2) you say it's working for radicale 3.1.0...hmm. Should I be running 3.1.0? How do I do that?

Zvezdin commented 1 year ago

I can also confirm that I'm having this issue (and when I look, back, I think that I've always had it for the last few years). It appears both when using TBSync, but also when syncing the calendars directly without extensions. It might be related to #267 that I opened some time ago. Like the others have said, every write to etesync-dav takes about 10 secs, so the thunderbird UI blocks (e.g. if you dismiss a reminder, if you create or move an event, if you accept a calendar invitation, it's kind of a usability killer). Here's an example log line from me dismissing a reminder:

[2022-07-25 14:51:18 +0300] [87523/Thread-1412] [INFO] REPORT response status for '/XXX/' with depth '1' in 9.119 seconds: 207 Multi-Status. I also noticed that the etesync-dav process uses ~100% of the CPU while it's processing such a request (and I've got modern hardware).

Similarly to @Peter4487, etesync-dav --version returns 0.31.2 but radicale 3.0.3 . I'm using the mac binary.

Peter4487 commented 1 year ago

@tasn Is it possible to get some clarity on whether there is or will be work on this issue? I really appreciate your time spent on the project and know you have a lot to do in life, so no hard feelings either way. I would appreciate some clarity, though, because it's getting to be time to look for a new solution.

tasn commented 1 year ago

@Peter4487, I honestly don't know. I can potentially try towards the end of next week, but hard to commit. :(

tasn commented 1 year ago

Sorry, clicked send too soon: the main problem is, that it's an issue with Thunderbird, not with EteSync. Well, EteSync could do better, but Thunderbird definitely shouldn't be blocking the UI on this. Even if EteSync makes everything super fast, there's still the limit of the speed of light which means a hang of X seconds will happen anyway depending on how far you are from the server.

I really wish Thunderbird would just fix the thing. :(

Peter4487 commented 1 year ago

@tasn could you point us to the blocking Thunderbird issue?

BlueDrink9 commented 1 year ago

I thought that the current tb model was supposed to sync everything local-only, and etesync-dav would then have eventual consistency with the etesync main server. Is that process not quite so transparent to thunderbird?