eyalroz / removedupes

Remove Duplicate Messages
https://addons.thunderbird.net/en-US/thunderbird/addon/removedupes/

Implement more visible progress indication (e.g. progress bar) during dupe check #105

Open kriegaex opened 1 year ago

kriegaex commented 1 year ago

While a dupe check is running, possibly taking half an hour or longer due to IMAP refresh, there is no easy way for the user to find out whether the extension is still actually doing something, has run into an error, or anything else. It would be helpful to see a modal progress dialogue (in which I can also cancel, if it is taking too long and I want to continue writing e-mails instead) or at least messages in the status bar. Even if errors do occur, they are only written into the logs, not shown to the user.

eyalroz commented 1 year ago

There's progress indication in the task bar, counting "X of Y messages processed". Do you not see it?

kriegaex commented 1 year ago

I do not see it.

eyalroz commented 1 year ago

Are you getting any errors in the developer console? Those might explain why the indication is missing.

kriegaex commented 1 year ago

The errors in #108 sometimes.

BenjamenMeyer commented 1 year ago

I'll second the lack of progress indicator.

In my case, something random happened (don't know exactly what or how or why) and I ended up with recursive duplicates of my local storage inbox, with 19 embedded copies. (All IMAP data is pulled to local and then processed.) My data storage is normally around 20 GB right now; however, this exploded the size to 330 GB. I'm trying to use this to dedup and clean up and get back to where I should be. The UI is currently in a semi-hung state: it's not responding 99% of the time, but enough that the application is still continuing to work. (BTW I set it to offline mode before starting.) When it first started I saw the progress bar going through while it was discovering folders, but that stopped at some point, and now I have no feedback at all.

It'd be great if this could be split out to a secondary UI where updates could be more frequently sent and numbers tracked. I'd submit PRs but JS isn't my forte and I don't have the time to pick it up right now and learn a whole new ecosystem. (I'm trying to recover from the disaster that occurred.)

eyalroz commented 1 year ago

In the meantime, I've restored status bar visibility... Please try this build:

removedupes_0.5.3b2_tbird.zip

... and let me know if you see progress indication on the status bar every few seconds.

kriegaex commented 1 year ago

I am so sorry, but this issue is now 5 months old. Meanwhile, I have cleaned out all relevant dupes in my big e-mail database, so there is nothing I can test at the moment. You probably understand that I cannot just delete important messages from my IMAP accounts in order to run a test for the improved status bar visibility. I was only able to test while I still had the original problem(s), resulting in the issues I created here and which you so kindly answered.

eyalroz commented 1 year ago

@kriegaex : Ah, I should have made my suggestion clearer... you don't need to have any duplicates, just do a dupe search on, well, all of your account, or all of your local folders etc. Nothing will get moved to trash if you have dialog review, regardless of whether there are dupes or not.

But of course I realize that you might be wary of running anything with the word "removal" in it on important emails, regardless of any assurances I can make...

kriegaex commented 1 year ago

Hm, maybe we have a misunderstanding here, because the subject line ends with "during dupe check". I think it was rather about dupe removal, because IIRC dupe check as such is super fast, while the removal process takes ages in my IMAP accounts, causing the need for a progress bar. When I am back near my other computer with TB installed, I can however give it a spin and see if I somehow misremembered and it was about dupe check. Stay tuned. Maybe tomorrow, I can take a look.

eyalroz commented 1 year ago

@kriegaex : Oh... oh. Right. Then, I did misunderstand you. Notification during dupe removal... I never thought about that. But - Thunderbird itself should inform you about deletion progress. I don't delete messages "myself", I ask TB to do it. My code doesn't get any progress reports, it's fire and forget.

As for the speed of dupe checks - try checking half a million messages, or try comparing bodies as well as headers, and you can see it can take some time.

kriegaex commented 1 year ago

I tried to compare with and without full text in combination with other criteria like date or subject. In all cases, across thousands of messages, the search result popped up in less than a second. When I compared by message text only, it took a few seconds, but that was OK. I tested this with the latest release, not with your preview version. For me, there really is no pressing need for this improvement, because like I said, the slow part was the removal process, during which I got no progress feedback and no indicator of how much longer it would take.

I am sure your improvement works as expected, so please keep it, but it is not what I had in mind here. I apologise if my description was unclear and caused you extra work.

> Notification during dupe removal... I never thought about that. But - Thunderbird itself should inform you about deletion progress. I don't delete messages "myself", I ask TB to do it. My code doesn't get any progress reports, it's fire and forget.

Is there any way to make the "fire & forget" part more intelligent, e.g. by issuing deletion on IMAP accounts message by message, in chunks of 10 (maybe user-configurable, the default being all) or whatever? I have not thought this through, I am just babbling in rubber duck mode, hoping to inspire ideas in your mind somehow.

eyalroz commented 1 year ago

Yes, something like that is possible. It could theoretically slow down the deletion overall, but progress reporting is important. I already separate deletions by source folder, and I could update the status bar between folders. I could also break up the per-folder sets of headers, and delete-and-update. Another possibility is to use timeout-callbacks and check on the deletion status every once in a while.
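
The "break up the per-folder sets of headers, and delete-and-update" option could look roughly like the sketch below. It is not code from removedupes and does not use any real Thunderbird API: `deleteBatch`, `reportProgress`, and `shouldCancel` are hypothetical callbacks standing in for whatever call actually asks Thunderbird to delete messages, whatever updates the status bar, and whatever a cancel button would set. The chunk size defaults to 10 only because that number was floated earlier in the thread.

```javascript
// Hypothetical sketch: none of these names come from removedupes or Thunderbird.

// Split an array into consecutive chunks of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("chunk size must be a positive integer");
  }
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Delete messages folder by folder, in chunks, reporting progress after each chunk.
//   messagesByFolder: Map of folder name -> array of message headers
//   deleteBatch(folder, headers): assumed async function that requests deletion of one batch
//   reportProgress(done, total): assumed callback, e.g. one that updates a status bar
//   options.shouldCancel(): assumed callback, checked before each chunk
async function deleteInChunks(messagesByFolder, deleteBatch, reportProgress, options = {}) {
  const { chunkSize = 10, shouldCancel = () => false } = options;

  let total = 0;
  for (const headers of messagesByFolder.values()) {
    total += headers.length;
  }

  let done = 0;
  reportProgress(done, total);

  for (const [folder, headers] of messagesByFolder) {
    for (const batch of chunk(headers, chunkSize)) {
      if (shouldCancel()) {
        return { done, total, cancelled: true };
      }
      await deleteBatch(folder, batch);
      done += batch.length;
      reportProgress(done, total);
      // Yield to the event loop so the UI gets a chance to repaint between chunks.
      await new Promise((resolve) => setTimeout(resolve, 0));
    }
  }
  return { done, total, cancelled: false };
}
```

With a `reportProgress` that writes "X of Y messages deleted" to the status bar, each finished chunk produces a visible update, at the cost of more, smaller deletion requests. Whether the progress is accurate depends on `deleteBatch` resolving only once the batch is really gone, which the thread suggests is not the case for a fire-and-forget request.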

BenjamenMeyer commented 1 year ago

> Yes, something like that is possible. It could theoretically slow down the deletion overall, but progress reporting is important. I already separate deletions by source folder, and I could update the status bar between folders. I could also break up the per-folder sets of headers, and delete-and-update. Another possibility is to use timeout-callbacks and check on the deletion status every once in a while.

Even if it slows down the process, it might keep Thunderbird more responsive. I am still working through my inbox from an incident in December and having to clean up: consolidating copies and then using your dedup tool on tens of thousands (sometimes well over 100k) of messages. Thunderbird often locks up in its own processing, especially when it's rebuilding the index data. Even if it takes longer, being able to see progress and having the GUI functional would be a major improvement.

I know the real improvements need to happen on Thunderbird's side itself, but any little improvement can go a long way.