Open iago-lito opened 11 months ago
So, this artefactual dupes mail file is twice the size of archive and contains only duplicates, right?
You're assuming TB properly recognizes the duplicated messages + meta-data as two distinct messages. That may not be the case. Also, TB may be failing to retrieve the message bodies properly when you manipulate mbox files like that.
Still, if you can send me a compressed mbox file you've generated this way (via email or even here with an attachment), with 2x2 messages, which is supposed to have 2 dupe sets of size 2, but is not found to have them - I could try to reproduce and work on a fix.
Please note that my availability until late this month is rather low.
Did you remove all other criteria?
Did you remove all other criteria?
Not when I wrote the OP, but I have now tested with only Body selected, and the same thing happens.
You're assuming TB properly recognizes the duplicated messages + meta-data as two distinct messages. That may not be the case. Also, TB may be failing to retrieve the message bodies properly when you manipulate mbox files like that.
FWIU, mbox files are just text files containing all messages in a ^From-separated sequence, so I think it does make sense to concatenate two files like this. I was also convinced when I saw that TB correctly interpreted the result.
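To make the concatenation argument concrete, here is a minimal sketch using Python's standard mailbox module rather than Thunderbird itself. The two dummy messages, the addresses and the file names archive and dupes are made up for illustration. It only shows that a generic mbox parser sees twice as many messages after a file is concatenated with itself; it says nothing about how TB or removedupes read the file.

```python
import mailbox
import os
import tempfile


def write_toy_mbox(path):
    """Create an mbox file holding two dummy messages (illustrative content)."""
    box = mailbox.mbox(path)
    try:
        box.add("From: a@example.com\nSubject: one\n\nFirst body\n")
        box.add("From: b@example.com\nSubject: two\n\nSecond body\n")
    finally:
        box.close()  # close() flushes pending appends to disk


def concatenate_twice(src, dst):
    """Write dst as the byte-for-byte concatenation of src with itself."""
    with open(src, "rb") as f:
        data = f.read()
    with open(dst, "wb") as f:
        f.write(data + data)


def count_messages(path):
    """Count the messages a generic mbox parser finds in the file."""
    box = mailbox.mbox(path, create=False)
    try:
        return len(box)
    finally:
        box.close()


workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "archive")  # hypothetical file names
dupes = os.path.join(workdir, "dupes")

write_toy_mbox(archive)
concatenate_twice(archive, dupes)

archive_count = count_messages(archive)
dupes_count = count_messages(dupes)
print(archive_count, dupes_count)
```

If the second count is double the first, the concatenated file is at least well-formed from this parser's point of view.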
if you can send me a compressed mbox file you've generated this way
There you go. This is not compressed but very small. I have crafted a toy example from only two dummy messages. The second file is just the concatenation of twice the first file so it contains no extra information. This is a rather minimal example that I have been able to reproduce the bug with:
I would prefer that these two URLs not linger online for too long. Can you please tell me when you have the files on your side so I can remove them?
Please note that my availability until late this month is rather low.
No worries, thank you for removedupes <3
I'll try to find time to look at this next week; if I haven't, please poke me again. With work, plus anti-war activities, plus other repositories of mine (cuda-api-wrappers) - I'm kind of swamped.
Take your time :) Do you have the files on your side so I can take them offline?
Friendly ping @eyalroz, but maybe you're not out of the swamp yet..
Joining iago-lito: same issue for a while now (115). Another friendly ping @eyalroz
I was surprised to get "No duplicates were found" in pretty much any situation involving the Body comparison criterion. So I closed Thunderbird, went into its Mail folder, grabbed some raw archive mbox file and tried the following:
So, this artefactual dupes mail file is twice the size of archive and contains only duplicates, right?
Opening Thunderbird again, right-clicking on the new dupes mail folder and searching for duplicates yielded "No duplicates were found." again. I therefore suspect there is a bug in the Body comparison.
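For anyone wanting to check such a file independently of Thunderbird, here is a minimal sketch that groups messages by a hash of their raw body. This is not how removedupes compares bodies (that logic is not shown in this thread); the function name, the sample messages and the hashing choice are assumptions made purely for illustration.

```python
import hashlib
import mailbox
import os
import tempfile
from collections import defaultdict

# Two dummy messages in mbox form (illustrative content only).
RAW = (
    b"From MAILER-DAEMON Thu Jan  1 00:00:00 1970\n"
    b"From: a@example.com\nSubject: one\n\nFirst body\n\n"
    b"From MAILER-DAEMON Thu Jan  1 00:00:00 1970\n"
    b"From: b@example.com\nSubject: two\n\nSecond body\n\n"
)


def duplicate_sets_by_body(path):
    """Return lists of message keys whose raw bodies are byte-identical."""
    groups = defaultdict(list)
    box = mailbox.mbox(path, create=False)
    try:
        for key in box.iterkeys():
            raw = box.get_bytes(key)
            # Headers end at the first blank line; the rest is the body.
            _headers, _sep, body = raw.partition(b"\n\n")
            # Ignore trailing newlines so separator handling cannot matter.
            digest = hashlib.sha256(body.rstrip(b"\r\n")).hexdigest()
            groups[digest].append(key)
    finally:
        box.close()
    return [keys for keys in groups.values() if len(keys) > 1]


# Hypothetical "dupes" file: the toy mbox concatenated with itself.
dupes_path = os.path.join(tempfile.mkdtemp(), "dupes")
with open(dupes_path, "wb") as f:
    f.write(RAW + RAW)

dupe_sets = duplicate_sets_by_body(dupes_path)
print(dupe_sets)
```

If a check along these lines reports two sets of two messages while the add-on reports none, that points toward how the bodies are retrieved or compared inside Thunderbird rather than toward the file itself.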