Closed vielhuber closed 2 years ago
Hi @vielhuber
Thanks for opening the issue!
I have two doubts about this functionality.
Firstly, in terms of the library, adding functionality to do partial backups is a delicate matter, and there may be a million different use cases in this area for different users.
Secondly, and more importantly, I think this may be technically difficult.
Due to the current implementation, the application does not know the dates of emails it hasn't downloaded. So, to decide where the first email of "today - N" is, it would have to scan through all remote emails, until it found the first one. It couldn't rely on checking the emails it's already downloaded, as this would fail if backups are not done for N days.
In order to get over this problem, one would probably have to implement something like a bisection search over metadata, which is a major job.
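To give a rough idea of what such a bisection could look like, here is a minimal sketch in plain Ruby. It is not imap-backup code: `MAILBOX` and `first_uid_since` are made-up names, the hash stands in for per-message metadata fetches from the server, and the whole approach assumes that message dates never decrease as UIDs increase, which a real server does not guarantee.

```ruby
require "date"

# Hypothetical stand-in for a remote mailbox: UID => received date.
# In a real run, each lookup would be one metadata fetch from the server.
MAILBOX = {
  101 => Date.new(2021, 1, 5),
  102 => Date.new(2021, 3, 9),
  105 => Date.new(2021, 6, 1),
  110 => Date.new(2021, 6, 20),
  111 => Date.new(2021, 7, 2),
  120 => Date.new(2021, 7, 3)
}.freeze

# Returns [first UID dated on or after cutoff (or nil), number of lookups].
# Assumption: dates never decrease as UIDs increase.
def first_uid_since(uids, cutoff, &fetch_date)
  lookups = 0
  index = uids.bsearch_index do |uid|
    lookups += 1
    fetch_date.call(uid) >= cutoff
  end
  [index && uids[index], lookups]
end

uids = MAILBOX.keys.sort
uid, lookups = first_uid_since(uids, Date.new(2021, 6, 15)) { |u| MAILBOX.fetch(u) }
puts "first UID to back up: #{uid.inspect} (#{lookups} lookups for #{uids.length} messages)"
```

The point of the sketch is only that the number of date lookups grows with the logarithm of the mailbox size rather than with the size itself. Handling out-of-order dates, deleted messages and UID validity changes is where the real work would be.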
If I understand your use case correctly, you have mailboxes that are very big before you start running backups.
So, you want to start by ignoring most (or all) of the previous emails, and just start from the most recent ones.
If the above describes your needs, you could get a list of email ids (UIDs) before the first backup and load them into the ".imap" file that imap-backup uses to decide what to back up. That way only subsequent emails would get backed up.
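The idea can be illustrated with plain arrays, without any IMAP connection. This is only a model under the assumption that a backup run downloads the UIDs found on the server minus the UIDs already recorded locally; `remote`, `local` and `uids_to_download` are hypothetical names, not part of imap-backup.

```ruby
# Plain-Ruby model of the idea; no IMAP connection is involved.
# `remote` stands in for the server's UID list, `local` for the UIDs
# already recorded in the local metadata.

# Assumption: a backup run only downloads UIDs not yet known locally.
def uids_to_download(remote, local)
  remote - local
end

remote = [1, 2, 3, 4, 5] # large existing mailbox
local = []               # nothing backed up yet

# Without pre-filling, the first run would fetch everything.
puts "first run, no pre-fill: #{uids_to_download(remote, local).inspect}"

# Pre-fill the local record with every UID currently on the server.
local |= remote
puts "first run, pre-filled:  #{uids_to_download(remote, local).inspect}"

# Two new messages arrive; only those are picked up.
remote += [6, 7]
puts "next run:               #{uids_to_download(remote, local).inspect}"
```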
Here's a possible implementation for a single folder, YMMV!
#!/usr/bin/env ruby

require "bundler/inline"

gemfile do
  source "https://rubygems.org"
  gem "imap-backup", git: "https://github.com/joeyates/imap-backup"
end

email = ARGV[0] or raise "Please supply an email"
folder_name = ARGV[1] or raise "Please supply a folder"

# Look up the account in the existing imap-backup configuration
connections = Imap::Backup::Configuration::List.new
account = connections.accounts.find { |a| a[:username] == email }
raise "#{email} is not a configured account" if !account

connection = Imap::Backup::Account::Connection.new(account)
folder = Imap::Backup::Account::Folder.new(connection, folder_name)
serializer = Imap::Backup::Serializer::Mbox.new(connection.local_path, folder_name)

# UIDs that exist on the server but have no local copy yet
uids = folder.uids - serializer.uids
serializer.apply_uid_validity(folder.uid_validity)

# Save a small placeholder under each UID so it counts as backed up
uids.each do |uid|
  message = <<~MESSAGE
    From: fake@email.com
    Subject: Message #{uid} not backed up
    Skipped #{uid}
  MESSAGE
  serializer.save(uid, message)
end
Usage:
$ ./stuff-uids.rb EMAIL FOLDER
An "ignore everything before today" command like the above could be added to imap-backup, but, again, we would need to consider how widely useful this is.
Thank you for this detailed and precise answer.
I have successfully tested the solution "stuff-uids.rb", which works as intended.
Unfortunately, what is still missing for production use is automatic detection of the folder names (each account has a different folder structure).
Is it possible to add this in the script or give me a hint how to achieve this?
In general, I share your concerns and challenges with this extension, but I would like to name one more use case:
With many backup solutions (even those that are incremental), it makes more sense to have multiple small files than one very large file. So my plan would be:
This way you have a weekly archive with small files that you can back up very well.
This would also solve another problem: if e-mails are accidentally deleted, I can restore them afterwards very easily.
@vielhuber
In the program, you would have to put the existing code in a loop, for each folder:
#!/usr/bin/env ruby

require "bundler/inline"

gemfile do
  source "https://rubygems.org"
  gem "imap-backup", git: "https://github.com/joeyates/imap-backup"
end

email = ARGV[0] or raise "Please supply an email"

# Saves a placeholder for every message in the folder that has no local copy
def fill_folder_with_dummy_messages(connection, folder_name)
  folder = Imap::Backup::Account::Folder.new(connection, folder_name)
  return if !folder.exist?

  Imap::Backup.logger.info "Folder '#{folder_name}'"
  serializer = Imap::Backup::Serializer::Mbox.new(connection.local_path, folder_name)
  uids = folder.uids - serializer.uids
  Imap::Backup.logger.info "#{uids.length} messages"
  serializer.apply_uid_validity(folder.uid_validity)

  uids.each do |uid|
    message = <<~MESSAGE
      From: fake@email.com
      Subject: Message #{uid} not backed up
      Skipped #{uid}
    MESSAGE
    serializer.save(uid, message)
  end
end

connections = Imap::Backup::Configuration::List.new
account = connections.accounts.find { |a| a[:username] == email }
raise "#{email} is not a configured account" if !account

connection = Imap::Backup::Account::Connection.new(account)
Imap::Backup.logger.info "Filling local folders for #{email} with dummy messages"

# Repeat the single-folder logic for every folder of the account
connection.folders.each do |folder_name|
  fill_folder_with_dummy_messages(connection, folder_name)
end
On the other hand, instead of modifying the code, you could get the list of folders via the command line, then call the existing script for each folder.
You can list all the folders for an account like this:
$ imap-backup folders --accounts EMAIL
It outputs the email, followed by all folders.
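As a rough sketch of the command-line route, the following Ruby turns that output into one `stuff-uids.rb` call per folder. It assumes the output has the email on the first line and one folder per line after that; the sample output is made up, and `folder_names` and `stuff_commands` are hypothetical helper names.

```ruby
require "shellwords"

# Assumed output shape: first line is the email, each later line one folder.
def folder_names(output)
  output.lines.map(&:strip).reject(&:empty?).drop(1)
end

# Builds one shell-escaped command line per folder.
def stuff_commands(email, output, script: "./stuff-uids.rb")
  folder_names(output).map do |folder|
    [script, email, folder].shelljoin
  end
end

# Sample output, made up for illustration.
sample = <<~OUTPUT
  me@example.com
  INBOX
  Sent Items
  Archive/2020
OUTPUT

stuff_commands("me@example.com", sample).each { |cmd| puts cmd }
# To use real data, replace `sample` with the captured output of:
#   imap-backup folders --accounts EMAIL
```

Escaping matters here because folder names often contain spaces; the printed lines can be reviewed first and then run from a shell.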
Thank you very much, this works very well and I think the concept is quite flexible. Perhaps you could consider adding it to core.
Just one enhancement for the above script (I needed this):
Line 7:
Before: gem "imap-backup", git: "https://github.com/joeyates/imap-backup"
After: gem "imap-backup", git: "https://github.com/joeyates/imap-backup", branch: "main"
Since branch v4.0.7 (and even v4.0.6) I get an error running the above script:
stuff-uids.rb:32:in `<main>': uninitialized constant Imap::Backup::Configuration::List (NameError)
Hi @vielhuber
The code around Configurations and Accounts has changed over the last few versions.
The functionality you need should be available now via the ignore-history command:
$ imap-backup utils ignore-history EMAIL
Awesome, seems to work.
One note after the last update to 4.0.7:
I get this notice when running any command:
/var/lib/gems/2.7.0/gems/thor-1.1.0/lib/thor/error.rb:105: warning: constant DidYouMean::SPELL_CHECKERS is deprecated
Calling `DidYouMean::SPELL_CHECKERS.merge!(error_name => spell_checker)' has been deprecated. Please call `DidYouMean.correct_error(error_name, spell_checker)' instead.
Perhaps you could have a look at that.
Unfortunately, it does not work: it just creates an empty folder without any mbox files in it.
imap-backup utils ignore-history my@email.com
I get a warning (but this should have nothing to do with the problem, since I'm getting this warning on every call):
Calling `DidYouMean::SPELL_CHECKERS.merge!(error_name => spell_checker)' has been deprecated. Please call `DidYouMean.correct_error(error_name, spell_checker)' instead.
I also updated to ruby 3.0.0.
Do you have any suggestions?
Hi @vielhuber
I've pushed a bugfix for your ignore-history problem as version 4.1.1
From a quick search, I think the DidYouMean::SPELL_CHECKERS.merge!(error_name => spell_checker) warning relates to a problem with Bundler.
Thank you very much. 4.1.1 works without any problems and also without the SPELL_CHECKERS warning.
Hello!
Would it be possible to add a flag so that only the emails of the last N days are backed up?
We have very large mailboxes and suffer from caches that are too small, upload times that are too long, etc.
If imap-backup backed up only mails from the last N days, for example, we could overcome those issues.