BU-ISCIII / relecov-tools

Set of helper tools for assembling the different elements of the RELECOV platform (Spanish Network for genomic surveillance of SARS-CoV-2), such as data download, processing, validation and upload to public databases, as well as analysis runs and database storage.
GNU General Public License v3.0

Handle ENA API timeouts in upload-to-ena module #241

Open Shettland opened 11 months ago

Shettland commented 11 months ago

When uploading multiple fastq files, ENA's API usually throws a timeout that crashes the process. upload_to_ena currently works around this by dividing the samples into batches of 20 (a size chosen arbitrarily), but a timeout is sometimes still thrown, so some sort of retry should be implemented.
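For reference, the batching described above could be sketched roughly as follows. This is only an illustration: the helper name `split_into_batches` and the sample IDs are hypothetical and not taken from relecov-tools; only the batch size of 20 comes from the description.

```python
from typing import Iterator, List, Sequence


def split_into_batches(
    samples: Sequence[str], batch_size: int = 20
) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size samples (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(samples), batch_size):
        yield list(samples[start:start + batch_size])


# Made-up sample IDs, purely for illustration
sample_ids = [f"sample_{i}" for i in range(45)]
batch_sizes = [len(batch) for batch in split_into_batches(sample_ids)]
print(batch_sizes)  # 45 samples -> batches of 20, 20 and 5
```

Smaller batches shorten each request, but they do not stop any single request from timing out, which is why a retry is still needed.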

Shettland commented 10 months ago

Some sketch code that could work for this issue:

for file in file_list:
    # One initial attempt plus up to n retries
    for _ in range(n + 1):
        if self.upload_file_to_ena(file):
            uploaded_files.append(file)
            break
    else:
        # for-else: runs only when no attempt succeeded
        log.error("Unable to upload %s to ENA after %s retries", file, n)
        failed_files.append(file)
return uploaded_files, failed_files

This code would try to upload each file and, if the upload fails, try again up to n times, breaking out of the retry loop on success and logging a message if the file could not be uploaded.
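A standalone version of the same idea, with a pause between attempts, might look like the sketch below. The exponential backoff is an addition that the sketch above does not include, on the assumption that a timed-out API benefits from some breathing room. The function name `upload_with_retries` and its parameters are hypothetical, and the `upload` callable stands in for `self.upload_file_to_ena`, assumed to return True on success.

```python
import logging
import time
from typing import Callable, Iterable, List, Tuple

log = logging.getLogger(__name__)


def upload_with_retries(
    files: Iterable[str],
    upload: Callable[[str], bool],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[List[str], List[str]]:
    """Try each file once, then up to `retries` more times with growing delays."""
    uploaded: List[str] = []
    failed: List[str] = []
    for file in files:
        for attempt in range(retries + 1):
            if upload(file):
                uploaded.append(file)
                break
            if attempt < retries:
                # Wait base_delay, then 2x, 4x, ... before the next attempt
                sleep(base_delay * 2 ** attempt)
        else:
            # Reached only when every attempt failed
            log.error("Unable to upload %s after %d retries", file, retries)
            failed.append(file)
    return uploaded, failed
```

Passing `sleep` as a parameter lets the function be tested without real waiting. The returned `failed` list could then be reported to the user or fed into a later run.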