ecotaxa / ecotaxa_back

Backend of the EcoTaxa application
GNU General Public License v3.0

Large export is not re-importable (ImportError: You tried to import too many files, max. is 1000) #48

Open moi90 opened 2 years ago

moi90 commented 2 years ago

This project consists of many samples (4060 to be exact): https://ecotaxa.obs-vlfr.fr/prj/6433

When exporting with exp_type=BAK and split_by=S, the archive contains one TSV file per sample, so 4060 individual TSV files in this case. (Guessing from the UI, split_by should be ignored when doing a BAK export, but I consider this a feature.)
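
To check how many TSV files an exported archive contains before attempting a re-import, something like the following could be used. This is only a sketch using the standard `zipfile` module; the helper name `count_tsv_files` is made up for illustration and is not part of EcoTaxa, and it assumes that only the `.tsv` entries matter for the file count.

```python
import zipfile


def count_tsv_files(archive):
    """Count the .tsv entries in a zip archive.

    `archive` may be a path or a binary file-like object.
    Hypothetical helper, not part of EcoTaxa.
    """
    with zipfile.ZipFile(archive) as zf:
        return sum(1 for name in zf.namelist() if name.lower().endswith(".tsv"))
```

Calling `count_tsv_files("export.zip")` (with whatever name the downloaded archive has) returns the number of TSV files, which can be compared to the import limit.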

However, when re-importing the same data, I get an import error:

    You tried to import too many files, max. is 1000

    File "/usr/lib/python3.8/threading.py", line 890, in _bootstrap
      self._bootstrap_inner()
    File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
      self.run()
    File "/app/BG_operations/JobScheduler.py", line 40, in run
      sce.run_in_background()
    File "/app/API_operations/helpers/JobService.py", line 73, in run_in_background
      self.do_background()
    File "/app/API_operations/imports/Import.py", line 76, in do_background
      self.do_validate()
    File "/app/API_operations/imports/Import.py", line 115, in do_validate
      how, diag, nb_rows = self._collect_existing_and_validate(source_dir_or_zip, loaded_files)
    File "/app/API_operations/imports/Import.py", line 142, in _collect_existing_and_validate
      source_bundle = InBundle(source_dir_or_zip, bundle_temp_dir)
    File "/app/BO/Bundle.py", line 54, in __init__
      one_more()
    File "/app/BO/Bundle.py", line 49, in one_more
      raise ImportError("You tried to import too many files, max. is %d" % self.MAX_FILES)
    ImportError: You tried to import too many files, max. is 1000

This limit seems somewhat arbitrary, and I think EcoTaxa should be able to read the data that it has itself emitted.
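
For readers unfamiliar with the code, the traceback suggests a counter that is incremented once per file and raises as soon as it passes `MAX_FILES`. The following is a minimal sketch of that pattern, not the actual content of `BO/Bundle.py`: only the names `one_more` and `MAX_FILES` and the error message come from the traceback, while the class `FileCounter`, the function `count_files` and the exact comparison are assumptions.

```python
class FileCounter:
    """Hypothetical stand-in for the counting logic seen in the traceback."""

    MAX_FILES = 1000  # value taken from the error message

    def __init__(self, max_files=MAX_FILES):
        self.max_files = max_files
        self.count = 0

    def one_more(self):
        # Assumption: the limit is exceeded at file number max_files + 1.
        self.count += 1
        if self.count > self.max_files:
            raise ImportError(
                "You tried to import too many files, max. is %d" % self.max_files
            )


def count_files(names, max_files=FileCounter.MAX_FILES):
    """Count the given files, raising ImportError past the limit."""
    counter = FileCounter(max_files)
    for _ in names:
        counter.one_more()
    return counter.count
```

Under these assumptions, an archive with 1000 files passes, while the 4060 files of this export fail regardless of their content.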

grololo06 commented 1 year ago

Hello, the commit that introduced this limit is linked to https://github.com/ecotaxa/ecotaxa_front/issues/675, which is a legitimate attempt to protect the system from some kinds of errors. But indeed, a BAK export should be readable.

moi90 commented 1 year ago

Maybe this can be resolved on the export side, then? Split files if they contain more than 1000 objects? But this might break things on the user's side...
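
One possible reading of an export-side fix is to group several samples into each TSV file so that the archive never holds more files than the importer accepts. The sketch below only illustrates the grouping arithmetic. The function `plan_export_files` is hypothetical and not part of EcoTaxa, and it assumes that the import limit counts TSV files and that several samples may share one TSV.

```python
import math


def plan_export_files(sample_ids, max_files=1000):
    """Group sample IDs so that at most `max_files` TSV files are written.

    Hypothetical helper: returns a list of groups, one group per output file.
    """
    sample_ids = list(sample_ids)
    if not sample_ids:
        return []
    # Smallest group size that keeps the number of groups within the limit.
    per_file = math.ceil(len(sample_ids) / max_files)
    return [
        sample_ids[i:i + per_file]
        for i in range(0, len(sample_ids), per_file)
    ]
```

For the 4060 samples of this project and a limit of 1000, this puts 5 samples in each file, for 812 files in total. Projects with at most 1000 samples keep one file per sample, so their exports would not change.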