enram / vptstools

Python library to transfer and convert vertical profile time series data
https://enram.github.io/vptstools/
MIT License

Improve upload handling #39

Open stijnvanhoey opened 7 years ago

stijnvanhoey commented 7 years ago

To speed up uploads to S3, handling multiple files at the same time would be a huge improvement. A first option would be async, but as the boto3 library does not yet support async handling, that approach will not work for now. Working with multiple threads or in parallel would be a valid option to implement.
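As a rough illustration of the multi-threaded option, here is a minimal sketch using only the standard library. The `upload_many` helper and its `upload_one` callback are hypothetical names introduced for this example, not part of the existing code base. The actual S3 call is injected as a callable, so the sketch makes no assumptions about bucket or key naming.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable, Dict, Iterable, Optional


def upload_many(
    paths: Iterable[Path],
    upload_one: Callable[[Path], None],
    max_workers: int = 8,
) -> Dict[Path, Optional[Exception]]:
    """Call upload_one for every path using a pool of threads.

    Returns a mapping of path -> None on success, or the exception
    raised by upload_one for that path. One failed upload does not
    stop the others.
    """
    results: Dict[Path, Optional[Exception]] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything up front, then collect results as they finish.
        futures = {pool.submit(upload_one, path): path for path in paths}
        for future in as_completed(futures):
            # The future is already done here, so exception() does not block.
            results[futures[future]] = future.exception()
    return results
```

With boto3, `upload_one` could for example wrap a client's `upload_file(Filename, Bucket, Key)` call with a bucket and key scheme of your choice. Threads fit here because uploads are I/O-bound; whether a single client can be safely shared between threads should be checked against the boto3 documentation.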

peterdesmet commented 1 year ago

@stijnvanhoey This is an old issue from data-repository. Is this implemented in any way in vptstools?

stijnvanhoey commented 1 year ago

The vpts-creation has been implemented in parallel (using multiple processes, see https://github.com/enram/vptstools/blob/main/src/vptstools/vpts.py#L256-L264); the handling of multiple daily/monthly files at the same time has not. This was technically possible, but the foreseen server only had 1 core and very limited RAM, i.e. the vpts-creation uses the single core and parallelism is not used operationally.

This could be updated within the context of the new deployment. Each daily-file creation is self-contained and could be parallelized instead of run in a for-loop. This would require some refactoring of the vph5-CLI though.
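A minimal sketch of what replacing the for-loop could look like, assuming each day can be processed by an independent function. `create_daily_file`, `create_daily_files`, and the output file name pattern are hypothetical placeholders, not the actual vph5-CLI code. With `n_workers=1` it behaves like the current sequential loop, which also covers the single-core server case.

```python
from concurrent.futures import ProcessPoolExecutor
from datetime import date
from typing import Callable, Iterable, List


def create_daily_file(day: date) -> str:
    """Hypothetical stand-in for the self-contained work done for one day."""
    return f"vpts_{day:%Y%m%d}.csv"


def create_daily_files(
    days: Iterable[date],
    worker: Callable[[date], str] = create_daily_file,
    n_workers: int = 1,
) -> List[str]:
    """Create one file per day, sequentially or in a process pool."""
    if n_workers <= 1:
        # Same behaviour as a plain for-loop; suits a 1-core server.
        return [worker(day) for day in days]
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        # pool.map preserves the input order of the days.
        return list(pool.map(worker, days))
```

When calling this with `n_workers` greater than 1 from a script, the call should sit under an `if __name__ == "__main__":` guard, since some platforms start worker processes by re-importing the main module. The worker must also be a picklable, module-level function.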