datajoint / datajoint-matlab

Relational data pipelines for the science lab
MIT License

Parallelize file transfer for external blobs #367

Open shenshan opened 3 years ago

shenshan commented 3 years ago

Feature Request

Problem

There is a large performance difference between fetching from internal blob fields and fetching from external blob fields.

This is the same issue as the one reported for the Python client: https://github.com/datajoint/datajoint-python/issues/806

Requirements

We should consider parallelizing the file downloads from external storage.
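To make the request concrete, here is a minimal sketch of what parallel transfer could look like. It is written in Python for illustration only and does not use any DataJoint API: `copy_one` and `fetch_parallel` are hypothetical helper names, and it assumes the external store is reachable as ordinary file paths (for example a mounted shared drive).

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
import shutil


def copy_one(src: Path, dest_dir: Path) -> Path:
    """Copy a single file into dest_dir and return the new path (hypothetical helper)."""
    dest = dest_dir / src.name
    shutil.copyfile(src, dest)
    return dest


def fetch_parallel(sources, dest_dir, max_workers=8):
    """Copy many files concurrently; results keep the order of `sources` (hypothetical helper)."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order and re-raises a worker's exception
        # when its result is consumed by list().
        return list(pool.map(lambda s: copy_one(Path(s), dest_dir), sources))
```

Threads are a reasonable fit here because file copies are I/O-bound, so the workers spend most of their time waiting on the storage backend rather than competing for CPU. The right `max_workers` value would depend on the storage and network, and would need to be measured.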

Benchmark

Here is a benchmark plot showing the performance difference. It was run against the Princeton database from a computer in the lab, with the external storage on a shared drive at the institute.

[Image: benchmark plot]

shenshan commented 3 years ago
[Screenshots: Screen Shot 2021-06-04 at 12 08 06 PM, Screen Shot 2021-06-04 at 11 38 29 AM]