HumanBrainProject / neuroglancer-scripts

Conversion of neuroimaging data for display in Neuroglancer
MIT License
27 stars 18 forks source link

enh: improve read perf of http accessor by using requests.Session #33

Closed xgui3783 closed 1 year ago

xgui3783 commented 1 year ago

According to https://requests.readthedocs.io/en/latest/user/advanced/#session-objects , it appears that by using requests.Session, the underlying TCP connection will be reused, which should drastically improve repeated get requests to the same host.

From the (admittedly) rough test script[1], we see promising improvement (execution time cut by ~50% in sync reads and ~30% in threaded reads) [2]

[1] rough script used for testing both sync and threaded operations, we used time python <script_name>.py and manually recorded the result

from neuroglancer_scripts.http_accessor import HttpAccessor
from concurrent.futures import ThreadPoolExecutor
from itertools import repeat

multiple = 2

def get_args():
    return [
        (x, x+64, y, y+64, z, z+64)
        for x in range(0, 64*multiple, 64)
        for y in range(0, 64*multiple, 64)
        for z in range(0, 64*multiple, 64)
    ]

def main():
    accessor = HttpAccessor("https://neuroglancer.humanbrainproject.eu/precomputed/BigBrainRelease.2015/8bit")
    for arg in get_args:
        accessor.fetch_chunk("20um",arg)

def threaded():
    accessor = HttpAccessor("https://neuroglancer.humanbrainproject.eu/precomputed/BigBrainRelease.2015/8bit")
    with ThreadPoolExecutor(max_workers=4) as ex:
        for val in ex.map(
            accessor.fetch_chunk,
            repeat("20um"),
            get_args()
        ): pass

[2] truncated result

multiple sync or threaded previous current
2 sync 0m0.766s 0m0.315s
10 sync 1m24.788s 0m24.446s
2 threaded 0m0.336s 0m0.207s
10 threaded 0m24.379s 0m6.099s
xgui3783 commented 1 year ago

failed tests does not seem to be caused by this PR, but rather a more strict parsing of gifti file. see trace https://github.com/HumanBrainProject/neuroglancer-scripts/actions/runs/5541279128/jobs/10114475201#step:6:54

I can fix the issue in this PR too, or would you prefer to fix it in a separate PR, and I can rebase/merge this PR?

codecov[bot] commented 1 year ago

Codecov Report

Merging #33 (47fd3b1) into master (b7b5047) will increase coverage by 0.00%. The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master      #33   +/-   ##
=======================================
  Coverage   92.61%   92.61%           
=======================================
  Files          25       25           
  Lines        1489     1490    +1     
  Branches      219      219           
=======================================
+ Hits         1379     1380    +1     
  Misses         63       63           
  Partials       47       47           
Impacted Files Coverage Δ
src/neuroglancer_scripts/http_accessor.py 100.00% <100.00%> (ø)

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more