Closed zmoon closed 1 year ago
Some initial timings suggest we can gain speed by using the `s3://` URL scheme
instead of the current HTTPS URLs:

    In [8]: %timeit -n 10 pd.read_json("https://openaq-fetches.s3.amazonaws.com/realtime/2013-11-26/2013-11-26.ndjson",
       ...: lines=True)
    311 ms ± 37.1 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

    In [9]: %timeit -n 10 pd.read_json("s3://openaq-fetches/realtime/2013-11-26/2013-11-26.ndjson", lines=True)
    159 ms ± 13.5 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

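
If we go this route, the existing HTTPS URLs could be rewritten mechanically. A minimal sketch, assuming the URLs are always in the virtual-hosted `<bucket>.s3.amazonaws.com` form shown above; `https_to_s3_url` is a hypothetical helper name, not something in the codebase:

```python
from urllib.parse import urlparse

# Assumption: only the global virtual-hosted-style endpoint is handled.
_S3_HOST_SUFFIX = ".s3.amazonaws.com"


def https_to_s3_url(url: str) -> str:
    """Convert https://<bucket>.s3.amazonaws.com/<key> to s3://<bucket>/<key>."""
    parsed = urlparse(url)
    host = parsed.netloc
    if parsed.scheme != "https" or not host.endswith(_S3_HOST_SUFFIX):
        raise ValueError(f"not a virtual-hosted S3 HTTPS URL: {url!r}")
    # Everything before the suffix is the bucket name.
    bucket = host[: -len(_S3_HOST_SUFFIX)]
    if not bucket:
        raise ValueError(f"missing bucket name in URL: {url!r}")
    # parsed.path already starts with "/", so it can be appended directly.
    return f"s3://{bucket}{parsed.path}"
```

The result can be passed to `pd.read_json(..., lines=True)` as in the timing above. Reading `s3://` URLs with pandas needs `s3fs` installed, and depending on the pandas version, `storage_options={"anon": True}` may be needed for anonymous access to the public bucket.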
similar to #113
- local time calc
- convert to ppmv
- [x] at least one test (maybe find a way to set things to load just one file for testing)