JrtPec opened this issue 8 years ago
Tmpo blocks consist of gzipped JSON, so why not put the tmpo blocks directly on the wire and offload the CSV conversion work to the browser? With the proper `Content-Encoding` header set, the browser will take care of inflating the gzip.
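A minimal sketch of the idea, assuming a tmpo-like block (the payload shape here is hypothetical, not the real tmpo schema): the server stores gzipped JSON and can send it as-is, letting the browser inflate it.

```python
import gzip
import json

# Hypothetical tmpo-like block: gzipped JSON, as stored server-side.
payload = {"t": [1453100000, 1453100060], "v": [10, 20]}
block = gzip.compress(json.dumps(payload).encode("utf-8"))

# Served unchanged with headers like:
#   Content-Type: application/json
#   Content-Encoding: gzip
# the browser decompresses transparently. This line mimics what the
# browser does on receipt:
received = json.loads(gzip.decompress(block))
print(received == payload)
```

The server never touches the data; the cost is that the client only ever gets raw epoch/value pairs.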
We could do it that way, but then you could only download raw data, right? People would have to convert epoch timestamps, interpolate data, resample it... while the exact purpose of the CSV download page was to let non-programmers import data into Excel or something and experiment on their own. I don't know if raw data would be very useful for those people...
I'm going to try to write a generator that creates small dataframes and streams them, like this
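A rough sketch of what such a streaming generator could look like. `fetch_chunk` is a hypothetical stand-in for a tmpo query over a small time window; the real call would go through a tmpo session, and the chunk size is an arbitrary choice here.

```python
from datetime import datetime, timedelta

def fetch_chunk(start, end):
    # Hypothetical stand-in for a tmpo query: returns (timestamp, value)
    # rows for [start, end) at hourly resolution.
    rows = []
    t = start
    while t < end:
        rows.append((t, 1.0))
        t += timedelta(hours=1)
    return rows

def csv_stream(start, end, chunk=timedelta(days=1)):
    """Yield the CSV in small pieces so the full series never sits in memory."""
    yield "timestamp,value\n"
    t = start
    while t < end:
        stop = min(t + chunk, end)
        for ts, val in fetch_chunk(t, stop):
            yield f"{ts.isoformat()},{val}\n"
        t = stop
```

In Flask this generator could be wrapped in a streaming response, e.g. `Response(csv_stream(start, end), mimetype="text/csv")`, so each chunk is sent and discarded instead of accumulating server-side.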
@JrtPec we discussed this last meeting. What is the status now that our droplet has more memory and swap?
It seems to be much better, but I can still crash the site when selecting a large time period. We could put a cap on the time period, or figure out some clever way to call tmpo in chunks and stream the csv in blocks.
Yesterday we succeeded in generating CSVs from tmpo live on the website and sending them to the browser. However, we noticed that each request uses some memory and fails to free it afterwards. After a few requests the server inevitably crashes.
We have tried the following things to reduce the memory load and free it up after the request, but none have really worked:
- `app.use_x_sendfile = True`, to have nginx serve the file directly instead of the app (I did not thoroughly test this, so I'm not sure of its effect)
- `del df`
- `import gc; gc.collect()`

Does anybody have other ideas we could try? The download page is live, but hidden at opengrid.be/download. The status quo is that it does work, but after a few runs it crashes the server, which then immediately restarts.
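One way to check whether `del` and `gc.collect()` are actually releasing anything, assuming Python 3, is `tracemalloc`; the workload below is just a stand-in for building one CSV response in memory.

```python
import tracemalloc

# Trace Python-level allocations for this process.
tracemalloc.start()

# Hypothetical stand-in for the dataframe/CSV built during one request.
data = [b"x" * 1024 for _ in range(10_000)]
before, _ = tracemalloc.get_traced_memory()

del data  # the attempt from the list above
after, _ = tracemalloc.get_traced_memory()

# If `del` really frees the buffers, traced memory drops sharply here.
print(f"before del: {before} bytes, after del: {after} bytes")
```

Note that even when Python frees the objects, the allocator may not return pages to the OS, so the resident memory nginx/systemd sees can stay high; `tracemalloc` at least shows whether the leak is on the Python side.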