wids-eria / ADAGE

Assessment Data Aggregator for Gaming Environments

Provide data export for large data sets #2

Open allisonsalmon opened 10 years ago

allisonsalmon commented 10 years ago

Currently, exporting very large data sets times out. Consider other export options, such as breaking the data set into manageable chunks. We should also optimize the queries and the CSV export code path.
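As a rough illustration of the chunking idea, here is a minimal sketch in plain Ruby. It is not ADAGE's actual export code: `export_in_chunks`, `fake_cursor`, and the `id`/`event` columns are hypothetical names, and the enumerator only stands in for a database cursor. The point is that rows are written batch by batch, so only one batch needs to be in memory at a time.

```ruby
require "csv"
require "stringio"

# Hypothetical helper (not from the ADAGE codebase): writes records to `io`
# as CSV, pulling them from the source one batch at a time.
# Returns the number of batches written.
def export_in_chunks(records, columns, io, batch_size: 1000)
  io << CSV.generate_line(columns) # header row
  batches = 0
  records.each_slice(batch_size) do |batch|
    batch.each do |record|
      io << CSV.generate_line(columns.map { |column| record[column] })
    end
    batches += 1
  end
  batches
end

# Stand-in for a database cursor: rows are produced lazily, on demand.
fake_cursor = Enumerator.new do |yielder|
  2500.times { |i| yielder << { "id" => i, "event" => "click" } }
end

out = StringIO.new
batches = export_in_chunks(fake_cursor, %w[id event], out, batch_size: 1000)
puts "wrote #{out.string.lines.size} lines in #{batches} batches"
```

In a real export, `io` could be a file or a response stream rather than a `StringIO`, and the batch size would need tuning against the actual data.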

briandk commented 10 years ago

The Short Version

@mberland and I thought the timeout might be because the server didn't have enough RAM. Now @igoodin and I think what's happening is just that mongo can't prepare the data fast enough for Rails.

Longer Version

@mberland and I thought the slowness/timeout issue might be because the server didn't have enough RAM. If the dataset being exported exceeds the available RAM, we'd expect pageouts and eventually timeouts.

But I talked to @igoodin, who has been profiling the server's performance. We're now fairly confident it's not a RAM issue, because requests haven't been pushing the machine to its free memory ceiling.

@igoodin thinks it's more likely an issue with our mongo data. AFAIK, because the kodu implementation captured EVERYTHING, mongo has to do a great deal of work just to prepare the data for export. That's probably what's causing the timeout issue, but we might be able to attack it by changing what happens when the queries are handled in Rails.

allisonsalmon commented 10 years ago

Great! Thanks for following up on that. I agree with @igoodin. I think a large part of the problem is probably in the code that converts the data to CSV for export. A good next step would be to check whether this same data set also times out when retrieved as JSON rather than CSV.
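One way to run that comparison outside the web request is to time both serializations on the same rows. This is only a sketch under assumptions: the rows below are made up, `timed` is a hypothetical helper, and it measures serialization alone, not the query time or the Rails request cycle.

```ruby
require "csv"
require "json"

# Hypothetical helper: runs the block and returns [result, elapsed_seconds].
def timed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end

# Made-up rows standing in for the exported records.
rows = Array.new(5_000) { |i| { "id" => i, "event" => "click", "value" => i * 2 } }
columns = rows.first.keys

csv_text, csv_seconds = timed do
  CSV.generate do |csv|
    csv << columns
    rows.each { |row| csv << row.values_at(*columns) }
  end
end

json_text, json_seconds = timed { JSON.generate(rows) }

puts format("CSV:  %.4fs (%d bytes)", csv_seconds, csv_text.bytesize)
puts format("JSON: %.4fs (%d bytes)", json_seconds, json_text.bytesize)
```

If the two timings are close on the real data, the CSV conversion is probably not the main bottleneck and the query side deserves the attention instead.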

allisonsalmon commented 10 years ago

After further investigation we will likely also want to use Unicorn and streaming to tackle this issue.
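To make the streaming idea concrete, here is a minimal sketch, again in plain Ruby rather than ADAGE's real code. `csv_line_stream` and the sample records are hypothetical. It builds an object that yields one CSV line at a time through `each`, which is the interface a Rack response body has to provide, so a server can start sending rows before the whole export has been generated. How this would be wired into the Rails controller and Unicorn is left as an assumption here.

```ruby
require "csv"

# Hypothetical helper: wraps records in an Enumerator that yields one CSV
# line at a time instead of building the whole file as a single string.
def csv_line_stream(records, columns)
  Enumerator.new do |yielder|
    yielder << CSV.generate_line(columns) # header row
    records.each do |record|
      yielder << CSV.generate_line(columns.map { |column| record[column] })
    end
  end
end

# Made-up sample records; in practice these would come from a query cursor.
records = [
  { "id" => 1, "event" => "start" },
  { "id" => 2, "event" => "stop" }
]

stream = csv_line_stream(records, %w[id event])

# A consumer pulls lines as they are produced.
stream.each { |line| print line }
```

Nothing is generated until a consumer starts iterating, which is what keeps memory use flat for large exports.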