ricklamers / gridstudio

Grid studio is a web-based application for data science with full integration of open source data science frameworks and languages.
GNU Affero General Public License v3.0
8.88k stars · 1.5k forks

What is the size limit for opening a CSV file? #37

Closed RyuAsuka closed 5 years ago

RyuAsuka commented 5 years ago

My CSV file contains about 700,000 rows x 61 columns (57 MB). It cannot be opened in Grid studio, and my computer (8 GB RAM, running Ubuntu 18.04) even stopped responding. Can this software open large files? The same file can be imported directly with Pandas in Python.

ricklamers commented 5 years ago

Hi RyuAsuka, at the moment this would entail creating approx. 42.7M cells (700,000 rows x 61 columns), each a struct in Grid studio's Go back-end. I don't think the back-end is optimized enough yet to handle that many cells. In the meantime, I would recommend reading the CSV with Pandas in Python and doing your operations on the dataframe. You could still use the sheet for part of the data or for summaries of it.
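A minimal sketch of that workaround, assuming the file is read with `pandas.read_csv` and only a small preview plus a summary table is kept for the sheet. The helper name `load_preview_and_summary`, the `preview_rows` parameter, and the tiny in-memory CSV are illustrative assumptions, not part of Grid studio. How the resulting frames are written into the sheet depends on Grid studio's own Python integration and is not shown here.

```python
import io

import pandas as pd


def load_preview_and_summary(source, preview_rows=1000):
    """Read a CSV with pandas and return (preview, summary).

    Hypothetical helper: the full dataframe stays in Python, and only
    the small preview and summary frames are meant for the sheet.
    """
    df = pd.read_csv(source)
    preview = df.head(preview_rows)  # first rows only
    summary = df.describe()          # count, mean, std, ... for numeric columns
    return preview, summary


# Tiny in-memory stand-in for the real 700,000-row file.
demo_csv = io.StringIO("a,b\n1,10\n2,20\n3,30\n4,40\n")
preview, summary = load_preview_and_summary(demo_csv, preview_rows=2)

print(preview.shape)
print(summary.loc["mean"])
```

For the real file, `source` would be the path to the CSV. The preview and summary together contain far fewer cells than the full table, which is the point of keeping the heavy work in the dataframe.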

Perhaps in the future we could optimize the back-end further to support CSV files this large. At this point, I think even Excel (which is highly optimized) would have a hard time with your CSV.