antonycourtney / tad

A desktop application for viewing and analyzing tabular data
http://tadviewer.com
MIT License

Specify what "large files" means #156

Open buhtz opened 3 years ago

buhtz commented 3 years ago

You support "large files".

Can you please give details about the technical limits in terms of rows, columns, and file size on disk?

A real-world example from my workflow: 150,000,000 (in words: one hundred and fifty million) rows with 32 columns and ~15 GB uncompressed file size on disk.

Of course I "work" with that data primarily via a script (Python/Pandas). But sometimes I just need to look at the table with my own eyes.
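
For illustration, a minimal sketch of what "looking at" such a file from a script tends to mean (the path, column count, and chunk size are placeholders, not my actual data):

```python
# Peek at a huge CSV without loading all ~150M rows into memory.
import pandas as pd

CSV_PATH = "big_table.csv"  # hypothetical path; replace with the real file

# Read only the first 1,000 rows to inspect the schema and sample values.
head = pd.read_csv(CSV_PATH, nrows=1000)
print(head.dtypes)
print(head.head(20))

# Or stream the file in chunks to compute a quick summary without
# materializing the full ~15 GB in RAM.
row_count = 0
for chunk in pd.read_csv(CSV_PATH, chunksize=1_000_000):
    row_count += len(chunk)
print(f"total rows: {row_count}")
```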

riekusr commented 1 year ago

Did you try?

buhtz commented 1 year ago

No, I didn't.

NAjustin commented 1 year ago

I know this issue is quite old, but I have tried and can throw in some details.

In my experience the UI fails completely with CSVs over ~100 MB (sometimes smaller; I'm not sure what the factors are, whether it's file size, value complexity, number of columns, or some combination of all three). The interface just remains blank: the file shows in the list, but nothing else ever loads on either screen. I'm on Windows, for what it's worth.

Like @buhtz, we have many CSVs in our workflow that run into multi-GB territory. Interestingly, I can open many of these same files in DuckDB, but in Tad they just result in the blank interface with no error.
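
To be concrete about the DuckDB comparison, something like the following is all it takes on my side (the path is a placeholder); DuckDB scans the CSV lazily, so multi-GB files preview fine:

```python
# Quick sanity check that DuckDB can read and summarize the same CSV
# that leaves Tad blank.
import duckdb

CSV_PATH = "big_table.csv"  # hypothetical path

# Row count and a 20-row preview, computed without loading the whole file.
print(duckdb.sql(f"SELECT count(*) FROM read_csv_auto('{CSV_PATH}')"))
print(duckdb.sql(f"SELECT * FROM read_csv_auto('{CSV_PATH}') LIMIT 20"))
```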

I agree that it would be very helpful to clarify the limits wherever the "large files" claim is made, but also to figure out whether a different loading strategy could make this work within Tad. Being able to quickly view and summarize files like these would be incredibly valuable as part of an analysis/data engineering workflow.
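
In the meantime, this is not Tad's own loading strategy, just a workaround sketch: use DuckDB to sample (or pre-aggregate) the huge CSV into a much smaller file first, then open that smaller file in Tad. Paths and the 1% sample rate are placeholders.

```python
# Downsample a multi-GB CSV into something an interactive viewer can handle.
import duckdb

SRC = "big_table.csv"          # hypothetical multi-GB input
DST = "big_table_sample.csv"   # smaller output for interactive viewing

duckdb.sql(f"""
    COPY (
        SELECT * FROM read_csv_auto('{SRC}') USING SAMPLE 1 PERCENT (bernoulli)
    ) TO '{DST}' (HEADER, DELIMITER ',')
""")
```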