Open datatraveller1 opened 1 year ago

version: csvdiff version 1.4.0, running on MS Windows 11
command: csvdiff file1.csv file2.csv -p0,6,1,13,18 > csvdiff_result.csv

file1.csv: 8.0 GB, ~35,000,000 rows, 22 columns
file2.csv: 9.5 GB, ~41,000,000 rows, 22 columns

After a few minutes the program crashes with: fatal error: runtime: cannot allocate memory

I assume these files are too big for csvdiff?

---

I'm not an expert on the project, but I believe the data is loaded into memory (RAM), so given enough RAM this can scale further. I have tested this on one of my projects with over 200M rows, and the command needed around 43 GB of RAM to succeed. I tested several scenarios of file changes, with 10% of the file changed vs. 50% changed.
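For context, `fatal error: runtime: cannot allocate memory` is the Go runtime failing an allocation, which fits the in-memory explanation above: csvdiff appears to hold the row data for both files in RAM while diffing. A back-of-the-envelope scaling of the commenter's figure (43 GB for 200M+ rows, i.e. very roughly 200 bytes per row) to the ~76M combined rows reported here suggests on the order of 16 GB plus overhead would be needed, assuming comparable row widths — more than many desktop machines have free.

If adding RAM isn't an option, a lower-memory diff is possible by keeping only a 64-bit hash per base row instead of the rows themselves. The sketch below is not csvdiff's actual implementation — just a minimal illustration of the idea in Go (csvdiff's language). The file names and the key columns 0,6,1,13,18 are taken from the command in the report, and deletions are omitted for brevity:

```go
// Sketch: two-pass, hash-based diff keyed on primary-key columns.
// Pass 1 stores a 64-bit FNV hash of every row in the base file,
// keyed by the -p columns; pass 2 streams the delta file and reports
// rows whose key is new (ADDED) or whose hash changed (MODIFIED).
package main

import (
	"encoding/csv"
	"fmt"
	"hash/fnv"
	"io"
	"log"
	"os"
	"strings"
)

// Primary-key columns from the reported command (-p0,6,1,13,18).
var keyCols = []int{0, 6, 1, 13, 18}

// rowKey joins the primary-key fields with an unprintable separator.
func rowKey(rec []string) string {
	parts := make([]string, len(keyCols))
	for i, c := range keyCols {
		parts[i] = rec[c]
	}
	return strings.Join(parts, "\x1f")
}

// rowHash hashes the whole record so changed rows can be detected
// without storing their contents.
func rowHash(rec []string) uint64 {
	h := fnv.New64a()
	for _, f := range rec {
		h.Write([]byte(f))
		h.Write([]byte{0x1f}) // field separator, avoids ambiguity
	}
	return h.Sum64()
}

// scan streams a CSV file and calls fn for every record.
func scan(path string, fn func([]string)) {
	f, err := os.Open(path)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	r := csv.NewReader(f)
	for {
		rec, err := r.Read()
		if err == io.EOF {
			return
		}
		if err != nil {
			log.Fatal(err)
		}
		fn(rec)
	}
}

func main() {
	// Pass 1: keep only the key string plus an 8-byte hash per base
	// row, instead of the full 8 GB file.
	base := make(map[string]uint64)
	scan("file1.csv", func(rec []string) {
		base[rowKey(rec)] = rowHash(rec)
	})

	// Pass 2: stream the delta file; nothing from it is retained.
	scan("file2.csv", func(rec []string) {
		old, seen := base[rowKey(rec)]
		switch {
		case !seen:
			fmt.Println("ADDED:", strings.Join(rec, ","))
		case old != rowHash(rec):
			fmt.Println("MODIFIED:", strings.Join(rec, ","))
		}
		// Deletions are omitted for brevity: they would require
		// marking keys as seen here and reporting the leftovers.
	})
}
```

With ~35M base rows this keeps roughly one short string and 8 bytes per row in memory — a few GB rather than tens — at the cost of not being able to print the old version of a modified row, since only its hash survives.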