Feature request: add a compress flag to compress auxiliary files / candidate files
Currently, running table linker over thousands of files generates a huge amount of data (e.g., ~80GB per 1000 tables in my case). However, it is possible (maybe even common) that users have datasets of tens of thousands of tables to link, and they may not have enough disk space to store the results.
One possible solution is to add a compress flag (--compress) to the table linker command, indicating that the input/output should be compressed. To support this feature, when compression is enabled we only need to change from open to gzip.open, and everything else stays the same.
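To illustrate, here is a minimal sketch of the swap. It assumes the files are read and written as UTF-8 text. The helper name open_text and its compress parameter are hypothetical and not part of table linker. One detail to watch: gzip.open defaults to binary mode, so text mode has to be requested explicitly ("rt" / "wt") for it to behave like the built-in open.

```python
import gzip


def open_text(path, mode="r", compress=False):
    """Open a text file, optionally gzip-compressed.

    Hypothetical helper: `mode` is expected to be "r", "w", or "a".
    `compress` would be set from the proposed --compress flag.
    """
    if compress:
        # gzip.open is binary by default, so append "t" for text mode.
        return gzip.open(path, mode + "t", encoding="utf-8")
    return open(path, mode, encoding="utf-8")
```

Call sites would then stay the same apart from passing the flag through, e.g. `with open_text(path, "w", compress=args.compress) as f:` (where `args.compress` is assumed to come from the command-line parser).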