The "new" (Sept 2021) strategy for creating the tar.gz archive files of KGE file sets uses a Linux CLI script (kgea_archive.bash) that caches the KGX nodes and edges TSV files on the local disk just before running the tar program to build the archive. The disk therefore has to be large enough to hold the cached files.
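To make the disk-size requirement concrete, here is a minimal sketch of a pre-flight check that could run before the caching step. The function name `has_room_for_cache`, the `headroom` factor of 2.0 (assuming the cached TSV files and the tar output coexist on disk for a while) and the way file sizes are passed in are all assumptions for illustration; none of this is taken from kgea_archive.bash.

```python
import shutil


def has_room_for_cache(file_sizes_bytes, cache_dir, headroom=2.0):
    """Return True if the filesystem holding cache_dir can take the cached files.

    file_sizes_bytes: sizes (in bytes) of the nodes/edges TSV files to cache.
    headroom: hypothetical safety factor, since the cached files and the
              resulting archive may both sit on the disk at the same time.
    """
    required = int(sum(file_sizes_bytes) * headroom)
    free = shutil.disk_usage(cache_dir).free  # free bytes on that filesystem
    return free >= required
```

A caller could refuse to start the archiving job (or pick a different strategy) when this returns False, rather than letting tar fail part-way through with a full disk.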
A possible (perhaps necessary) KGEA system enhancement is to dynamically allocate a temporary EBS volume that is "large enough" for the operation (akin to provisioning temporary EC2 compute instances, but for storage only).
The alternatives are either to provision a large enough disk up front (probably costly and wasteful for most downloads) or to limit the archive sizes of the uploaded files (again, not very satisfactory).
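As a rough illustration of the dynamically allocated EBS option, the sketch below sizes a scratch volume from the input file sizes and requests it through an EC2 client. The helper names, the `headroom` factor and the choice of the `gp3` volume type are assumptions; the client is passed in so that the sketch does not depend on how KGEA obtains AWS credentials. Attaching the volume to the instance, creating a filesystem, mounting it and deleting it afterwards are deliberately left out.

```python
import math

GIB = 1024 ** 3  # EBS volume sizes are requested in GiB


def scratch_volume_size_gib(file_sizes_bytes, headroom=2.0, minimum_gib=1):
    """Round the space needed for caching plus archiving up to whole GiB."""
    required_bytes = sum(file_sizes_bytes) * headroom
    return max(minimum_gib, math.ceil(required_bytes / GIB))


def create_scratch_volume(ec2_client, file_sizes_bytes, availability_zone):
    """Request a temporary EBS volume and return its volume ID.

    ec2_client is expected to behave like boto3.client("ec2"); the volume
    must be created in the same availability zone as the instance that
    will attach it.
    """
    response = ec2_client.create_volume(
        AvailabilityZone=availability_zone,
        Size=scratch_volume_size_gib(file_sizes_bytes),
        VolumeType="gp3",  # assumption: general-purpose SSD is good enough
    )
    return response["VolumeId"]
```

With boto3 installed, this might be called as `create_scratch_volume(boto3.client("ec2"), sizes, "us-east-1a")`, where `sizes` and the zone name are placeholders. Sizing per job avoids paying for a large disk that most downloads would never use, at the cost of extra provisioning time and cleanup logic.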