Open bkmgit opened 2 years ago
Do you have any idea?
What should we care about here? Local storage size? Download time? ...?
wget -c is helpful for continuing an incomplete download without starting again. If the data is stored in the cloud in a suitable form, one can stream just the portion of interest, but this requires infrastructure that allows it and is perhaps a step for the future. At present, we want to consider datasets up to 100 GB, which can be analyzed on a workstation.
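For anyone who wants the same resume behaviour from a script, here is a rough sketch of what `wget -c` does at the HTTP level: ask the server only for the bytes that are still missing, using a `Range` header. The helper names `range_header` and `resume_download` are made up for illustration, and the sketch assumes the server supports range requests, which not every host does. The same header with an explicit end offset is one way to fetch only a portion of a remote file.

```python
import os
import urllib.error
import urllib.request


def range_header(offset, end=None):
    """Build an HTTP Range header for bytes offset..end (inclusive).

    With end=None the range is open-ended, i.e. "everything from offset on".
    """
    if end is None:
        return {"Range": f"bytes={offset}-"}
    return {"Range": f"bytes={offset}-{end}"}


def resume_download(url, dest, chunk_size=1 << 20):
    """Fetch the missing tail of `url` into `dest` (hypothetical helper).

    Returns the size of `dest` in bytes after the transfer.
    """
    offset = os.path.getsize(dest) if os.path.exists(dest) else 0
    headers = range_header(offset) if offset else {}
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request) as response:
            # 206 means the server honoured the range, so append.
            # Any other success status means it sent the whole file, so overwrite.
            mode = "ab" if response.status == 206 else "wb"
            with open(dest, mode) as out:
                while True:
                    chunk = response.read(chunk_size)
                    if not chunk:
                        break
                    out.write(chunk)
    except urllib.error.HTTPError as err:
        # 416 (Range Not Satisfiable) usually means the local file is already complete.
        if err.code != 416:
            raise
    return os.path.getsize(dest)
```

Calling `resume_download(url, "dataset.tar")` again after an interrupted transfer should pick up where the partial file ends, provided the remote file has not changed in the meantime. The sketch does not verify checksums, so a comparison against a published hash would still be worthwhile for large datasets.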
It may be good to have a different way to work with large datasets. For example, the https://ldbcouncil.org/benchmarks/graphalytics/ data sets are 1.1 TB in total.