We need to test and benchmark various IPFS functions against real-world, file-system-based package repositories, which can reach significant sizes (greater than 1TB in some cases).
We could simply rsync copies of these repositories down from nearby mirrors on every run, but that could add up to 3 hours to a test run and would cost extra bandwidth and put load on community-run mirrors.
Instead, I'd like to propose that we set up a system that lets us download and keep a copy of each of our test-case repositories, and mount them as a file system within the server (or container) that is actually running the test/benchmark.
If we're using AWS, that could mean mounting an EBS volume or an EFS file system. If we're using something else, we can do regular network mounts; as long as the boxes are in the same data center, I don't think it'll cause bottlenecks.
I'd guess we'd want to keep the repositories read-only and not keep them up to date with the upstream repositories, so that we can repeat tests with exactly the same contents each time.
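One way to check that a stored copy really does stay identical between runs would be to record a digest of the whole tree once after download and compare it later. Here's a minimal sketch of that idea in Python; the function name `snapshot_digest` and the choice of SHA-256 are my own assumptions, not anything we have today.

```python
import hashlib
import os


def snapshot_digest(root: str) -> str:
    """Return a SHA-256 hex digest over the relative paths and contents of files under root.

    Hypothetical helper for illustration only. Symlinks to files are hashed
    by their link target rather than followed; symlinked directories are not
    descended into (os.walk's default) and are not included in the digest.
    """
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting in place makes os.walk visit subdirectories in a stable order.
        dirnames.sort()
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            digest.update(rel.encode("utf-8", "surrogateescape") + b"\0")
            if os.path.islink(path):
                # Hash where the link points instead of opening it.
                target = os.readlink(path)
                digest.update(b"link:" + target.encode("utf-8", "surrogateescape"))
            else:
                with open(path, "rb") as fh:
                    # Read in 1 MiB chunks so large files never sit fully in memory.
                    for chunk in iter(lambda: fh.read(1 << 20), b""):
                        digest.update(chunk)
            digest.update(b"\0")
    return digest.hexdigest()
```

Hashing a 1TB+ tree is slow, so this is something we'd probably run once when the copy is created and then only occasionally afterwards, not before every test run.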
Alongside this, having some way of easily running a test/benchmark on AWS with a given repository mounted in it would mean that developers can push a branch to GitHub and kick off a test run without needing to download and keep copies of these huge directories themselves.
cc @ipfs/wg-infrastructure