Using s3fs to mount AWS buckets is convenient because it automatically syncs local changes back to the bucket. However, because these changes happen synchronously, the user experience is very poor for many common tasks. Practically, this means:
Folders that are synced to buckets can't be used like local directories; they should only be used in applications where the benefits outweigh the overhead costs.
We may need to investigate other ways of providing persistent storage.
Examples of problematic tasks:
(Note that all times given below come from rough tests in a local development environment on a network with a 250 Mbps connection up and down. I still need to verify that running in cloudsim gives similar performance.)
Recursive query commands like ls -R or find are very slow because s3fs only requests file info when it is needed, so queries go all the way out to the bucket, one at a time.
Recursive operations like rm -rf are even worse, since they require querying the bucket and then writing back to it.
Querying the status of a git repository is very onerous because of the number of files that need to be retrieved and updated.
Running git status in an unmodified clone of vrx took
Listing the clone's contents with ls -R took 44s, and deleting the clone took 3m48s.
Running ls -R for the first time in a workspace in which the VORC environment had been built took over 10 minutes. Running it a second time right away took 86 seconds, so there is a considerable benefit from caching, but not enough to provide a good user experience.
Also, even after listing the whole directory, rebuilding the (unmodified) project with catkin_make took over 5 hours and then failed.
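To make timings like the ones above easier to repeat (for example, when checking whether cloudsim behaves the same way), here is a minimal Python sketch that walks a directory tree and stats every entry, roughly what ls -R or find would ask of the mount. The function names and the idea of running the walk twice to compare a cold and a warm pass are my own assumptions for illustration; this is not how the numbers above were measured.

```python
import os
import sys
import time


def time_recursive_listing(root):
    """Walk `root` recursively, stat every entry, and time the whole pass.

    Returns (entry_count, elapsed_seconds). The lstat call forces a
    metadata lookup per entry, which on an s3fs mount may mean a
    round trip to the bucket.
    """
    start = time.perf_counter()
    count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            os.lstat(os.path.join(dirpath, name))
            count += 1
    return count, time.perf_counter() - start


def compare_cold_and_warm(root):
    """Run the walk twice in a row to see how much caching helps."""
    entries, cold_s = time_recursive_listing(root)
    _, warm_s = time_recursive_listing(root)
    return {"entries": entries, "cold_s": cold_s, "warm_s": warm_s}


if __name__ == "__main__":
    # Usage: python time_listing.py /path/to/mounted/folder
    result = compare_cold_and_warm(sys.argv[1])
    print("entries: {entries}  first pass: {cold_s:.1f}s  "
          "second pass: {warm_s:.1f}s".format(**result))
```

Running the same script against a local copy of the directory gives a baseline to compare the mounted folder against.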
General comments
Directories that store lots of little files, such as those produced by cmake build commands or those used by git repositories, are especially difficult to work with.
This means the normal ROS development pattern is a bad match.
It does seem like this could be useful for storing larger, non-ephemeral files, like bag files, screenshots, videos or other media files.
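One rough way to judge which side of this divide a directory falls on is to profile it before putting it in a synced folder. The sketch below counts files and reports what fraction are small. The 64 KiB cutoff and the function name are assumptions chosen for illustration, not values derived from the tests above, and it is best run on a local copy rather than on the mount itself.

```python
import os

# Assumed cutoff for a "little" file; adjust to taste.
SMALL_FILE_BYTES = 64 * 1024


def profile_directory(root, small_file_bytes=SMALL_FILE_BYTES):
    """Summarise how many files are under `root` and how many are small."""
    files = small = total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                size = os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
            files += 1
            total_bytes += size
            if size < small_file_bytes:
                small += 1
    return {
        "files": files,
        "small_files": small,
        "total_bytes": total_bytes,
        "small_fraction": small / files if files else 0.0,
    }
```

A build tree or a git clone would be expected to show a very high file count with a small fraction close to 1, while a folder of bag files or videos should show few files and a large total size.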