In the latest version of DeepVariant (0.8.0), the gcsfuse binary has been removed from the Docker image. To run make_examples on multi-core machines, we still need to start multiple gcsfuse processes (one per core).
We achieve this by launching PAPI with a JSON file that contains an action list in the following order:
mkdir local directories for gcsfuse
gcsfuse process 1
gcsfuse process 2
...
gcsfuse process n
seq 1 n | parallel make_examples
This is essentially equivalent to what we were doing with DeepVariant 0.7.2.
I ran several profiling experiments comparing this new gcsfuse setup with our current one (in which the gcsfuse binary is included in the DeepVariant Docker image), and the performance is almost identical.
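For illustration, the ordered action list described above could be generated roughly as follows. This is a minimal sketch, not the actual code in this change: the function name build_actions, the mount root /mnt/gcsfuse, the image names and the bucket name are all hypothetical placeholders. The field names ("imageUri", "commands", "flags") and the flag values reflect my understanding of the Pipelines API v2alpha1 action schema and should be checked against its documentation. Shared mounts and the real make_examples arguments are left out.

```python
import json


def build_actions(num_cores, bucket, mount_root="/mnt/gcsfuse"):
    """Return the ordered action list: mkdir, one gcsfuse per core, make_examples.

    All image names and paths are hypothetical placeholders.
    """
    # One local mount directory per core: <mount_root>/1 ... <mount_root>/n
    mount_dirs = [f"{mount_root}/{i}" for i in range(1, num_cores + 1)]

    # Step 1: create every local directory in a single action.
    actions = [{
        "name": "mkdir-gcsfuse-dirs",
        "imageUri": "PLACEHOLDER_UTILITY_IMAGE",
        "commands": ["mkdir", "-p"] + mount_dirs,
    }]

    # Step 2: one background gcsfuse process per core, each on its own directory.
    for i, mount_dir in enumerate(mount_dirs, start=1):
        actions.append({
            "name": f"gcsfuse-{i}",
            "imageUri": "PLACEHOLDER_GCSFUSE_IMAGE",
            "commands": ["gcsfuse", "--foreground", bucket, mount_dir],
            # Assumed flag names; verify against the Pipelines API reference.
            "flags": ["RUN_IN_BACKGROUND", "ENABLE_FUSE"],
        })

    # Step 3: fan out make_examples across the cores (real arguments omitted).
    actions.append({
        "name": "make-examples",
        "imageUri": "PLACEHOLDER_DEEPVARIANT_IMAGE",
        "commands": ["bash", "-c",
                     f"seq 1 {num_cores} | parallel make_examples"],
    })
    return actions


if __name__ == "__main__":
    print(json.dumps(build_actions(4, "PLACEHOLDER_BUCKET"), indent=2))
```

Generating the list from the core count keeps the number of gcsfuse actions in step with the machine type, so the same code covers any n.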