RSE-Cambridge / data-acc

Data Accelerator: Creates a burst buffer from generic hardware and integrates it with Slurm https://www.hpc.cam.ac.uk/research/data-acc http://www.stackhpc.com
https://rse-cambridge.github.io/data-acc
Apache License 2.0

Cannot stage data in to/out of private buffers #110

Closed by jsteel44 4 years ago

jsteel44 commented 5 years ago

On the DAC nodes, the /mnt/dac/$JOBID_job_private symlink does not get created. I guess this is because it links to a compute hostname.

I guess one would want to copy to private buffers to give every compute node the same data, but each with its own copy. I would expect stage-in to copy to /mnt/dac/$JOBID_job/private/*/, and stage-out to create a directory tree. For example, copying out $DW_JOB_PRIVATE/file would produce:

private/node1/file
private/node2/file

But how feasible is this? Or do we simply not support copying into and out of private buffers?

jsteel44 commented 5 years ago

I guess one way to do it would be to stage into $DW_JOB_STRIPED, and then in the job do an srun cp $DW_JOB_STRIPED/file $DW_JOB_PRIVATE/file so that each compute node does the copy. To get the data back out, do an srun mkdir -p $DW_JOB_STRIPED/$HOSTNAME && srun cp $DW_JOB_PRIVATE/file $DW_JOB_STRIPED/$HOSTNAME/file, and then stage out from $DW_JOB_STRIPED. It seems a bit of a roundabout way to do it, but that should work.
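The workaround above could be wrapped up roughly as follows. This is only a sketch: the helper names `private_fan_out` and `private_gather` and the `LAUNCHER` variable are hypothetical and not part of data-acc, and `srun --ntasks-per-node=1` is an assumption about how to get one copy per compute node. One detail worth noting is that an unquoted `$HOSTNAME` in a batch script is expanded by the batch shell before `srun` runs, so the sketch resolves the node name inside the per-node command instead.

```shell
#!/bin/sh
# Hypothetical helpers for the stage-via-striped workaround (not part of data-acc).
# LAUNCHER runs a command once per compute node; this default is an assumption.
# Set LAUNCHER=env to dry-run on a single machine without Slurm.
LAUNCHER="${LAUNCHER:-srun --ntasks-per-node=1}"

# Copy a file already staged into the striped buffer
# into the private buffer of every compute node.
private_fan_out() {
    # Single quotes defer variable expansion until the command runs on each node.
    $LAUNCHER sh -c 'cp "$DW_JOB_STRIPED/$1" "$DW_JOB_PRIVATE/$1"' _ "$1"
}

# Copy a file from each node's private buffer into a per-host
# directory in the striped buffer, ready to be staged out.
private_gather() {
    $LAUNCHER sh -c '
        dest="$DW_JOB_STRIPED/$(uname -n)"   # node name resolved on the compute node
        mkdir -p "$dest" && cp "$DW_JOB_PRIVATE/$1" "$dest/$1"
    ' _ "$1"
}
```

In a job script one would call `private_fan_out file` after stage-in and `private_gather file` before stage-out from $DW_JOB_STRIPED, assuming both buffer variables are exported in the job environment.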

jsteel44 commented 4 years ago

This isn't supported and I'm writing a section in the documentation to cover this.