Open pxlxingliang opened 7 months ago
This is not related to the remote machine; it seems you do not have permission to run chown on the local machine.
Could you try adding the --no-perms flag to rsync?
I tried adding this flag, but it did not work:
^CTraceback (most recent call last):
File "/root/anaconda3/lib/python3.8/site-packages/dpdispatcher/submission.py", line 273, in try_download_result
self.download_jobs()
File "/root/anaconda3/lib/python3.8/site-packages/dpdispatcher/submission.py", line 501, in download_jobs
self.machine.context.download(self)
File "/root/anaconda3/lib/python3.8/site-packages/dpdispatcher/ssh_context.py", line 675, in download
self._get_files(
File "/root/anaconda3/lib/python3.8/site-packages/dpdispatcher/ssh_context.py", line 905, in _get_files
self.ssh_session.get(from_f, to_f)
File "/root/anaconda3/lib/python3.8/site-packages/dpdispatcher/ssh_context.py", line 376, in get
return rsync(
File "/root/anaconda3/lib/python3.8/site-packages/dpdispatcher/utils.py", line 137, in rsync
raise RuntimeError(f"Failed to run {cmd}: {err}")
RuntimeError: Failed to run ['rsync', '-az', '--no-perms', '-e', 'ssh -o ConnectTimeout=10 -o BatchMode=yes -o StrictHostKeyChecking=no -p 65023 -q -i sugon', '-q', 'abacus@cancon.hpccube.com:/public/home/abacus/tmp/013b6a211b33560666b55f011a60f9771da63b60/013b6a211b33560666b55f011a60f9771da63b60.tar.gz', '/personal/test/init_and_run2/Al.STRU.02x01x01/00.place_ele/013b6a211b33560666b55f011a60f9771da63b60.tar.gz']: b'rsync: chown "/personal/test/init_and_run2/Al.STRU.02x01x01/00.place_ele/.013b6a211b33560666b55f011a60f9771da63b60.tar.gz.JIoelN" failed: Operation not permitted (1)\nrsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1677) [generator=3.1.3]\n'
This issue may be related to the directory permissions of "/personal" on Bohrium. When I run this test in other paths, it works.
Try --no-o. I guess --no-g may also be required. Below is the explanation.
-r, --recursive recurse into directories
-l, --links copy symlinks as symlinks
-p, --perms preserve permissions
-t, --times preserve modification times
-o, --owner preserve owner (super-user only)
-g, --group preserve group
-D same as --devices --specials
--devices preserve device files (super-user only)
--specials preserve special files
-a is equivalent to -rltpgoD
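Since -a implies -o and -g, rsync tries to chown the received file, which is why --no-perms alone (it only disables -p) did not help. A minimal sketch of how the command could be assembled with ownership preservation switched off is shown below. The helper name build_rsync_cmd and its parameters are hypothetical and are not part of dpdispatcher's API; the rsync flags themselves (--no-owner, --no-group) are real. The --no-* options are placed after -az because rsync processes options in order, and a later -a would turn the implied options back on.

```python
import shlex
from typing import List, Optional


def build_rsync_cmd(
    src: str,
    dst: str,
    port: int = 22,
    key_filename: Optional[str] = None,
    preserve_ownership: bool = False,
) -> List[str]:
    """Build an rsync argv list (hypothetical helper, for illustration only)."""
    # Remote shell command, mirroring the options seen in the traceback.
    ssh_cmd = ["ssh", "-o", "ConnectTimeout=10", "-o", "BatchMode=yes",
               "-p", str(port), "-q"]
    if key_filename is not None:
        ssh_cmd += ["-i", key_filename]

    cmd = ["rsync", "-az"]
    if not preserve_ownership:
        # -a implies -o and -g; turn both off so rsync does not call chown
        # on the local copy. These must come after -a to take effect.
        cmd += ["--no-owner", "--no-group"]
    cmd += ["-e", " ".join(shlex.quote(part) for part in ssh_cmd), "-q", src, dst]
    return cmd


if __name__ == "__main__":
    # Placeholder source and destination, not the paths from the report.
    print(build_rsync_cmd("user@host:/remote/file.tar.gz", "/local/file.tar.gz",
                          port=65023, key_filename="sugon"))
```

Passing the resulting list to subprocess.run would then download the file without attempting to preserve the remote owner or group on the local side.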
I transferred the issue to dpdispatcher, as it is more relevant there.
Bug summary
I use dpgen to submit a job that runs the fp step on the Sugon platform. The fp is like:
The fp job is submitted to Sugon and runs ABACUS successfully, but dpgen throws the warning below when it fetches the returned results:
It seems that rsync tries to run chown, but it fails.
DP-GEN Version
0.11.1.dev51+gbea559b
Platform, Python Version, Remote Platform, etc
Platform: bohrium
Python: 3.8.8
Remote Platform: Sugon
Input Files, Running Commands, Error Log, etc.
dpgen.zip. An extra Sugon secret file named "sugon" is also needed. Command:
dpgen init_bulk init.json machine.json
Steps to Reproduce
Further Information, Files, and Links
No response