daimh / sge

Some Grid Engine/Son of Grid Engine/Sun Grid Engine

The issue about file cannot be found #11

Closed honghh2018 closed 2 years ago

honghh2018 commented 2 years ago

Hi developer, thanks for this great job-management tool. I added a new node to the SGE cluster and tested it by submitting a job to it: echo "sleep 120" | qsub -cwd -V -q all.q@studynode.local. I confirmed that NFS works and that the path is visible on the studynode.local host, but the job ends up in the Eqw state with this error: error: can't open output file "/share/disk4/Data/Users/xiecs/standard/2022-01-26/Analysis/STDIN.o12273": Permission denied

Strangely, a user whose home directory is under /share/nas1 can submit jobs to all.q@studynode.local without problems, while a user whose home directory is under /share/disk1 fails with the error reported in the job details. Could this be related to the home directory?

Detail information:

job_number: 12273
exec_file: job_scripts/12273
submission_time: Thu Jan 27 09:34:30 2022
owner: xiecs
uid: 1007
group: xiecs
gid: 1010
sge_o_home: /share/disk1/Data/Users/xiecs
sge_o_log_name: xiecs
sge_o_path: /home/honghh/perl5/bin:/opt/sysoft/jdk-14/bin:/usr/local/python3/bin/:/home/honghh/miniconda3/bin//share/nas1/Data/software/gcc-7.3.0_compile/bin:/opt/biosoft/cdhit/cdhit-4.8.1:/opt/biosoft/GenomeUtility/bin/:/opt/sysoft/node-v10.13.0-linux-x64/bin/:/opt/biosoft/T_BCR/mixcr-3.0.13/:/share/nas1/Data/Users/honghh/Software/kofamscan/bin/ruby/bin:/opt/sysoft/jdk-14/bin:/opt/biosoft/bedtools2.29.2/bin:/opt/biosoft/cellranger-atac/:/home/honghh/.aspera/connect/bin/:/share/nas1/Data/software/mini_conda/Miniconda3/envs/Deeptools/bin/link_bin:/opt/sysoft/Python-3.7.0/bin:/opt/biosoft/ceftools/:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.252.b09-2.el7_8.x86_64/bin:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.252.b09-2.el7_8.x86_64/jre/bin:/opt/sysoft/sge/bin:/opt/sysoft/sge/bin/lx-amd64:/usr/local/rvm/gems/ruby-2.7.1/bin:/usr/local/rvm/gems/ruby-2.7.1@global/bin:/usr/local/rvm/rubies/ruby-2.7.1/bin:/usr/lib64/qt-3.3/bin:/usr/local/mysql/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/local/rvm/bin:/opt/biosoft/bowtie:/opt/biosoft/cellranger-expression:/root/software/gdal-3.0.2/apps:/opt/biosoft/hisat2-2.1.0:/opt/biosoft/STAR-2.7.3/bin/Linux_x86_64:/opt/biosoft/bowtie2/:/opt/biosoft/ncbi-blast2.9.0/bin:/opt/biosoft/FastQC:/opt/biosoft/sratoolkit.2.9.6-1/bin:/opt/biosoft/spaceranger-1.0.0/bin:/opt/biosoft/cellranger-dna-1.1.0:/usr/lib/rstudio-server/bin/pandoc/:/opt/biosoft/Diamond:/opt/biosoft/OSS:/opt/biosoft/plink_x86_64_20200616/bin/:/opt/biosoft/brew/bin:/share/nas1/Data/software/snpsea/snpsea/bin:/share/nas1/Data/software/git_2.32.0/bin:/home/honghh/Software/samtools/bin:/share/disk1/Data/Users/xiecs/.local/bin:/share/disk1/Data/Users/xiecs/bin
sge_o_shell: /bin/bash
sge_o_workdir: /share/disk4/Data/Users/xiecs/standard/2022-01-26/Analysis
sge_o_host: master
account: sge
cwd: /share/disk4/Data/Users/xiecs/standard/2022-01-26/Analysis
mail_list: xiecs@master.local
notify: FALSE
job_name: STDIN
jobshare: 0
hard_queue_list: all.q@studynode.local
env_list: TERM=xterm,XDG_SESSION_ID=2028,rvm_bin_path=/usr/local/rvm/bin,HOSTNAME=Master,GEM_HOME=/usr/local/rvm/gems/ruby-2.7.1,SHELL=/bin/bash,HISTSIZE=1000,IRBRC=/usr/local/rvm/rubies/ruby-2.7.1/.irbrc,SSH_CLIENT=192.168.3.104 59025 22,PERL5LIB=/home/honghh/miniconda3/envs/cellassign/lib/site_perl/5.26.2/:/home/honghh/perl5/lib/perl5,SGE_CELL=default1,SGE_ARCH=lx-amd64,QTDIR=/usr/lib64/qt-3.3,OLDPWD=/share/disk4/Data/Users/xiecs/standard/2022-01-26,PERL_MB_OPT=--install_base "/home/honghh/perl5",MY_RUBY_HOME=/usr/local/rvm/rubies/ruby-2.7.1,QTINC=/usr/lib64/qt-3.3/include,SSH_TTY=/dev/pts/1,QT_GRAPHICSSYSTEM_CHECKED=1,JRE_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.252.b09-2.el7_8.x86_64/jre,USER=xiecs,LD_LIBRARY_PATH=/share/nas2/public/R/library/3.6/:/home/honghh/R/x86_64-redhat-linux-gnu-library/3.6/:/usr/lib64:/share/nas1/Data/software/library/JAGS/lib:/share/nas1/Data/software/gcc-7.3.0_compile/lib64::/usr/local/lib:/home/honghh/miniconda3/pkgs/cudatoolkit-11.0.221-h6bb024c_0/lib,LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:.tar=01;31:.tgz=01;31:.arc=01;31:.arj=01;31:.taz=01;31:.lha=01;31:.lz4=01;31:.lzh=01;31:.lzma=01;31:.tlz=01;31:.txz=01;31:.tzo=01;31:.t7z=01;31:.zip=01;31:.z=01;31:.Z=01;31:.dz=01;31:.gz=01;31:.lrz=01;31:.lz=01;31:.lzo=01;31:.xz=01;31:.bz2=01;31:.bz=01;31:.tbz=01;31:.tbz2=01;31:.tz=01;31:.deb=01;31:.rpm=01;31:.jar=01;31:.war=01;31:.ear=01;31:.sar=01;31:.rar=01;31:.alz=01;31:.ace=01;31:.zoo=01;31:.cpio=01;31:.7z=01;31:.rz=01;31:.cab=01;31:.jpg=01;35:.jpeg=01;35:.gif=01;35:.bmp=01;35:.pbm=01;35:.pgm=01;35:.ppm=01;35:.tga=01;35:.xbm=01;35:.xpm=01;35:.tif=01;35:.tiff=01;35:.png=01;35:.svg=01;35:.svgz=01;35:.mng=01;35:.pcx=01;35:.mov=01;35:.mpg=01;35:.mpeg=01;35:.m2v=01;35:.mkv=01;35:.webm=01;35:.ogm=01;35:.mp4=01;35:.m4v=01;35:.mp4v=01;35:.vob=01;35:.qt=01;35:.nuv=01;35:.wmv=01;35:.asf=01;35:.rm=01;35:.rmvb=01;35:.flc=01;35:.avi=01;35:.fli=01;35:.flv=01;35:.gl=01;35:.dl=01;35:.xcf=01;35:.xwd=01;35:.yuv=01;35:.cgm=01;35:.emf=01;35:.axv=01;35:.anx=01;35:.ogv=01;35:.ogx=01;35:.aac=01;36:.au=01;36:.flac=01;36:.mid=01;36:.midi=01;36:.mka=01;36:.mp3=01;36:.mpc=01;36:.ogg=01;36:.ra=01;36:.wav=01;36:.axa=01;36:.oga=01;36:.spx=01;36:*.xspf=01;36:,rvm_path=/usr/local/rvm,TMOUT=28800,rvm_prefix=/usr/local,MAIL=/var/spool/mail/xiecs,PATH=/home/honghh/perl5/bin:/opt/sysoft/jdk-14/bin:/usr/local/python3/bin/:/home/honghh/miniconda3/bin//share/nas1/Data/software/gcc-7.3.0_compile/bin:/opt/biosoft/cdhit/cdhit-4.8.1:/opt/biosoft/GenomeUtility/bin/:/opt/sysoft/node-v10.13.0-linux-x64/bin/:/opt/biosoft/T_BCR/mixcr-3.0.13/:/share/nas1/Data/Users/honghh/Software/kofamscan/bin/ruby/bin:/opt/sysoft/jdk-14/bin:/opt/biosoft/bedtools2.29.2/bin:/opt/biosoft/cellranger-atac/:/home/honghh/.aspera/connect/bin/:/share/nas1/Data/software/mini_conda/Miniconda3/envs/Deeptools/bin/link_bin:/opt/sysoft/Python-3.7.0/bin:/opt/biosoft/ceftools/:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.252.b09-2.el7_8.x86_64/bin:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.252.b09-2.el7_8.x86_64/jre/bin:/opt/sysoft/sge/bin:/opt/sysoft/sge/bin/lx-amd64:/usr/local/rvm/gems/ruby-2.7.1/bin:/usr/local/rvm/gems/ruby-2.7.1@global/bin:/usr/local/rvm/rubies/ruby-2.7.1/bin:/usr/lib64/qt-3.3/bin:/usr/local/mysql/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/local/rvm/bin:/opt/biosoft/bowtie:/opt/biosoft/cellranger-expression:/root/software/gdal-3.0.2/apps:/opt/biosoft/hisat2-2.1.0:/opt/biosoft/STAR-2.7.3/bin/Linux_x86_64:/opt/biosoft/bowtie2/:/opt/biosoft/ncbi-blast2.9.0/bin:/opt/biosoft/FastQC:/opt/biosoft/sratoolkit.2.9.6-1/bin:/opt/biosoft/spaceranger-1.0.0/bin:/opt/biosoft/cellranger-dna-1.1.0:/usr/lib/rstudio-server/bin/pandoc/:/opt/biosoft/Diamond:/opt/biosoft/OSS:/opt/biosoft/plink_x86_64_20200616/bin/:/opt/biosoft/brew/bin:/share/nas1/Data/software/snpsea/snpsea/bin:/share/nas1/Data/software/git_2.32.0/bin:/home/honghh/Software/samtools/bin:/share/disk1/Data/Users/xiecs/.local/bin:/share/disk1/Data/Users/xiecs/bin,PWD=/share/disk4/Data/Users/xiecs/standard/2022-01-26/Analysis,JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.232.b09-0.el7_7.x86_64/bin/:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.252.b09-2.el7_8.x86_64,LANG=zh_CN.UTF-8,SGE_ROOT=/opt/sysoft/sge,KDEDIRS=/usr,CPPFLAGES=-I/opt/sysoft/Python-3.7.0/Include/,HISTCONTROL=ignoredups,rvm_version=1.29.10 (manual),SHLVL=1,HOME=/share/disk1/Data/Users/xiecs,PERL_LOCAL_LIB_ROOT=/home/honghh/perl5,LOGNAME=xiecs,QTLIB=/usr/lib64/qt-3.3/lib,CLASSPATH=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.252.b09-2.el7_8.x86_64/lib:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.252.b09-2.el7_8.x86_64/jre/lib:,GEM_PATH=/usr/local/rvm/gems/ruby-2.7.1:/usr/local/rvm/gems/ruby-2.7.1@global,SSH_CONNECTION=192.168.3.104 59025 192.168.3.33 22,PKG_CONFIG_PATH=/usr/lib64/pkgconfig,LESSOPEN=||/usr/bin/lesspipe.sh %s,SGE_CLUSTER_NAME=p6444,XDG_RUNTIME_DIR=/run/user/1007,QT_PLUGIN_PATH=/usr/lib64/kde4/plugins:/usr/lib/kde4/plugins,RUBY_VERSION=ruby-2.7.1,PERL_MM_OPT=INSTALLBASE=/home/honghh/perl5,=/opt/sysoft/sge/bin/lx-amd64/qsub
script_file: STDIN
binding: NONE
job_type: NONE
error reason 1: 01/27/2022 09:34:44 [1003:55175]: error: can't open output file "/share/disk4/Data/Users/xiecs/standard/2022-01-26/Analysis/STDIN.o12273": Permission denied
scheduling info: (Collecting of scheduler job information is turned off)

I hope you can help fix this issue.

Best, Hanhuihong

daimh commented 2 years ago

Can you please run the command below to test the permission?

touch /share/disk4/Data/Users/xiecs/standard/2022-01-26/Analysis/hello_world

honghh2018 commented 2 years ago

Yes, it works. The command creates the hello_world file (screenshot attached: f02adf67685e5913b1d0bb536f587aa).

This issue is indeed peculiar, because other users can run jobs normally; only the xiecs user is affected.

honghh2018 commented 2 years ago

The home directory of the xiecs user is /share/disk1/Data/Users/xiecs, while the other users' home directories are under /share/nas1/Data/Users/. Is this related to the issue?

daimh commented 2 years ago

Yes, this is caused by your filesystem permissions.

Can you try running the command below on ALL nodes, including the master node?

touch /share/disk4/Data/Users/xiecs/standard/2022-01-26/Analysis/$HOSTNAME
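A minimal sketch of a slightly more informative version of that check, assuming a POSIX shell on every node. The function name check_write and the probe file name are hypothetical. Besides attempting the write, it prints the numeric UID and GID the shell runs under, which helps because NFS permissions are typically enforced by numeric ID rather than by user name.

```shell
#!/bin/sh
# Hypothetical helper: report the host name and numeric IDs, then try to
# create (and remove) a probe file in the directory given as $1.
check_write() {
    dir=$1
    probe="$dir/.write_probe.$$"
    printf '%s uid=%s gid=%s ' "$(uname -n)" "$(id -u)" "$(id -g)"
    if touch "$probe" 2>/dev/null; then
        rm -f "$probe"
        echo "write OK"
    else
        echo "write FAILED"
        return 1
    fi
}
```

Run as the affected user on each node, for example check_write /share/disk4/Data/Users/xiecs/standard/2022-01-26/Analysis, and compare the uid= values between nodes.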

honghh2018 commented 2 years ago

The xiecs user cannot log in to the studynode host, because that host already has an account with the same name, xiecs.

honghh2018 commented 2 years ago

Another user, luohb, cannot log in to studynode either, but that user's jobs run normally.

honghh2018 commented 2 years ago

I solved this issue by changing the UID on the host. Thanks for the help.
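For anyone hitting the same error, a UID mismatch like this can be spotted by comparing the account's numeric UID on every node. The sketch below assumes you have collected one "host uid" pair per line, for example by running id -u xiecs on each node. The function name find_uid_mismatch and the sample values for studynode and node2 are illustrative, not taken from the cluster.

```shell
#!/bin/sh
# Hypothetical helper: read "host uid" pairs on stdin and print every host
# whose UID differs from the one on the first line (the reference host).
find_uid_mismatch() {
    awk 'NR == 1 { ref = $2; next }
         $2 != ref { printf "%s: uid %s differs from reference %s\n", $1, $2, ref }'
}

# Illustrative input only; the studynode and node2 values are made up.
printf '%s\n' 'master 1007' 'studynode 1003' 'node2 1007' | find_uid_mismatch
```

With the sample input this reports studynode as the mismatching host. One common way to align the IDs is usermod -u run as root on the mismatching host, followed by re-owning any local files that still belong to the old UID; whether that matches the exact fix used here is an assumption.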

daimh commented 2 years ago

Thanks a million for the update!