cpnr / computing

os error 5 when accessing cvmfs #60

Open slowmoyang opened 2 weeks ago

slowmoyang commented 2 weeks ago

A Python OSError occurs while generating a gridpack inside a Singularity container.

Command "import /users/hep/slowmoyang/work/tzq/gridpack/genproductions/bin/MadGraph5_aMCatNLO/TZQB-Zto2L-4FS_MLL-50_amcatnlo-loop_qcd_qed_sm/TZQB-Zto2L-4FS_MLL-50_amcatnlo-loop_qcd_qed_sm_gridpack/work/TZ
QB-Zto2L-4FS_MLL-50_amcatnlo-loop_qcd_qed_sm_proc_card.dat" interrupted in sub-command:                                                                                                                     
"output TZQB-Zto2L-4FS_MLL-50_amcatnlo-loop_qcd_qed_sm -nojpeg" with error:                                                                                                                                 
OSError : [Errno 5] Input/output error 
Please report this bug on https://bugs.launchpad.net/mg5amcnlo                                                                                                                                              
More information is found in 'MG5_debug'.                                                                                                                                                                   
Please attach this file to your report.                                                                                                                                                                     
command not executed: display multiparticles                                                                                                                                                                
Checking if MG5 is up-to-date... (takes up to 5s)                                                                                                                                                           
failed to connect server                                                                                                                                                                                    
quit                                                                                                                                                                                                        
Process output directory TZQB-Zto2L-4FS_MLL-50_amcatnlo-loop_qcd_qed_sm not found.  Either process generation failed, or the name of the output did not match the process name TZQB-Zto2L-4FS_MLL-50_amcatnl
o-loop_qcd_qed_sm provided to the script.                                                                                                                                                                   
END: Tue Oct 15 03:10:44 PM KST 2024

The contents of MG5_debug are as follows. MG5_debug.txt

# -snip-
  File "/users/hep/slowmoyang/work/tzq/gridpack/genproductions/bin/MadGraph5_aMCatNLO/TZQB-Zto2L-4FS_MLL-50_amcatnlo-loop_qcd_qed_sm/TZQB-Zto2L-4FS_MLL-50_amcatnlo-loop_qcd_qed_sm_gridpack/work/MG5_aMC_v3
_5_6/madgraph/iolibs/export_v4.py", line 9329, in ExportV4Factory                                                                                                                                           
    cmd.install_reduction_library()                                                                                                                                                                         
  File "/users/hep/slowmoyang/work/tzq/gridpack/genproductions/bin/MadGraph5_aMCatNLO/TZQB-Zto2L-4FS_MLL-50_amcatnlo-loop_qcd_qed_sm/TZQB-Zto2L-4FS_MLL-50_amcatnlo-loop_qcd_qed_sm_gridpack/work/MG5_aMC_v3
_5_6/madgraph/interface/loop_interface.py", line 527, in install_reduction_library                                                                                                                          
    to_install = self.ask('install', '0',  ask_class=AskLoopInstaller, timeout=300,                                                                                                                         
  File "/users/hep/slowmoyang/work/tzq/gridpack/genproductions/bin/MadGraph5_aMCatNLO/TZQB-Zto2L-4FS_MLL-50_amcatnlo-loop_qcd_qed_sm/TZQB-Zto2L-4FS_MLL-50_amcatnlo-loop_qcd_qed_sm_gridpack/work/MG5_aMC_v3
_5_6/madgraph/interface/extended_cmd.py", line 1113, in ask                                                                                                                                                 
    question_instance = obj(question, allow_arg=choices, default=default,                                                                                                                                   
  File "/users/hep/slowmoyang/work/tzq/gridpack/genproductions/bin/MadGraph5_aMCatNLO/TZQB-Zto2L-4FS_MLL-50_amcatnlo-loop_qcd_qed_sm/TZQB-Zto2L-4FS_MLL-50_amcatnlo-loop_qcd_qed_sm_gridpack/work/MG5_aMC_v3
_5_6/madgraph/interface/loop_interface.py", line 946, in __init__                                                                                                                                           
    response=six.moves.urllib.request.urlopen('http://madgraph.phys.ucl.ac.be/F1.html', timeout=3)                                                                                                          
  File "/cvmfs/cms.cern.ch/el8_amd64_gcc10/external/py3-six/1.16.0-191eaed0649a04458bb2775b1b0a08ae/lib/python3.9/site-packages/six.py", line 97, in __get__                                                
    result = self._resolve()                                                                                                                                                                                
  File "/cvmfs/cms.cern.ch/el8_amd64_gcc10/external/py3-six/1.16.0-191eaed0649a04458bb2775b1b0a08ae/lib/python3.9/site-packages/six.py", line 165, in _resolve                                              
    module = _import_module(self.mod)                                                                                                                                                                       
  File "/cvmfs/cms.cern.ch/el8_amd64_gcc10/external/py3-six/1.16.0-191eaed0649a04458bb2775b1b0a08ae/lib/python3.9/site-packages/six.py", line 87, in _import_module                                         
    __import__(name)                                                                                                                                                                                        
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load                                                                                                                                        
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked                                                                                                                                
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked                                                                                                                                         
  File "<frozen importlib._bootstrap_external>", line 846, in exec_module                                                                                                                                   
  File "<frozen importlib._bootstrap_external>", line 982, in get_code                                                                                                                                      
  File "<frozen importlib._bootstrap_external>", line 1040, in get_data                                                                                                                                     
OSError: [Errno 5] Input/output error                                                                                                                                                                       
Related File: None                                                                                                                                                                                          
                          MadGraph5_aMC@NLO Options 
# -snip-
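
The traceback ends in `get_data`, the step where the import machinery reads a module's file from disk, so the failure appears to be in reading a file under /cvmfs rather than in MadGraph itself. A minimal sketch for checking this directly, assuming it is run with the same Python inside the container; `check_module_readable` is a hypothetical helper name, not part of MadGraph or six:

```python
import errno
import importlib.util


def check_module_readable(module_name):
    """Read the file backing `module_name` without executing the module.

    Returns (origin, None) if the file was read successfully, or
    (origin, errno_name) if an OSError was raised, e.g. "EIO" for errno 5.
    origin is None when there is no file on disk to check.
    """
    try:
        spec = importlib.util.find_spec(module_name)
    except OSError as exc:
        # Locating a submodule imports its parent package, which can
        # itself hit the broken filesystem.
        return None, errno.errorcode.get(exc.errno, str(exc.errno))
    if spec is None or not spec.has_location:
        # Module not found, or built-in/frozen: nothing to read.
        return None, None
    try:
        with open(spec.origin, "rb") as handle:
            handle.read()
    except OSError as exc:
        return spec.origin, errno.errorcode.get(exc.errno, str(exc.errno))
    return spec.origin, None
```

Calling `check_module_readable("urllib.request")` targets the module that `six.moves.urllib.request` resolves to in the traceback above. A result whose second element is `"EIO"` would point at the filesystem rather than at the Python code.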

As a first check, I tested Singularity on lugiahep and got an error that looks closely related.

$ /cvmfs/cms.cern.ch/common/cmssw-el8                                                                                                                                                                   
exec: Failed to execute process '/cvmfs/cms.cern.ch/common/cmssw-el8', unknown error number 5
$ /cvmfs/cms.cern.ch/common/cmssw-el8 <<EOF
echo "hi"
EOF
bash: /cvmfs/cms.cern.ch/common/cmssw-el8: Input/output error

slowmoyang commented 2 weeks ago

$ /cvmfs/cms.cern.ch/common/cmssw-el8
exec: Failed to execute process '/cvmfs/cms.cern.ch/common/cmssw-el8', unknown error number 5
$ ls /cvmfs/cms.cern.ch/el9_amd64_gcc11/
"/cvmfs/cms.cern.ch/el9_amd64_gcc11/": Input/output error (os error 5)

Please refer to the following.

slowmoyang commented 2 weeks ago

I ran cvmfs_config reload, but accessing any file or directory under /cvmfs/cms.cern.ch/ still fails with an input/output error after a long latency.

$ cvmfs_config  status                                                                                                                 
cvmfs-config.cern.ch mounted on /cvmfs/cvmfs-config.cern.ch with pid 2816621                                                                       
grid.cern.ch mounted on /cvmfs/grid.cern.ch with pid 2827684                                                                                       
unpacked.cern.ch mounted on /cvmfs/unpacked.cern.ch with pid 2835491                                                                               
sft.cern.ch mounted on /cvmfs/sft.cern.ch with pid 2843862                                                                                         
cms.cern.ch mounted on /cvmfs/cms.cern.ch with pid 2854641                                                                                         
cms-ib.cern.ch mounted on /cvmfs/cms-ib.cern.ch with pid 2862528                                                                                   
geant4.cern.ch mounted on /cvmfs/geant4.cern.ch with pid 2870377

$ cvmfs_config reload
Pausing cms.cern.ch on /cvmfs/cms.cern.ch
cms.cern.ch: Connecting to CernVM-FS loader... done                                                                                                
cms.cern.ch: Entering maintenance mode                                                                                                             
cms.cern.ch: Draining out kernel caches (up to 60s)  
cms.cern.ch: Blocking new file system calls                                                                                                        
cms.cern.ch: Waiting for active file system calls                                                                                                  
cms.cern.ch: Saving negative entry cache                                                                                                           
cms.cern.ch: Saving page cache entry tracker                                                                                                       
cms.cern.ch: Saving chunk tables                                                                                                                   
cms.cern.ch: Saving inode generation                                                                                                               
cms.cern.ch: Saving fuse state 
cms.cern.ch: Saving open files table                                                                                                               
cms.cern.ch: Unloading Fuse module                                                                                                                 
cms.cern.ch: Waiting for the delivery of SIGUSR1...
cms.cern.ch: Re-Loading Fuse module
cms.cern.ch: Restoring dentry tracker...  done                                                                                                     
cms.cern.ch: Restoring page cache entry tracker...  done                                                                                           
cms.cern.ch: Restoring chunk tables...  done                                                                                                       
cms.cern.ch: Restoring inode generation...  done                                                                                                   
cms.cern.ch: Restoring fuse state...  done                                                                                                         
cms.cern.ch: Restoring open files table... done                                                                                                    
cms.cern.ch: Restoring open files counter...  done                                                                                                 
cms.cern.ch: Releasing saved dentry tracker                                                                                                        
cms.cern.ch: Releasing saved page cache entry cache                                                                                                
cms.cern.ch: Releasing chunk tables                                                                                                                
cms.cern.ch: Releasing saved inode generation info                                                                                                 
cms.cern.ch: Releasing fuse state                                                                                                                  
cms.cern.ch: Releasing saved open files table                                                                                                      
cms.cern.ch: Releasing open files counter                                                                                                          
cms.cern.ch: Activating Fuse module

$ ls /cvmfs/cms.cern.ch/el9_amd64_gcc11/
ls: cannot open directory '/cvmfs/cms.cern.ch/el9_amd64_gcc11/': Input/output error
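
To make the latency and the error easier to compare before and after a configuration change, the `ls` check can be wrapped in a small script. This is only a sketch; `probe_directory` is a hypothetical helper, and it records nothing beyond one directory listing:

```python
import errno
import os
import time


def probe_directory(path):
    """List `path` once and report the elapsed time and any OS error."""
    start = time.monotonic()
    try:
        entries = os.listdir(path)
        error = None
    except OSError as exc:
        entries = []
        # Map the numeric errno to its symbolic name, e.g. 5 -> "EIO".
        error = errno.errorcode.get(exc.errno, str(exc.errno))
    elapsed = time.monotonic() - start
    return {
        "path": path,
        "entries": len(entries),
        "error": error,
        "seconds": elapsed,
    }
```

For example, `probe_directory("/cvmfs/cms.cern.ch/el9_amd64_gcc11/")` would be expected to report `"EIO"` in the state shown above, along with how long the call blocked.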

jhgoh commented 2 weeks ago

I suspect the client cannot fetch content from the KISTI squid. After switching the proxy setting to DIRECT and running cvmfs_config reload, ls works again, at least for now.
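
For reference, the proxy chain is set through the `CVMFS_HTTP_PROXY` client parameter, where `|` separates load-balanced proxies within a group, `;` separates failover groups, and `DIRECT` means contacting the servers without a proxy. A minimal sketch that splits such a value and checks whether it contains a `DIRECT` entry; the function names and the proxy host in the usage note are hypothetical, and the value actually configured on lugiahep is not shown in this thread:

```python
def parse_proxy_groups(value):
    """Split a CVMFS_HTTP_PROXY value into failover groups of proxies."""
    groups = []
    for group in value.split(";"):
        # Proxies inside one group are separated by "|".
        proxies = [proxy.strip() for proxy in group.split("|") if proxy.strip()]
        if proxies:
            groups.append(proxies)
    return groups


def has_direct_fallback(value):
    """Return True if any group in the value contains the DIRECT keyword."""
    return any("DIRECT" in group for group in parse_proxy_groups(value))
```

With a made-up host, `has_direct_fallback("http://squid.example.org:3128;DIRECT")` is true, while a value that lists only the squid is not. Whether a `DIRECT` fallback is appropriate for this site is a separate decision, since it sends traffic past the squid.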