Open rl4940 opened 1 year ago
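The NumPy problem seems to be resolved; this time the run got to 21%, haha. But now a cuDNN error has appeared. It looks like an error raised by PyTorch, and I really don't know how to deal with it. ↓ My code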
#!/bin/bash
#SBATCH --mail-type=END,FAIL
#SBATCH --nodes=1
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=2
#SBATCH --time=02:00:00
#SBATCH --mem=48G
#SBATCH --gres=gpu:a100:1
#SBATCH -o %A_%a_output.txt
#SBATCH -e %A_%a_error.txt

CUDA_VISIBLE_DEVICES=0 ccsmeth call_mods \
  --input 121A/mapped.bam \
  --ref 121A/assembly.rotated.polished.renamed.fsa \
  --model_file /ccsmeth/models/model_ccsmeth_5mCpG_call_mods_attbigru2s_b21.v2.ckpt \
  --output output.hifi.pbmm2.call_mods \
  --threads 10 --threads_call 2 --model_type attbigru2s \
  --rm_per_readsite --mode align
↓ error.txt
batch_reader: 21%|██ | 1941/9340 [03:23<24:18, 5.07it/s]
batch_reader: 21%|██ | 1944/9340 [03:24<28:32, 4.32it/s]
batch_reader: 21%|██ | 1949/9340 [03:25<27:37, 4.46it/s]
batch_reader: 21%|██ | 1953/9340 [03:26<28:56, 4.25it/s]
batch_reader: 21%|██ | 1957/9340 [03:27<29:52, 4.12it/s]
batch_reader: 21%|██ | 1962/9340 [03:28<28:35, 4.30it/s]
batch_reader: 21%|██ | 1968/9340 [03:29<26:08, 4.70it/s]
batch_reader: 21%|██ | 1973/9340 [03:30<26:06, 4.70it/s]
batch_reader: 21%|██ | 1979/9340 [03:31<24:40, 4.97it/s]
batch_reader: 21%|██▏ | 1985/9340 [03:33<23:47, 5.15it/s]
batch_reader: 21%|██▏ | 1990/9340 [03:34<24:27, 5.01it/s]
batch_reader: 21%|██▏ | 1996/9340 [03:35<23:36, 5.18it/s]
batch_reader: 21%|██▏ | 2001/9340 [03:36<24:16, 5.04it/s]
Process Process-6:
Process Process-4:
Traceback (most recent call last):
Traceback (most recent call last):
  File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/ccsmeth/call_modifications.py", line 340, in _call_mods_q
    pred_str, accuracy, batch_num = _call_mods2s(features_batch, model, args.batch_size, device)
  File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/ccsmeth/call_modifications.py", line 246, in _call_mods2s
    voutputs, vlogits = model(FloatTensor(b_fkmers, device), FloatTensor(b_fpasss, device),
  File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/ccsmeth/call_modifications.py", line 340, in _call_mods_q
    pred_str, accuracy, batch_num = _call_mods2s(features_batch, model, args.batch_size, device)
  File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/ccsmeth/models.py", line 118, in forward
    out1, n_states1 = self.rnn(out1, self.init_hidden(out1.size(0),
  File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/ccsmeth/call_modifications.py", line 246, in _call_mods2s
    voutputs, vlogits = model(FloatTensor(b_fkmers, device), FloatTensor(b_fpasss, device),
  File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/torch/nn/modules/rnn.py", line 942, in forward
    result = _VF.gru(input, hx, self._flat_weights, self.bias, self.num_layers,
  File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/ccsmeth/models.py", line 118, in forward
    out1, n_states1 = self.rnn(out1, self.init_hidden(out1.size(0),
  File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
  File "/gpfs/data/pirontilab/Students/software/conda/envs/ccsmeth/lib/python3.10/site-packages/torch/nn/modules/rnn.py", line 942, in forward
    result = _VF.gru(input, hx, self._flat_weights, self.bias, self.num_layers,
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
o(╥﹏╥)o
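Since the traceback ends inside the GRU forward pass (`_VF.gru`), one way to narrow this down might be to check whether a plain PyTorch GRU fails on the same GPU outside ccsmeth. Below is a minimal sketch, not a fix. The function names `environment_report` and `gru_smoke_test` are made up for illustration, and the tensor sizes are arbitrary placeholders rather than the ones ccsmeth uses. It relies only on standard PyTorch switches: the `CUDA_LAUNCH_BLOCKING` environment variable and `torch.backends.cudnn.enabled`.

```python
import os

# Make CUDA kernel launches synchronous so an error is reported at the call
# that caused it. This must be set before CUDA is initialised.
os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")


def environment_report():
    """Collect version info that is useful to attach to a bug report."""
    try:
        import torch
    except ImportError:
        return {"torch": None}
    return {
        "torch": torch.__version__,
        "cuda_build": torch.version.cuda,            # None on CPU-only builds
        "cuda_available": torch.cuda.is_available(),
        "cudnn": torch.backends.cudnn.version(),     # None if cuDNN is absent
    }


def gru_smoke_test(use_cudnn=True, batch=512, seq_len=21, features=16, hidden=64):
    """Run one forward pass of a small bidirectional GRU.

    All sizes are hypothetical placeholders. Returns the output shape,
    or None when torch is not installed.
    """
    try:
        import torch
    except ImportError:
        return None
    torch.backends.cudnn.enabled = use_cudnn
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    rnn = torch.nn.GRU(features, hidden, num_layers=2,
                       batch_first=True, bidirectional=True).to(device)
    x = torch.randn(batch, seq_len, features, device=device)
    with torch.no_grad():
        out, _ = rnn(x)
    if device.type == "cuda":
        torch.cuda.synchronize()  # surface any asynchronous CUDA error here
    return tuple(out.shape)


if __name__ == "__main__":
    print(environment_report())
    print("with cuDNN:   ", gru_smoke_test(use_cudnn=True))
    print("without cuDNN:", gru_smoke_test(use_cudnn=False))
```

If the run with cuDNN fails and the run without it passes, that would point toward the PyTorch/cuDNN build or the driver on that node rather than the input data. If both pass, the problem is more likely specific to the batch ccsmeth was processing or to GPU memory at that moment.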
How long does running CUDA_VISIBLE_DEVICES=0 ccsmeth call_mods take? I set 10 threads, so why does it completely max out my CPU?
I think there is an optimization problem; the machine-learning side probably hasn't been optimized well.
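On the question of 10 threads saturating the CPU: one possible explanation, not confirmed for ccsmeth, is that the number of runnable threads exceeds the CPUs the job is actually allowed to use. The job script in the first post, for example, requests `--cpus-per-task=2` but passes `--threads 10`. A small sketch for checking this from inside a job follows. The helper names are hypothetical; `SLURM_CPUS_PER_TASK` is the variable Slurm sets when `--cpus-per-task` is given.

```python
import os


def visible_cpus():
    """Number of CPUs this process may run on.

    Uses the scheduler affinity mask where available (Linux), which can be
    narrower than os.cpu_count() when the job is pinned to a subset of cores.
    """
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def slurm_cpu_budget(env=None):
    """CPUs granted per task by Slurm, or None outside a Slurm job."""
    env = os.environ if env is None else env
    value = env.get("SLURM_CPUS_PER_TASK")
    return int(value) if value else None


def oversubscription(workers, threads_per_worker, cpus):
    """Ratio of runnable threads to CPUs; above 1.0 means oversubscribed."""
    return workers * threads_per_worker / cpus


if __name__ == "__main__":
    cpus = slurm_cpu_budget() or visible_cpus()
    # Assumption: 10 workers with one thread each, mirroring --threads 10.
    print("CPUs available:", cpus)
    print("oversubscription:", oversubscription(10, 1, cpus))
```

With 2 CPUs and 10 single-threaded workers the ratio is 5.0, so every core would be fully busy. If each worker also runs a multi-threaded numerical library, the ratio grows by that library's thread count.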