XiaoTaoWang / EagleC

A deep-learning framework for predicting a full range of structural variations from bulk and single-cell contact maps

OSError: Unable to open file and subprocess.CalledProcessError: #1

Open Bo-UT opened 2 years ago

Bo-UT commented 2 years ago

Hi, I got the OSError and CalledProcessError.

The file in bulk/ is 50M-100M.zip, but this file cannot be unzipped.

I first downloaded SKNAS-MboI-allReps-filtered.mcool and submitted a job to the cluster that ran only predictSV:

$ predictSV --hic-5k SKNAS-MboI-allReps-filtered.mcool::/resolutions/5000 \
    --hic-10k SKNAS-MboI-allReps-filtered.mcool::/resolutions/10000 \
    --hic-50k SKNAS-MboI-allReps-filtered.mcool::/resolutions/50000 \
    -O SK-N-AS -g hg38 --balance-type CNV --output-format full \
    --prob-cutoff-5k 0.8 --prob-cutoff-10k 0.8 --prob-cutoff-50k 0.99999

It looks like predictSV-single-resolution was also called. Could you help me figure out these errors? Thanks in advance!

OSError: Unable to open file (unable to open file: name = '/rsrch4/home/genomic_med/bzhao2/.conda/envs/EagleC/lib/python3.8/site-packages/eaglec/data/bulk/50M-100M/CNN-weights.0.1.0.4.0.6.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
Traceback (most recent call last):
  File "/home/bzhao2/.conda/envs/EagleC/bin/predictSV", line 176, in <module>
    run()
  File "/home/bzhao2/.conda/envs/EagleC/bin/predictSV", line 112, in run
    subprocess.check_call(' '.join(command), shell=True)
  File "/rsrch4/home/genomic_med/bzhao2/.conda/envs/EagleC/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'predictSV-single-resolution -H SKNAS-MboI-allReps-filtered.mcool::/resolutions/5000 --balance-type CNV -O SK-N-AS.CNN_SVs.5K.txt --genome hg38 --output-format full -C "#" "X" --prob-cutoff 0.8 --logFile eaglec.log' returned non-zero exit status 1.

XiaoTaoWang commented 2 years ago

Did you download the pre-trained models using "download-pretrained-models"?
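
Before re-running, it can help to confirm that the weights file named in the OSError is really present. The snippet below is only a rough sketch and not part of EagleC: `find_weight_files` is a hypothetical helper, and the directory to pass in (something like `<site-packages>/eaglec/data/bulk/50M-100M`) is taken from the path in the error message.

```python
from pathlib import Path


def find_weight_files(model_dir):
    """Return the sorted names of .h5 files directly inside model_dir.

    An empty list means the directory is missing or holds no weights,
    which matches the 'No such file or directory' OSError in this thread.
    (Hypothetical helper for illustration; not an EagleC function.)
    """
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        return []
    return sorted(
        entry.name
        for entry in model_dir.iterdir()
        if entry.is_file() and entry.name.endswith(".h5")
    )
```

If the returned list is empty for the model folder, the pre-trained models most likely still need to be downloaded (or copied over from a machine where "download-pretrained-models" was run).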

Bo-UT commented 2 years ago

Thank you! Downloading the pre-trained models got the run started, but later I hit another error:

Traceback (most recent call last):
  File "/home/bzhao2/.conda/envs/EagleC/bin/merge-multiple-resolutions", line 260, in <module>
    run()
  File "/home/bzhao2/.conda/envs/EagleC/bin/merge-multiple-resolutions", line 158, in run
    SV_pool = [load_sv_full(fil) for fil in args.full_sv_files]
  File "/home/bzhao2/.conda/envs/EagleC/bin/merge-multiple-resolutions", line 158, in <listcomp>
    SV_pool = [load_sv_full(fil) for fil in args.full_sv_files]
  File "/home/bzhao2/.conda/envs/EagleC/bin/merge-multiple-resolutions", line 248, in load_sv_full
    with open(fil, 'r') as source:
FileNotFoundError: [Errno 2] No such file or directory: 'SK-N-AS.CNN_SVs.5K.txt'
Traceback (most recent call last):
  File "/home/bzhao2/.conda/envs/EagleC/bin/predictSV", line 176, in <module>
    run()
  File "/home/bzhao2/.conda/envs/EagleC/bin/predictSV", line 159, in run
    subprocess.check_call(' '.join(command), shell=True)
  File "/rsrch4/home/genomic_med/bzhao2/.conda/envs/EagleC/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'merge-multiple-resolutions --hic-10k SKNAS-MboI-allReps-filtered.mcool::/resolutions/10000 --hic-5k SKNAS-MboI-allReps-filtered.mcool::/resolutions/5000 --balance-type CNV -C "#" "X" --full-sv-files SK-N-AS.CNN_SVs.5K.txt SK-N-AS.CNN_SVs.10K_highres.txt SK-N-AS.CNN_SVs.50K_highres.txt -O SK-N-AS.CNN_SVs.5K_combined.txt --buff-size 50000 --output-format full --cache-10k .SKNAS-MboI-allReps-filtered.mcool.70224652.CNV.None.100000.None --cache-5k .SKNAS-MboI-allReps-filtered.mcool.91006977.CNV.None.100000.None' returned non-zero exit status 1.

After the aborted run, I have the following files:

.70224652.91006977.CNV.SK-N-AS.CNN_SVs.5K.txt.SK-N-AS.CNN_SVs.10K_highres.txt.SK-N-AS.CNN_SVs.50K_highres.txt.50000
eaglec.log
myjob_eaglec_predict.lsf
SK-N-AS.CNN_SVs.10K.txt
SK-N-AS.CNN_SVs.50K_highres.txt
SK-N-AS.CNN_SVs.50K.txt
SKNAS-MboI-allReps-filtered.mcool
.SKNAS-MboI-allReps-filtered.mcool.39586955.CNV.None.100000.None
.SKNAS-MboI-allReps-filtered.mcool.70224652.CNV.None.100000.None
.SKNAS-MboI-allReps-filtered.mcool.91006977.CNV.None.100000.None
.SKNAS-MboI-allReps-filtered.mcool.91006977.CNV.SK-N-AS.CNN_SVs.10K.txt.25000.None
.SKNAS-MboI-allReps-filtered.mcool.91006977.CNV.SK-N-AS.CNN_SVs.50K.txt.110000.None

XiaoTaoWang commented 2 years ago

Could you delete the LOCK files below and re-submit your jobs?

$ rm .SKNAS-MboI-allReps-filtered.mcool*/*lock .70224652.91006977.*/*lock

As I explained in the documentation, EagleC creates lock files so that different jobs can communicate with each other. If a job is terminated unexpectedly, the lock file it created may not be removed properly and will "tell" future jobs that the job is still running, although in fact it has been killed/terminated.
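
To see which lock files would be affected before deleting anything, something like the following could be used. This is a minimal sketch, not EagleC code: `find_lock_files` is a hypothetical helper, and it assumes (based on the `rm` command above) that the cache folders are hidden directories in the working directory and that lock file names end in "lock".

```python
from pathlib import Path


def find_lock_files(workdir="."):
    """List files whose names end in 'lock' inside hidden subdirectories.

    Assumptions taken from this thread, not from EagleC's source: cache
    folders start with '.', and lock files end with 'lock'.
    """
    locks = []
    for cache_dir in sorted(Path(workdir).iterdir()):
        # Only look inside hidden directories such as .SKNAS-...CNV.None.100000.None
        if not (cache_dir.is_dir() and cache_dir.name.startswith(".")):
            continue
        for entry in sorted(cache_dir.iterdir()):
            if entry.is_file() and entry.name.endswith("lock"):
                locks.append(entry)
    return locks
```

Printing the returned paths first, and removing them only once no EagleC job is still running, avoids deleting a lock that belongs to a live job.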

Bo-UT commented 2 years ago

Thanks a lot for your quick response! I looked in the folders you mentioned but didn't find any lock files, not even among the hidden files. Our cluster is cut off from the internet by the school, so I ran download-pretrained-models locally and uploaded the whole data folder to the cluster. I am wondering whether any other settings are needed to run the sample data.

XiaoTaoWang commented 2 years ago

I see. But that totally confused me. Can you show me the output of the command below:

$ ls -lh .SKNAS-MboI-allReps-filtered.mcool.91006977.CNV.None.100000.None/*completed | wc -l

And this command as well?

$ ls -lh .SKNAS-MboI-allReps-filtered.mcool.91006977.CNV.None.100000.None/*lock

Thanks!

Bo-UT commented 2 years ago

$ ls -lh .SKNAS-MboI-allReps-filtered.mcool.91006977.CNV.None.100000.None/*completed | wc -l
276

$ ls -lh .SKNAS-MboI-allReps-filtered.mcool.91006977.CNV.None.100000.None/lock
ls: cannot access '.SKNAS-MboI-allReps-filtered.mcool.91006977.CNV.None.100000.None/lock': No such file or directory

XiaoTaoWang commented 2 years ago

How about this command?

$ ls -lh .SKNAS-MboI-allReps-filtered.mcool.91006977.CNV.None.100000.None/*txt | wc -l

Bo-UT commented 2 years ago

I submitted a single job instead of using "for i in {1..16}; do sbatch slurm-predictSV.sh; sleep 40s; done", and it worked. I got the 6 files you describe in the documentation.

$ head SK-N-AS.CNN_SVs.5K_combined.txt
chrom1 pos1 chrom2 pos2 ++ +- -+ --
chr10 100540000 chr10 101175000 1.885e-15 4.558e-22 1 1.827e-16
chr11 100080000 chr11 100160000 1.319e-26 1 1.47e-23 1.292e-15
chr11 40120000 chr11 40300000 2.869e-13 7.797e-17 0.964 1.603e-17
chr11 71720000 chr17 32285000 3.397e-23 1 8.086e-15 1.674e-18
chr12 111605000 chr16 83395000 6.232e-29 1.972e-28 1 8.747e-27
chr13 63030000 chr17 22155000 1.812e-10 1.975e-16 0.9197 2.687e-12
chr16 21580000 chr16 22695000 1 4.339e-28 6.561e-27 1.242e-17
chr16 70805000 chr16 71160000 2.09e-09 0.9042 9.561e-10 9.592e-10
chr17 73790000 chr19 780000 1.392e-21 2.4e-29 2.071e-24 1

Bo-UT commented 2 years ago

$ ls -lh .SKNAS-MboI-allReps-filtered.mcool.91006977.CNV.None.100000.None/*txt | wc -l
276

I guess this may be the result of the last single job.

XiaoTaoWang commented 2 years ago

So this shows that SV calling has finished for all the chromosomes. I have no idea why you didn't get the file "SK-N-AS.CNN_SVs.5K.txt", but can you submit the same "predictSV" command again?
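
The `ls ... | wc -l` checks above can be rolled into one small script. This is just an illustrative sketch: `summarize_cache` is a hypothetical helper, and the three suffixes are simply the ones that appear in the commands in this thread (`*completed`, `*lock`, `*txt`); nothing here relies on how EagleC names its cache files beyond that.

```python
from pathlib import Path

# Suffixes taken from the ls/rm commands in this thread.
MARKERS = ("completed", "lock", "txt")


def summarize_cache(cache_dir):
    """Count files in cache_dir whose names end with each marker suffix."""
    counts = {marker: 0 for marker in MARKERS}
    for entry in Path(cache_dir).iterdir():
        if not entry.is_file():
            continue
        for marker in MARKERS:
            if entry.name.endswith(marker):
                counts[marker] += 1
    return counts
```

For the 5 kb cache folder discussed here, matching "completed" and "txt" counts (276 each) together with zero "lock" files would correspond to the situation described above.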

Bo-UT commented 2 years ago

I submitted it again and it also worked. It looks like the predicted results are the same as the last submission.

$ cat SK-N-AS.CNN_SVs.5K_combined.txt

chrom1 pos1 chrom2 pos2 ++ +- -+ --
chr10 100540000 chr10 101175000 1.885e-15 4.558e-22 1 1.827e-16
chr11 100080000 chr11 100160000 1.319e-26 1 1.47e-23 1.292e-15
chr11 40120000 chr11 40300000 2.869e-13 7.797e-17 0.964 1.603e-17
chr11 71720000 chr17 32285000 3.397e-23 1 8.086e-15 1.674e-18
chr12 111605000 chr16 83395000 6.232e-29 1.972e-28 1 8.747e-27
chr13 63030000 chr17 22155000 1.812e-10 1.975e-16 0.9197 2.687e-12
chr16 21580000 chr16 22695000 1 4.339e-28 6.561e-27 1.242e-17
chr16 70805000 chr16 71160000 2.09e-09 0.9042 9.561e-10 9.592e-10
chr17 73790000 chr19 780000 1.392e-21 2.4e-29 2.071e-24 1
chr18 47755000 chr18 48025000 1.861e-13 3.204e-14 0.9863 1.928e-16
chr1 1930000 chr1 10975000 2.572e-25 1 1.017e-17 1.627e-20
chr1 72290000 chr1 72345000 2.235e-17 1 8.564e-23 4.629e-16
chr1 196745000 chr1 196845000 2.155e-15 0.9982 1.327e-15 3.209e-09
chr1 25255000 chr1 25330000 8.584e-19 0.8123 1.172e-19 4.559e-14
chr1 1765000 chr1 1905000 2.688e-11 1.744e-18 0.8671 6.763e-09
chr22 32305000 chr22 44240000 0.836 6.364e-08 1.359e-10 1.384e-12
chr2 111450000 chr2 111720000 1.519e-08 7.698e-15 0.9674 1.262e-12
chr3 60625000 chr17 42830000 6.303e-24 8.763e-27 7.642e-27 1
chr3 115820000 chr3 116295000 9.204e-27 1 4.731e-21 1.836e-26
chr3 162790000 chr3 162905000 9.531e-24 1 5.271e-19 2.616e-24
chr4 33835000 chr17 22110000 1.531e-19 1.964e-09 3.741e-20 0.9882
chr4 68500000 chr4 68625000 2.497e-14 0.9359 6.226e-14 1.305e-14
chr4 102895000 chr6 108640000 4.664e-17 5.354e-15 1 1.315e-19
chr4 173900000 chr8 128885000 0 1 1 0
chr5 98815000 chr5 98890000 2.411e-21 0.9071 5.536e-14 1.064e-21
chr5 1235000 chr5 51495000 1.829e-14 9.096e-23 2.403e-27 1
chr6 58110000 chr6 60845000 0.8166 1.478e-14 3.867e-11 0.7891
chr6 26020000 chr6 27895000 3.17e-15 8.751e-25 0.9759 1.039e-17
chr6 87385000 chrX 84750000 1.238e-12 4.459e-19 1.495e-21 1
chr7 55470000 chr22 35945000 1 8.522e-17 4.83e-15 7.54e-11
chr7 13270000 chr7 14695000 7.056e-15 5.27e-20 0.8482 2.84e-16
chr8 39370000 chr8 39525000 7.197e-21 0.9849 2.149e-21 3.857e-26
chr8 116855000 chr9 23795000 3.549e-10 1.687e-18 2.334e-25 1
chr9 129240000 chr9 129535000 1.35e-13 5.094e-14 0.8477 6.37e-18
chrX 55585000 chrX 118075000 6.833e-29 3.162e-15 2e-27 1
chrX 55580000 chrX 118315000 1 1.492e-20 4.899e-18 1.203e-26
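
For anyone who wants to work with this output programmatically, here is a rough sketch of how the "full" format shown above could be read. It is not EagleC code: the function names are hypothetical, and it assumes the file is whitespace-delimited with the header shown above and that the last four columns are per-orientation probabilities (as the ++ +- -+ -- header and the --prob-cutoff options suggest).

```python
ORIENTATIONS = ("++", "+-", "-+", "--")


def parse_sv_line(line):
    """Parse one data row into (chrom1, pos1, chrom2, pos2, probs)."""
    fields = line.split()
    chrom1, pos1, chrom2, pos2 = fields[0], int(fields[1]), fields[2], int(fields[3])
    # Map each orientation label to its value from the last four columns.
    probs = dict(zip(ORIENTATIONS, (float(x) for x in fields[4:8])))
    return chrom1, pos1, chrom2, pos2, probs


def load_svs(lines):
    """Parse an iterable of lines, skipping blank lines and the header row."""
    records = []
    for line in lines:
        if not line.strip() or line.startswith("chrom1"):
            continue
        records.append(parse_sv_line(line))
    return records


def best_orientation(probs):
    """Return the orientation label with the highest value."""
    return max(probs, key=probs.get)
```

With the file opened as `source`, `load_svs(source)` would give one tuple per SV, and comparing `chrom1` with `chrom2` separates intra- from inter-chromosomal calls.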

XiaoTaoWang commented 2 years ago

Nice! Glad to hear you finally worked it out! Thanks for letting me know.

Bo-UT commented 2 years ago

Thanks so much for your help! Will let you know how it works with our own data.