Open albertdyu opened 2 years ago
@albertdyu, it looks like there's a problem reading the values for the 1680th training interval. If you look at the output files generated by this command, you should find an intervals/ folder containing a file with "train" in the name, which should have 1691 rows. Look at the 1680th row in this file. The issue could be that this chromosome name is not included in your bigwig file, or that your bigwig file was produced using a genome assembly different from dm6.chrom.sizes.
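For anyone hitting the same error, the check suggested above can be sketched in a few lines of Python. This is only a rough sketch, not AtacWorks code: it assumes the intervals file is tab-separated with no header row and has chromosome, start, and end in the first three columns (consistent with the `interval[0], interval[1], interval[2]` call in the traceback). `find_bad_intervals` is a hypothetical helper name. In practice the chromosome lengths would come from the bigwig itself, for example via pyBigWig's `chroms()`; a small made-up dict stands in for that here.

```python
import csv
import io


def find_bad_intervals(intervals_tsv, chrom_lengths):
    """Return (row_index, chrom, start, end, reason) for intervals a bigwig could not serve.

    row_index is 0-based, like the pandas index reported in the error message.
    """
    bad = []
    for idx, row in enumerate(csv.reader(intervals_tsv, delimiter="\t")):
        chrom, start, end = row[0], int(row[1]), int(row[2])
        if chrom not in chrom_lengths:
            bad.append((idx, chrom, start, end, "chromosome missing from bigwig"))
        elif start < 0 or start >= end or end > chrom_lengths[chrom]:
            bad.append((idx, chrom, start, end, "bounds outside chromosome"))
    return bad


# Stand-in for pyBigWig.open(path).chroms(), which returns {chrom: length}.
# These lengths are made up for illustration, not real dm6 values.
example_lengths = {"chr2L": 1000, "chr2R": 800}

# Stand-in for the train intervals file (assumed columns: chrom, start, end).
example_intervals = io.StringIO("chr2L\t0\t500\nchr2L\t500\t1200\nchrX\t0\t100\n")

bad_rows = find_bad_intervals(example_intervals, example_lengths)
for entry in bad_rows:
    print(entry)
```

With a real run, you would pass an open handle to the train intervals file and the lengths read from the noisy bigwig, then see whether row 1680 shows up in the result.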
That did the trick! I accidentally used a sizes file with slightly different chromosome sizes relative to the assembly I mapped to - good catch! Thank you!
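In case it helps someone else catch this kind of mismatch early, here is a small sketch that compares a chrom sizes file against the chromosome lengths stored in a bigwig. It assumes the sizes file has the usual two whitespace-separated columns (name, length). `parse_chrom_sizes` and `diff_chrom_sizes` are hypothetical helper names, and the numbers below are made up for illustration rather than real dm6 lengths. The bigwig side would normally come from pyBigWig's `chroms()`.

```python
def parse_chrom_sizes(text):
    """Parse chrom.sizes content (name, length per line) into a dict."""
    sizes = {}
    for line in text.splitlines():
        if line.strip():
            chrom, length = line.split()[:2]
            sizes[chrom] = int(length)
    return sizes


def diff_chrom_sizes(sizes_from_file, sizes_from_bigwig):
    """Return {chrom: (file_length, bigwig_length)} wherever the two disagree.

    A length of None means the chromosome is absent on that side.
    """
    mismatches = {}
    for chrom in sorted(set(sizes_from_file) | set(sizes_from_bigwig)):
        a = sizes_from_file.get(chrom)
        b = sizes_from_bigwig.get(chrom)
        if a != b:
            mismatches[chrom] = (a, b)
    return mismatches


# Made-up example content standing in for a chrom.sizes file.
sizes_text = "chr2L\t1000\nchr2R\t800\nchr3L\t900\n"
# Made-up stand-in for pyBigWig.open(path).chroms().
bigwig_chroms = {"chr2L": 1000, "chr2R": 790, "chrX": 500}

mismatches = diff_chrom_sizes(parse_chrom_sizes(sizes_text), bigwig_chroms)
for chrom, (in_file, in_bigwig) in mismatches.items():
    print(chrom, "sizes file:", in_file, "bigwig:", in_bigwig)
```

An empty result means the sizes file and the bigwig agree on every chromosome.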
I was able to successfully train a model, but now I'm having some trouble with denoising, haha...
atacworks denoise \
    --noisybw ClkZT14_2.sort.bam.cutsites.smoothed_100.coverage.bw \
    --genome gfachrome.sizes \
    --weights_path ./856_train_latest/model_best.pth.tar \
    --out_home "./" \
    --exp_name "856_ZT14_2_denoise" \
    --distributed \
    --num_workers 0
Process Process-2:
Traceback (most recent call last):
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/scripts/main.py", line 217, in writer
if not res_queue.empty():
File "
It outputs a half-finished bedgraph file before going down.
I'm running:
CUDA 11.6, PyTorch 1.7.1, Python 3.6.7
Any thoughts? I appreciate your help!
Hi there,
I'm running atacworks on some Drosophila ATAC-seq data we had generated, and it's throwing an error at me!
atacworks train \
    --noisybw ClkZT14_2.sort.bam.cutsites.smoothed_100.coverage.bw \
    --cleanbw ClkZT14_1.sort.bam.cutsites.smoothed_100.coverage.bw \
    --cleanpeakfile ClkZT14_1.sort.bam.peaks.bed \
    --genome dm6.chrom.sizes \
    --val_chrom chr2R \
    --holdout_chrom chr3L \
    --out_home "./" \
    --exp_name "856_train" \
    --distributed
INFO:2022-05-22 17:50:19,291:AtacWorks-peak2bw] Reading input file
INFO:2022-05-22 17:50:19,295:AtacWorks-peak2bw] Read 15265 peaks.
INFO:2022-05-22 17:50:19,297:AtacWorks-peak2bw] Adding score
INFO:2022-05-22 17:50:19,297:AtacWorks-peak2bw] Writing peaks to bedGraph file
Discarding 0 entries outside sizes file.
INFO:2022-05-22 17:50:19,335:AtacWorks-peak2bw] Writing peaks to bigWig file ./856_train_2022.05.22_17.50/bigwig_peakfiles/ClkZT14_1.sort.bam.peaks.bed.bw
INFO:2022-05-22 17:50:19,364:AtacWorks-peak2bw] Done!
INFO:2022-05-22 17:50:19,367:AtacWorks-intervals] Generating training intervals
INFO:2022-05-22 17:50:20,831:AtacWorks-intervals] Generating val intervals
INFO:2022-05-22 17:50:20,840:AtacWorks-bw2h5] Reading intervals
INFO:2022-05-22 17:50:20,841:AtacWorks-bw2h5] Read 1691 intervals
INFO:2022-05-22 17:50:20,841:AtacWorks-bw2h5] Selecting intervals with nonzero coverage
Traceback (most recent call last):
File "/home/albz/miniconda3/envs/atacworks/bin/atacworks", line 8, in
sys.exit(main())
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/scripts/main.py", line 409, in main
prefix, args.pad)
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/atacworks/io/bw2h5.py", line 73, in bw2h5
intervals, noisybw)
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/atacworks/io/bigwigio.py", line 106, in check_bigwig_intervals_nonzero
check_bigwig_nonzero, axis=1, args=(bw,))
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/pandas/core/frame.py", line 6906, in apply
return op.get_result()
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/pandas/core/apply.py", line 186, in get_result
return self.apply_standard()
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/pandas/core/apply.py", line 292, in apply_standard
self.apply_series_generator()
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/pandas/core/apply.py", line 321, in apply_series_generator
results[i] = self.f(v)
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/pandas/core/apply.py", line 112, in f
return func(x, *args, **kwds)
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/atacworks/io/bigwigio.py", line 90, in check_bigwig_nonzero
result = bw.values(interval[0], interval[1], interval[2])
RuntimeError: ('Invalid interval bounds!', 'occurred at index 1680')
Any thoughts on what could be going wrong?