NVIDIA-Genomics-Research / AtacWorks

Deep learning based processing of Atac-seq data
https://clara-parabricks.github.io/AtacWorks/

Error running atacworks train #248

Open albertdyu opened 2 years ago

albertdyu commented 2 years ago

Hi there,

I'm running atacworks on some Drosophila ATAC-seq data we had generated, and it's throwing an error at me!

```
atacworks train \
    --noisybw ClkZT14_2.sort.bam.cutsites.smoothed_100.coverage.bw \
    --cleanbw ClkZT14_1.sort.bam.cutsites.smoothed_100.coverage.bw \
    --cleanpeakfile ClkZT14_1.sort.bam.peaks.bed \
    --genome dm6.chrom.sizes \
    --val_chrom chr2R \
    --holdout_chrom chr3L \
    --out_home "./" \
    --exp_name "856_train" \
    --distributed
```

```
INFO:2022-05-22 17:50:19,291:AtacWorks-peak2bw] Reading input file
INFO:2022-05-22 17:50:19,295:AtacWorks-peak2bw] Read 15265 peaks.
INFO:2022-05-22 17:50:19,297:AtacWorks-peak2bw] Adding score
INFO:2022-05-22 17:50:19,297:AtacWorks-peak2bw] Writing peaks to bedGraph file
Discarding 0 entries outside sizes file.
INFO:2022-05-22 17:50:19,335:AtacWorks-peak2bw] Writing peaks to bigWig file ./856_train_2022.05.22_17.50/bigwig_peakfiles/ClkZT14_1.sort.bam.peaks.bed.bw
INFO:2022-05-22 17:50:19,364:AtacWorks-peak2bw] Done!
INFO:2022-05-22 17:50:19,367:AtacWorks-intervals] Generating training intervals
INFO:2022-05-22 17:50:20,831:AtacWorks-intervals] Generating val intervals
INFO:2022-05-22 17:50:20,840:AtacWorks-bw2h5] Reading intervals
INFO:2022-05-22 17:50:20,841:AtacWorks-bw2h5] Read 1691 intervals
INFO:2022-05-22 17:50:20,841:AtacWorks-bw2h5] Selecting intervals with nonzero coverage
Traceback (most recent call last):
  File "/home/albz/miniconda3/envs/atacworks/bin/atacworks", line 8, in <module>
    sys.exit(main())
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/scripts/main.py", line 409, in main
    prefix, args.pad)
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/atacworks/io/bw2h5.py", line 73, in bw2h5
    intervals, noisybw)
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/atacworks/io/bigwigio.py", line 106, in check_bigwig_intervals_nonzero
    check_bigwig_nonzero, axis=1, args=(bw,))
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/pandas/core/frame.py", line 6906, in apply
    return op.get_result()
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/pandas/core/apply.py", line 186, in get_result
    return self.apply_standard()
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/pandas/core/apply.py", line 292, in apply_standard
    self.apply_series_generator()
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/pandas/core/apply.py", line 321, in apply_series_generator
    results[i] = self.f(v)
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/pandas/core/apply.py", line 112, in f
    return func(x, *args, **kwds)
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/atacworks/io/bigwigio.py", line 90, in check_bigwig_nonzero
    result = bw.values(interval[0], interval[1], interval[2])
RuntimeError: ('Invalid interval bounds!', 'occurred at index 1680')
```

Any thoughts on what could be going wrong?

avantikalal commented 2 years ago

@albertdyu, it looks like there's a problem reading the values for the training interval at index 1680. If you look at the output files generated by this command, you should find an intervals/ folder containing a file with "train" in its name, which should have 1691 rows. Check the row at index 1680 in that file. The likely causes are that this chromosome name is not included in your bigWig file, or that your bigWig file was produced using a genome assembly different from dm6.chrom.sizes.
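
To make that check concrete, here is a minimal sketch using pandas and pyBigWig. The intervals file name below is a placeholder (the real file sits under the experiment's intervals/ folder), and the headerless three-column chrom/start/end layout is an assumption:

```python
# Hypothetical inspection script for the failing interval; the intervals
# file path is a placeholder and its 3-column BED layout is an assumption.
import pandas as pd
import pyBigWig

intervals = pd.read_csv(
    "856_train_2022.05.22_17.50/intervals/856_train.training_intervals.bed",  # placeholder
    sep="\t", header=None, names=["chrom", "start", "end"],
)

row = intervals.iloc[1680]  # the index reported in the traceback
print(row)

# Compare the interval against the noisy bigWig's own chromosome table.
bw = pyBigWig.open("ClkZT14_2.sort.bam.cutsites.smoothed_100.coverage.bw")
chrom_len = bw.chroms().get(row["chrom"])
if chrom_len is None:
    print(f"{row['chrom']} is missing from the bigWig header")
elif row["end"] > chrom_len:
    print(f"interval end {row['end']} exceeds {row['chrom']} length {chrom_len}")
bw.close()
```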

albertdyu commented 2 years ago

That did the trick! I accidentally used a sizes file with slightly different chromosome sizes relative to the assembly I mapped to - good catch! Thank you!
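
For anyone hitting the same mismatch, a quick up-front sanity check is to compare every entry in the sizes file against the chromosome lengths stored in the bigWig header. A hedged sketch with pyBigWig, using the file names from the commands above:

```python
# Cross-check a chrom.sizes file against a bigWig's internal chromosome
# lengths; any mismatch here can surface later as "Invalid interval bounds!".
import pyBigWig

sizes = {}
with open("dm6.chrom.sizes") as f:
    for line in f:
        chrom, length = line.split()[:2]
        sizes[chrom] = int(length)

bw = pyBigWig.open("ClkZT14_2.sort.bam.cutsites.smoothed_100.coverage.bw")
bw_chroms = bw.chroms()
for chrom, length in sizes.items():
    if chrom not in bw_chroms:
        print(f"{chrom}: present in sizes file, absent from bigWig")
    elif bw_chroms[chrom] != length:
        print(f"{chrom}: sizes file says {length}, bigWig says {bw_chroms[chrom]}")
bw.close()
```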

I was able to successfully train a model, but now I'm having some trouble with denoising, haha...

```
atacworks denoise \
    --noisybw ClkZT14_2.sort.bam.cutsites.smoothed_100.coverage.bw \
    --genome gfachrome.sizes \
    --weights_path ./856_train_latest/model_best.pth.tar \
    --out_home "./" \
    --exp_name "856_ZT14_2_denoise" \
    --distributed \
    --num_workers 0
```

```
INFO:2022-05-22 19:19:49,827:AtacWorks-intervals] Generating intervals tiling across all chromosomes in sizes file: gfachrome.sizes
INFO:2022-05-22 19:19:49,841:AtacWorks-intervals] Done!
INFO:2022-05-22 19:19:49,841:AtacWorks-bw2h5] Reading intervals
INFO:2022-05-22 19:19:49,842:AtacWorks-bw2h5] Read 2747 intervals
INFO:2022-05-22 19:19:49,842:AtacWorks-bw2h5] Writing data in 3 batches.
INFO:2022-05-22 19:19:49,842:AtacWorks-bw2h5] Extracting data for each batch and writing to h5 file
INFO:2022-05-22 19:19:49,842:AtacWorks-bw2h5] batch 0 of 3
INFO:2022-05-22 19:19:58,719:AtacWorks-bw2h5] Done! Saved to ./856_ZT14_2_denoise_2022.05.22_19.19/bw2h5/ClkZT14_2.sort.bam.cutsites.smoothed_100.coverage.bw.denoise.h5
INFO:2022-05-22 19:19:58,719:AtacWorks-main] Checking input files for compatibility
Building model: resnet ...
Loading model weights from ./856_train_latest/model_best.pth.tar...
Finished loading.
Finished building.
Inference -------------------- [   0/2747]
Inference -------------------- [  50/2747]
Inference #------------------- [ 100/2747]
Inference #------------------- [ 150/2747]
Inference #------------------- [ 200/2747]
Inference ##------------------ [ 250/2747]
Inference ##------------------ [ 300/2747]
Inference ###----------------- [ 350/2747]
Inference ###----------------- [ 400/2747]
Inference ###----------------- [ 450/2747]
Inference ####---------------- [ 500/2747]
Inference ####---------------- [ 550/2747]
Inference ####---------------- [ 600/2747]
Inference #####--------------- [ 650/2747]
Inference #####--------------- [ 700/2747]
Inference #####--------------- [ 750/2747]
Inference ######-------------- [ 800/2747]
Inference ######-------------- [ 850/2747]
Inference #######------------- [ 900/2747]
Inference #######------------- [ 950/2747]
Inference #######------------- [1000/2747]
Inference ########------------ [1050/2747]
Inference ########------------ [1100/2747]
Traceback (most recent call last):
  File "/home/albz/miniconda3/envs/atacworks/bin/atacworks", line 8, in <module>
    sys.exit(main())
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/scripts/main.py", line 569, in main
    worker(args.gpu_idx, ngpus_per_node, args, res_queue)
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/scripts/worker.py", line 290, in infer_worker
    pad=args.pad)
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/atacworks/dl4atac/infer.py", line 80, in infer
    res_queue.put((idxes, batch_res))
  File "<string>", line 2, in put
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/managers.py", line 772, in _callmethod
    raise convert_to_error(kind, result)
multiprocessing.managers.RemoteError:

Traceback (most recent call last):
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/managers.py", line 228, in serve_client
    request = recv()
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 282, in rebuild_storage_fd
    fd = df.detach()
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/resource_sharer.py", line 58, in detach
    return reduction.recv_handle(conn)
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/reduction.py", line 182, in recv_handle
    return recvfds(s, 1)[0]
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/reduction.py", line 161, in recvfds
    len(ancdata))
RuntimeError: received 0 items of ancdata

Process Process-2:
Traceback (most recent call last):
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/scripts/main.py", line 217, in writer
    if not res_queue.empty():
  File "<string>", line 2, in empty
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/managers.py", line 756, in _callmethod
    conn.send((self._id, methodname, args, kwds))
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/connection.py", line 404, in _send_bytes
    self._send(header + buf)
  File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
```

It writes out a half-finished bedGraph file before crashing.

I'm running:

- CUDA 11.6
- PyTorch 1.7.1
- Python 3.6.7
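
For context: `RuntimeError: received 0 items of ancdata` from `recvfds` usually means the receiving process ran out of file descriptors while tensors were being shared between processes. A commonly suggested mitigation, not confirmed as the fix here, is to raise the open-file limit in the shell before running atacworks (e.g. `ulimit -n 4096`) or to switch PyTorch's tensor-sharing strategy. The sketch below is hypothetical; the AtacWorks CLI doesn't expose these settings, so applying it would mean patching the installed scripts/main.py or a wrapper:

```python
# Untested workaround sketch for fd exhaustion during tensor sharing.
# Neither call is part of AtacWorks; using them would require editing
# the installed scripts/main.py or wrapping its entry point.
import resource
import torch.multiprocessing as mp

# Raise this process's soft open-file limit up to the hard limit.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))

# Share tensors through the filesystem instead of passing descriptors.
mp.set_sharing_strategy("file_system")
```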

Any thoughts? I appreciate your help!