Closed bio-ruxandra-tesloianu closed 3 years ago
hey sorry @bio-ruxandra-tesloianu somehow I missed seeing this-- did this get resolved somehow?
no :(
Could you cat the *logs/filterlogs* files?
We recently made some tweaks to tenx mode. Could you try running with version 0.6.0 (latest on PyPI) and let me know if anything gets better?
Hi @caleblareau,
I got the same error while running with version 0.6.1. Here is the output for cat the *logs/filterlogs*:
Kept 715654
Removed 5450
Kept 670028
Removed 5264
Kept 681359
Removed 5555
Kept 790015
Removed 6727
Kept 662046
Removed 6125
Kept 773335
Removed 6794
Kept 659653
Removed 5663
Kept 697615
Removed 6022
Kept 729339
Removed 6202
Kept 666933
Removed 5954
Kept 646504
Removed 5592
Kept 686152
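For anyone who wants to check these numbers quickly, here is a minimal sketch that totals the Kept/Removed counts from filter-log text like the output above. The function names are hypothetical and not part of mgatk; the only assumption is that each log line has the form "Kept N" or "Removed N", as shown in this thread.

```python
def summarize_filter_log(text):
    """Sum the 'Kept N' and 'Removed N' counts found in filter-log text."""
    totals = {"Kept": 0, "Removed": 0}
    for line in text.splitlines():
        parts = line.split()
        # Only count lines that look exactly like "<Kept|Removed> <integer>".
        if len(parts) == 2 and parts[0] in totals and parts[1].isdigit():
            totals[parts[0]] += int(parts[1])
    return totals


def removed_fraction(totals):
    """Fraction of reads removed by filtering (0.0 if the log was empty)."""
    total = totals["Kept"] + totals["Removed"]
    return totals["Removed"] / total if total else 0.0


# First two Kept/Removed pairs from the log output in this thread.
example = "Kept 715654\nRemoved 5450\nKept 670028\nRemoved 5264\n"
totals = summarize_filter_log(example)
print(totals, removed_fraction(totals))
```

If the removed fraction is small, as it is here, the filtering step itself is probably not what is failing.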
And you're still getting an error associated with
SamtoolsError in line 93 of /home/jovyan/my-conda-envs/myenv/lib/python3.7/site-packages/mgatk/bin/snake/Snakefile.tenx:
'samtools returned with error 1: stdout=, stderr=samtools sort: can't open
?
It looks like the filtering is working as we would expect but the resulting bam file isn't being output... Can you try the same execution with the bcall mode and see if anything changes?
Hmm, so I also am running into this after installing mgatk (0.6.1) on a new system.
Error in rule process_one_slice:
jobid: 0
output: mgatk_out/qc/depth/barcodes.29.depth.txt, mgatk_out/temp/sparse_matrices/barcodes.29.A.txt, mgatk_out/temp/sparse_matrices/barcodes.29.C.txt, mgatk_out/temp/sparse_matrices/barcodes.29.G.txt, mgatk_out/temp/sparse_matrices/barcodes.29.T.txt, mgatk_out/temp/sparse_matrices/barcodes.29.coverage.txt
RuleException:
OSError in line 113 of /home/chang/miniconda3/envs/venv/lib/python3.7/site-packages/mgatk/bin/snake/Snakefile.tenx:
No such file or directory: 'mgatk_out/temp/ready_bam/barcodes.29.qc.bam'
File "/home/chang/miniconda3/envs/venv/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2352, in run_wrapper
File "/home/chang/miniconda3/envs/venv/lib/python3.7/site-packages/mgatk/bin/snake/Snakefile.tenx", line 113, in __rule_process_one_slice
File "/home/chang/miniconda3/envs/venv/lib/python3.7/site-packages/pysam/utils.py", line 61, in __call__
File "pysam/libcutils.pyx", line 293, in pysam.libcutils._pysam_dispatch
File "/home/chang/miniconda3/envs/venv/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 569, in _callback
File "/home/chang/miniconda3/envs/venv/lib/python3.7/concurrent/futures/thread.py", line 57, in run
File "/home/chang/miniconda3/envs/venv/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 555, in cached_or_run
File "/home/chang/miniconda3/envs/venv/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2364, in run_wrapper
Run command
mgatk tenx -i possorted_bam.bam -b filtered_peak_bc_matrix/barcodes.tsv -bt CB -o mgatk_out -c 48 --snake-stdout --keep-temp-files
I'm running bcall right now, but it's definitely taking much longer than usual.
Ok here is the bcall error:
Traceback (most recent call last):
File "/home/chang/miniconda3/envs/venv/lib/python3.7/site-packages/mgatk/bin/python/oneSample.py", line 82, in <module> pysam.index(outputbam)
File "/home/chang/miniconda3/envs/venv/lib/python3.7/site-packages/pysam/utils.py", line 61, in __call__
save_stdout=kwargs.get("save_stdout", None))
File "pysam/libcutils.pyx", line 293, in pysam.libcutils._pysam_dispatch
OSError: No such file or directory: 'mgatk_out/temp/ready_bam/ACGCCACGTTGTGGTT-1.qc.bam'
MissingOutputException in line 21 of /home/chang/miniconda3/envs/venv/lib/python3.7/site-packages/mgatk/bin/snake/Snakefile.Scatter:
Job Missing files after 5 seconds:
mgatk_out/temp/ready_bam/ACGCCACGTTGTGGTT-1.qc.bam
mgatk_out/temp/ready_bam/ACGCCACGTTGTGGTT-1.qc.bam.bai
mgatk_out/qc/depth/ACGCCACGTTGTGGTT-1.depth.txt
mgatk_out/temp/sparse_matrices/ACGCCACGTTGTGGTT-1.A.txt
mgatk_out/temp/sparse_matrices/ACGCCACGTTGTGGTT-1.C.txt
mgatk_out/temp/sparse_matrices/ACGCCACGTTGTGGTT-1.G.txt
mgatk_out/temp/sparse_matrices/ACGCCACGTTGTGGTT-1.T.txt
mgatk_out/temp/sparse_matrices/ACGCCACGTTGTGGTT-1.coverage.txt
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Job id: 0 completed successfully, but some output files are missing. 0
File "/home/chang/miniconda3/envs/venv/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 584, in handle_job_success
File "/home/chang/miniconda3/envs/venv/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 259, in handle_job_success
Exiting because a job execution failed. Look above for error message
Seems like there was a similar issue earlier, but I don't think it's the same, since that one involved the duplicate marking step. I have Java 1.8 installed.
So bcall will take much longer in general than tenx mode. If you want to troubleshoot, just supply a shorter list of barcodes (e.g. head filtered_peak_bc_matrix/barcodes.tsv > short.tsv; then supply -b short.tsv).
Anywho, I'm not sure that I see a consistent error here compared to what it previously was. You verified java, which is great. Can you send ls -lR of the mgatk output folder?
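If head isn't handy, the shortened barcode list suggested above can also be made with a few lines of Python. This is only a sketch: write_short_barcodes is a made-up helper name, and the file names in the usage comment are the ones from the example command.

```python
from pathlib import Path


def write_short_barcodes(src, dst, n=10):
    """Copy the first n non-empty lines of src to dst (roughly `head -n`)."""
    kept = []
    with open(src) as handle:
        for line in handle:
            if line.strip():
                kept.append(line.rstrip("\n"))
            if len(kept) == n:
                break
    # One barcode per line, same layout as the input file.
    Path(dst).write_text("".join(barcode + "\n" for barcode in kept))
    return kept


# Usage (paths as in the command above):
# write_short_barcodes("filtered_peak_bc_matrix/barcodes.tsv", "short.tsv")
```

Then pass -b short.tsv to the same mgatk command.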
Just to verify @cnk113, did the pipeline work on the test data? It looks like the per-sample processing just stopped randomly at some point in the cell processing, which is extremely unusual. I don't think that I've seen this behavior before...
Oops, that was my old one (I also forcefully killed it once it started failing), here's the one with a few bcs (this I let it run until it fizzled out). Log.txt
I am quite late here, but my error was solved by adding --snake-stdout at the end of the command.
I ran it with snake stdout. I've previously solved an earlier issue like that. This is new.
So I've been trying to debug this, and tried this on multiple different Ubuntu versions and python version/env. From what I understand, everything is fine until filterClipBam which ends up not printing the bam file (I can see it run in htop but almost instantaneously fails). This seems to be an os.system() failure or probably one of the parameters being the issue?
Ok I found the issue:
Exception in thread "main" picard.PicardException: This program requires input that are either coordinate or query sorted (according to the header, or at least ASSUME_SORT_ORDER and the content.)
Found ASSUME_SORT_ORDER=null and header sortorder=unsorted
at picard.sam.markduplicates.MarkDuplicates.doWork(MarkDuplicates.java:294)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:295)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)
Pretty much it seems the header showed unsorted even though it was sorted through pysam sort. I just added ASSUME_SORTED=true to your picard command and it works.
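In case it helps anyone else debugging this, here is a rough way to check what the header claims before Picard sees the file. It assumes you paste in the header text yourself (for example from samtools view -H on the BAM); the function names are hypothetical and this is not how mgatk does it. The SO field on the @HD line is the "header sortorder" that the Picard message refers to.

```python
def header_sort_order(header_text):
    """Return the SO value from the @HD line of SAM header text, or None."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # Header fields are tab-separated TAG:VALUE pairs.
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
            return None
    return None


def looks_sorted_for_markduplicates(header_text):
    """True if the header declares coordinate or queryname sorting."""
    return header_sort_order(header_text) in ("coordinate", "queryname")


# Example header with made-up content, mimicking the failing case above.
header = "@HD\tVN:1.6\tSO:unsorted\n@SQ\tSN:chrM\tLN:16569"
print(header_sort_order(header), looks_sorted_for_markduplicates(header))
```

A header that says unsorted even though the reads are sorted matches the situation described above, which is why telling Picard to assume sorted input got past the check.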
However I now get a different error towards the end (almost there!)
Traceback (most recent call last):
File "/media/chang/HDD-6/chang/mg/atac/Adult_DSP_MG_multi_atac_cr/outs/mgatk/mgatk/bin/python/variant_calling.py", line 88, in <module> base_coverage_dict = load_mgatk_output(MGATK_OUT_DIR)
File "/media/chang/HDD-6/chang/mg/atac/Adult_DSP_MG_multi_atac_cr/outs/mgatk/mgatk/bin/python/variant_calling.py", line 30, in load_mgatk_output fwd_base_df[missing_pos] = 0 # fill in missing positions
File "/home/chang/miniconda3/envs/venv3.6/lib/python3.6/site-packages/pandas/core/frame.py", line 2935, in __setitem__ self._setitem_array(key, value)
File "/home/chang/miniconda3/envs/venv3.6/lib/python3.6/site-packages/pandas/core/frame.py", line 2966, in _setitem_array key, axis=1, raise_missing=False
File "/home/chang/miniconda3/envs/venv3.6/lib/python3.6/site-packages/pandas/core/indexing.py", line 1553, in _get_listlike_indexer keyarr, indexer, o._get_axis_number(axis), raise_missing=raise_missing
File "/home/chang/miniconda3/envs/venv3.6/lib/python3.6/site-packages/pandas/core/indexing.py", line 1640, in _validate_read_indexer raise KeyError(f"None of [{key}] are in the [{axis_name}]")
KeyError: "None of [Int64Index([ 1, 3, 4, 6, 8, 9, 10, 11, 12,\n 14,\n ...\n 16551, 16555, 16558, 16560, 16562, 16563, 16565, 16566, 16568,\n 16569],\n dtype='int64', length=11090)] are in the [columns]"
MissingOutputException in line 163 of /media/chang/HDD-6/chang/mg/atac/Adult_DSP_MG_multi_atac_cr/outs/mgatk/mgatk/bin/snake/Snakefile.tenx:
Job Missing files after 5 seconds:
mgatk_out/final/mgatk.variant_stats.tsv.gz
mgatk_out/final/mgatk.cell_heteroplasmic_df.tsv.gz
mgatk_out/final/mgatk.vmr_strand_plot.png
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Job id: 0 completed successfully, but some output files are missing. 0
File "/home/chang/miniconda3/envs/venv3.6/lib/python3.6/site-packages/snakemake/executors/__init__.py", line 589, in handle_job_success
File "/home/chang/miniconda3/envs/venv3.6/lib/python3.6/site-packages/snakemake/executors/__init__.py", line 259, in handle_job_success
Ok so the latest error seems to be sample specific, I'm assuming no variant was confidently detected or there wasn't enough MT coverage.
I just wanted to add that I get the same error with failing to find the bam file.
Implemented this fix in v0.6.3-- hopefully this solves things, but please comment on thread if not!
Hi @caleblareau - thanks for making this great tool. I am encountering the same issue using v0.6.4, where the files in mgatk_out/temp/ seem to be called after they are deleted. I read through this thread but the implemented solution does not seem to carry over for me - please excuse me if I am missing a key detail.
Using:
3.7.12
4.1.2
cellranger-atac-2.0.0
mgatk version 0.6.4
I've included all of the outputs of mgatk below - I am happy to supply any other potentially useful information.
Command:
mgatk tenx --input outs/possorted_bam.bam -bt CB -b outs/filtered_peak_bc_matrix/mini.tsv --mito-genome hg38 --ncores=14
Error message
find ./mgatk_out/ produces the following file structure:
cat base.mgatk.log produces:
cat filterlogs/* produces:
cat mgatk.parameters.txt produces:
cat mgatk.snakemake_tenx.log produces nothing.
Probably unnecessary but just in case, cat mgatk_out/final/chrM_refAllele.txt | head -10 produces:
I think that's everything - I will keep trouble-shooting in the meantime and provide an update if I find a solution. Thanks in advance for your time and any potential help!
Can you try with --snake-stdout?
Ah this seems to clear up the issue.
I saw it mentioned above ("I ran it with snake stdout. I've previously solved an earlier issue like that. This is new.") but thought this would be a separate case. I'm not sure why the snakemake workflow requires that but it's enough for me. I am currently testing the outputs in downstream workflows. Thanks for your help, Caleb!
I have no idea either but glad it's working now :)
Hi,
I am running mgatk tenx as advised in the documentation:
mgatk tenx -i $folder_bam/possorted_bam.bam -n CRA_test1 -o CRA_test1_mgatk -c 12 -bt CB -b $folder_bam/filtered_peak_bc_matrix/barcodes.tsv
I get the following error:
Error in checkGrep(grep(".A.txt", files)) : Improper folder specification; file missing / extra file present. See documentation
Calls: importMito -> checkGrep
Execution halted
(myenv)jovyan@jupyter-bio-2druxandra-2dtesloianu:~$ [E::hts_open_format] Failed to open file "CRA_test1_mgatk/temp/temp_bam/barcodes.8.temp0.bam" : No such file or directory
[Sat Nov 7 14:38:19 2020]
Error in rule process_one_slice:
jobid: 0
output: CRA_test1_mgatk/qc/depth/barcodes.8.depth.txt, CRA_test1_mgatk/temp/sparse_matrices/barcodes.8.A.txt, CRA_test1_mgatk/temp/sparse_matrices/barcodes.8.C.txt, CRA_test1_mgatk/temp/sparse_matrices/barcodes.8.G.txt, CRA_test1_mgatk/temp/sparse_matrices/barcodes.8.T.txt, CRA_test1_mgatk/temp/sparse_matrices/barcodes.8.coverage.txt
RuleException:
SamtoolsError in line 93 of /home/jovyan/my-conda-envs/myenv/lib/python3.7/site-packages/mgatk/bin/snake/Snakefile.tenx:
'samtools returned with error 1: stdout=, stderr=samtools sort: can\'t open "CRA_test1_mgatk/temp/temp_bam/barcodes.8.temp0.bam": No such file or directory\n'
File "/home/jovyan/my-conda-envs/myenv/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2252, in run_wrapper
File "/home/jovyan/my-conda-envs/myenv/lib/python3.7/site-packages/mgatk/bin/snake/Snakefile.tenx", line 93, in __rule_process_one_slice
File "/home/jovyan/my-conda-envs/myenv/lib/python3.7/site-packages/pysam/utils.py", line 75, in __call__
File "/home/jovyan/my-conda-envs/myenv/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 560, in _callback
File "/home/jovyan/my-conda-envs/myenv/lib/python3.7/concurrent/futures/thread.py", line 57, in run
File "/home/jovyan/my-conda-envs/myenv/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 546, in cached_or_run
File "/home/jovyan/my-conda-envs/myenv/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2264, in run_wrapper
Exiting because a job execution failed. Look above for error message
[E::hts_open_format] Failed to open file "CRA_test1_mgatk/temp/temp_bam/barcodes.4.temp0.bam" : No such file or directory
When I do ls -lR on the CRA_test1_mgatk directory, it seems like it completely lacks the temp directory
(myenv)jovyan@jupyter-bio-2druxandra-2dtesloianu:~/CRA_test1_mgatk$ ls -lR
.:
total 12
drwxr-sr-x 2 jovyan users 4096 Nov 7 13:05 final
drwxr-sr-x 4 jovyan users 4096 Nov 7 13:11 logs
drwxr-sr-x 3 jovyan users 4096 Nov 7 14:38 qc

./final:
total 120
-rw-r--r-- 1 jovyan users 121446 Nov 7 14:31 chrM_refAllele.txt

./logs:
total 16
-rw-r--r-- 1 jovyan users 1392 Nov 7 14:37 base.mgatk.log
-rw-r--r-- 1 jovyan users 602 Nov 7 14:37 CRA_test1.parameters.txt
-rw-r--r-- 1 jovyan users 0 Nov 7 14:37 CRA_test1.snakemake_tenx.log
drwxr-sr-x 2 jovyan users 4096 Nov 7 14:26 filterlogs
drwxr-sr-x 2 jovyan users 4096 Nov 7 13:11 rmdupslogs

./logs/filterlogs:
total 96
-rw-r--r-- 1 jovyan users 26 Nov 7 14:38 barcodes.10.filter.log
-rw-r--r-- 1 jovyan users 26 Nov 7 14:38 barcodes.11.filter.log
-rw-r--r-- 1 jovyan users 26 Nov 7 14:38 barcodes.12.filter.log
-rw-r--r-- 1 jovyan users 24 Nov 7 14:25 barcodes.13.filter.log
-rw-r--r-- 1 jovyan users 24 Nov 7 14:26 barcodes.14.filter.log
-rw-r--r-- 1 jovyan users 24 Nov 7 14:26 barcodes.15.filter.log
-rw-r--r-- 1 jovyan users 24 Nov 7 14:26 barcodes.16.filter.log
-rw-r--r-- 1 jovyan users 24 Nov 7 14:26 barcodes.17.filter.log
-rw-r--r-- 1 jovyan users 24 Nov 7 14:25 barcodes.18.filter.log
-rw-r--r-- 1 jovyan users 24 Nov 7 14:25 barcodes.19.filter.log
-rw-r--r-- 1 jovyan users 26 Nov 7 14:38 barcodes.1.filter.log
-rw-r--r-- 1 jovyan users 24 Nov 7 14:25 barcodes.20.filter.log
-rw-r--r-- 1 jovyan users 24 Nov 7 14:26 barcodes.21.filter.log
-rw-r--r-- 1 jovyan users 24 Nov 7 14:26 barcodes.22.filter.log
-rw-r--r-- 1 jovyan users 24 Nov 7 14:26 barcodes.23.filter.log
-rw-r--r-- 1 jovyan users 24 Nov 7 14:26 barcodes.24.filter.log
-rw-r--r-- 1 jovyan users 26 Nov 7 14:38 barcodes.2.filter.log
-rw-r--r-- 1 jovyan users 26 Nov 7 14:38 barcodes.3.filter.log
-rw-r--r-- 1 jovyan users 26 Nov 7 14:38 barcodes.4.filter.log
-rw-r--r-- 1 jovyan users 26 Nov 7 14:38 barcodes.5.filter.log
-rw-r--r-- 1 jovyan users 26 Nov 7 14:38 barcodes.6.filter.log
-rw-r--r-- 1 jovyan users 26 Nov 7 14:38 barcodes.7.filter.log
-rw-r--r-- 1 jovyan users 26 Nov 7 14:38 barcodes.8.filter.log
-rw-r--r-- 1 jovyan users 26 Nov 7 14:38 barcodes.9.filter.log

./logs/rmdupslogs:
total 0

./qc:
total 4
drwxr-sr-x 2 jovyan users 4096 Nov 7 13:11 quality

./qc/quality:
total 0
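As a quicker check than reading the full listing, a small sketch like the one below reports which subdirectories are absent from an output folder. The directory names are taken only from the paths in the error message and listing above; whether mgatk always creates exactly these is an assumption, and missing_subdirs is a hypothetical helper, not an mgatk function.

```python
from pathlib import Path

# Subdirectories that appear in the paths quoted in this post (assumed layout).
EXPECTED_SUBDIRS = (
    "final",
    "logs/filterlogs",
    "qc/depth",
    "temp/temp_bam",
    "temp/sparse_matrices",
)


def missing_subdirs(out_dir, expected=EXPECTED_SUBDIRS):
    """Return the expected subdirectories that do not exist under out_dir."""
    root = Path(out_dir)
    return [rel for rel in expected if not (root / rel).is_dir()]


# Usage, with the output folder name from the command above:
# print(missing_subdirs("CRA_test1_mgatk"))
```

For the listing above this would flag qc/depth and both temp subdirectories, which lines up with the "Failed to open file" errors.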