kwglam closed 3 years ago
Hi,
Sounds right! *.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam is the final file you'd mostly use (a filtered, STAR-mapped, coordinate-sorted BAM file with BC and corrected UB tags and featureCounts gene assignments (exon or exon+intron)).
You should check that the file is of reasonable size (e.g. not just a few kB), and you can also check that it is intact with samtools quickcheck.
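For anyone who wants to script these two checks, here is a minimal Python sketch using only the standard library. It is not what samtools quickcheck does internally, just a rough approximation: it checks the file size, the gzip magic bytes, the BAM magic at the start of the decompressed stream, and the 28-byte BGZF end-of-file block. The function name check_bam and the min_bytes threshold are made up for illustration.

```python
import gzip
import os
import zlib

# The fixed 28-byte empty BGZF block that terminates an intact BAM file.
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00 "
    "1b 00 03 00 00 00 00 00 00 00 00 00"
)


def check_bam(path, min_bytes=1024):
    """Return a list of problems found; an empty list means no obvious damage.

    Hypothetical helper: min_bytes is an arbitrary "suspiciously small" cutoff.
    """
    problems = []
    size = os.path.getsize(path)
    if size < min_bytes:
        problems.append(f"file is only {size} bytes")

    with open(path, "rb") as fh:
        # Every BGZF/BAM file starts with the gzip magic bytes.
        if fh.read(2) != b"\x1f\x8b":
            problems.append("not gzip/BGZF compressed")
            return problems
        # A truncated file usually lacks the trailing EOF block.
        if size < len(BGZF_EOF):
            tail = b""
        else:
            fh.seek(-len(BGZF_EOF), os.SEEK_END)
            tail = fh.read()
        if tail != BGZF_EOF:
            problems.append("missing BGZF EOF block (file may be truncated)")

    # The decompressed stream of a BAM file starts with b"BAM\x01".
    try:
        with gzip.open(path, "rb") as fh:
            magic = fh.read(4)
    except (OSError, EOFError, zlib.error):
        magic = b""
    if magic != b"BAM\x01":
        problems.append("decompressed data does not start with the BAM magic")
    return problems
```

Calling check_bam on the path of the *.UBcorrected.sorted.bam file and printing the returned list is enough for a quick look; an empty list only means these basic checks passed, not that every record is valid.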
Best, Christoph
Hi Christoph,
Thanks very much for the quick response. I have checked the BAM file with samtools quickcheck and it reported nothing. The size of the BAM file is 29G, so I believe the file itself is fine. I have also used this BAM file to run stitcher.py, and it did give me a SAM file with all stitched RNA molecules. So I guess it is only ss3iso.py that gives me errors.
Error messages from ss3iso.py:
[bam_sort_core] merging from 136 files and 8 in-memory blocks...
[main_samview] fail to read the header from "/home/xxx/projects/Smart-seq3/ss3iso_210806/hsa/zUMIs-2.9.4_210806/preprocess/210624_Smartseq3.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam".
[main_samview] fail to read the header from "/home/xxx/projects/Smart-seq3/ss3iso_210806/hsa/zUMIs-2.9.4_210806/preprocess/210624_Smartseq3.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam".
[main_samview] fail to read the header from "-".
samtools index: "/home/xxx/projects/Smart-seq3/ss3iso_210806/hsa/zUMIs-2.9.4_210806/preprocess/210624_Smartseq3.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam" is in a format that cannot be usefully indexed
samtools index: "/home/xxx/projects/Smart-seq3/ss3iso_210806/hsa/zUMIs-2.9.4_210806/preprocess/210624_Smartseq3.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam" is in a format that cannot be usefully indexed
Preprocessing on input BAM ...
Collect informative reads per gene...
samtools index: "/home/xxx/projects/Smart-seq3/ss3iso_210806/hsa/zUMIs-2.9.4_210806/expression_ensembl/ex_210624_Smartseq3.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam" is in a format that cannot be usefully indexed
samtools index: "/home/xxx/projects/Smart-seq3/ss3iso_210806/hsa/zUMIs-2.9.4_210806/expression_ensembl/ex_210624_Smartseq3.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam" is in a format that cannot be usefully indexed
...for genes on chr1
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/xxx/anaconda3/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/xxx/anaconda3/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/home/xxx/projects/Smart-seq3/ss3iso/Smart-seq3/ss3iso/pyModule/informative_reads.py", line 479, in _get_reads
    report_gene = gobj.get_aligned_reads(n_read_limit, passed_cells)
  File "/home/xxx/projects/Smart-seq3/ss3iso/Smart-seq3/ss3iso/pyModule/informative_reads.py", line 84, in get_aligned_reads
    samfile = pysam.AlignmentFile(self.in_bam_uniq, "rc")
  File "pysam/libcalignmentfile.pyx", line 742, in pysam.libcalignmentfile.AlignmentFile.__cinit__
  File "pysam/libcalignmentfile.pyx", line 947, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file does not contain alignment data
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/xxx/projects/Smart-seq3/ss3iso/Smart-seq3/ss3iso/ss3_isoform.py", line 109, in
I know you are not the author of the ss3iso.py script, but any insights or suggestions are appreciated! Thanks!
Cheers, Gabriel
Agreed, it must be an issue in the ss3iso.py script. I'm personally not familiar with it, but from a quick glance at the code, the point where your error log breaks is in the few lines of "preprocessing" that it attempts to do: https://github.com/sandberg-lab/Smart-seq3/blob/master/ss3iso/ss3_isoform.py#L73
Since your input BAM file is already coordinate sorted, that step seems superfluous. Maybe you want to move this discussion to the ss3iso GitHub, but reading the code, it seems you should be getting a "preprocess" folder with three files: UBfix.coordinateSorted.bam, UBfix.coordinateSorted_unique.bam and UBfix.coordinateSorted_multi.bam.
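A quick way to see how far the preprocessing got is to check which of those three files exist and whether they are empty. The following is a minimal Python sketch; the helper name preprocess_status is hypothetical, and it assumes the files carry exactly the names listed above (the real script may prepend a sample prefix, in which case the names would need adjusting).

```python
import os

# File names as read from the ss3iso code; assumed to be exact here.
EXPECTED = (
    "UBfix.coordinateSorted.bam",
    "UBfix.coordinateSorted_unique.bam",
    "UBfix.coordinateSorted_multi.bam",
)


def preprocess_status(preprocess_dir):
    """Map each expected file name to 'missing', 'empty' or 'ok'."""
    status = {}
    for name in EXPECTED:
        path = os.path.join(preprocess_dir, name)
        if not os.path.isfile(path):
            status[name] = "missing"
        elif os.path.getsize(path) == 0:
            status[name] = "empty"
        else:
            status[name] = "ok"
    return status
```

If all three come back as "missing" or "empty", the failure happened during preprocessing itself rather than in the later per-gene steps.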
Best, Christoph
I have actually started a thread in the ss3iso GitHub. While I am waiting for the author's response, I am also looking for solutions from other experts.
There is only one BAM file in the 'preprocess' folder: it has the same name as my input BAM but is empty. I guess the program just terminated without generating any of the 3 BAM files after failing to read my input BAM.
Thanks, Gabriel
I understand! Feel free to reopen if you need any further assistance with zUMIs. Best, Christoph
Hi kwglam, I ran into the same error messages as you. I saw that you solved the problem and closed the issue in the ss3iso GitHub. Could you tell me the solution to this issue? I am looking forward to your reply. Lam
Hi,
I would like to know what the expected BAM outputs are after a successful run of zUMIs. I got 4 BAM files ending with ".filtered.tagged.Aligned.out.bam", ".filtered.Aligned.GeneTagged.UBcorrected.sorted.bam", ".filtered.tagged.Aligned.toTranscriptome.out.bam", and ".filtered.tagged.unmapped.bam" after running zUMIs. I believe it was a successful run, and I got all the other plots in the zUMIs_output folder (please also see the attached yaml and nohup).
The reason I ask is that ss3iso.py complained that it failed to read the header of my ".filtered.Aligned.GeneTagged.UBcorrected.sorted.bam" file when I ran the downstream analysis of my Smart-seq3 data. I am wondering whether this is the correct BAM file to use as input, or whether there is anything wrong with my BAM file.
Thanks in advance for any suggestions and advice!!
210806_Smart-seq3_zUMIs.txt nohup.txt