Open kwglam opened 3 years ago
Hi, it looks like something is wrong in BAM files.
Hi Angela, thanks for the comment. After running zUMIs, 4 BAM files are generated. The BAM file ending with ".......filtered.Aligned.GeneTagged.UBcorrected.sorted.bam" is the only one that comes with a .bai file. Is it the correct BAM file for running ss3_isoform.py? I have successfully used this file to run stitcher.py, generating a SAM file with stitched RNA molecules. Do you know if there is any way to check what problem the BAM file has? Thanks!
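One way to check what is wrong with a BAM file is to look at its header, since the pysam errors reported later in this thread ("file has no sequences defined", "file does not contain alignment data") both point at a missing or unreadable header. The following is a minimal sketch, not part of ss3iso or zUMIs: the function name inspect_bam_header is hypothetical, and it relies only on the BAM layout from the SAM/BAM specification (BGZF compression, the magic bytes BAM\1, the header text, then the reference count).

```python
import gzip
import struct
import zlib


def inspect_bam_header(path):
    """Return (is_bam, n_references, header_text) for the file at `path`.

    Hypothetical helper. BAM files are BGZF-compressed, and BGZF is a
    sequence of gzip members, so the standard gzip module can read the
    start of the file without samtools or pysam.
    """
    try:
        with gzip.open(path, "rb") as handle:
            magic = handle.read(4)
            if magic != b"BAM\x01":
                # Decompresses fine (or is empty) but is not BAM.
                return False, 0, ""
            (l_text,) = struct.unpack("<i", handle.read(4))
            text = handle.read(l_text).decode("utf-8", errors="replace")
            (n_ref,) = struct.unpack("<i", handle.read(4))
    except (OSError, EOFError, struct.error, zlib.error):
        # Not gzip/BGZF at all, or truncated before the header ended.
        return False, 0, ""
    return True, n_ref, text.rstrip("\x00")
```

Calling inspect_bam_header on the preprocessed BAM and getting is_bam == False, or n_references == 0, would suggest the file was never written correctly, rather than a problem in the later isoform steps.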
@kwglam Hi, is this issue solved?
Yes, this issue has been solved. Thanks!
Hi @PingChen-Angela! Thanks for maintaining such a useful package!
Just following on from this thread regarding the inputs of ss3_isoform.py. I have run zUMIs and would now like to run the isoform matching.
Would filtered.tagged.Aligned.out.bam from running zUMIs be the correct input for -i [path/to/inputBAM]? I noticed in the above thread that the following BAM may be required: filtered.Aligned.GeneTagged.UBcorrected.sorted.bam, which I think is the BAM output from running zUMIs with the velocyto run.
Thanks in advance for clarifying.
Best regards, Hani
Hi Hani,
The *.filtered.tagged.Aligned.out.bam lacks gene assignment and UMI error correction, which are both needed for isoform inference.
The velocyto output from zUMIs has nothing to do with this and is labelled *.tagged.forVelocyto.bam.
Hence, please use the *.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam file.
Best, Christoph
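To make the distinction above easier to apply to a folder of zUMIs output, here is a small sketch that labels a BAM by its file name suffix. The three suffixes and their descriptions are taken from the reply above; the function name describe_zumis_bam and the returned strings are hypothetical and not part of zUMIs or ss3iso.

```python
# Suffix -> role, as described in the reply above.
ZUMIS_BAM_ROLES = [
    (".filtered.Aligned.GeneTagged.UBcorrected.sorted.bam",
     "input for ss3_isoform.py (gene-tagged, UMI-corrected)"),
    (".tagged.forVelocyto.bam",
     "velocyto output, unrelated to isoform inference"),
    (".filtered.tagged.Aligned.out.bam",
     "lacks gene assignment and UMI error correction"),
]


def describe_zumis_bam(filename):
    """Return a short description of a zUMIs BAM based on its suffix."""
    for suffix, role in ZUMIS_BAM_ROLES:
        if filename.endswith(suffix):
            return role
    return "unknown"


if __name__ == "__main__":
    # Example file name taken from a command posted in this thread.
    name = "smartseq3_mouse_fibroblast.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam"
    print(name, "->", describe_zumis_bam(name))
```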
I see, thanks Christoph for the clarification!
Hi cziegenhain, when I use .filtered.Aligned.GeneTagged.UBcorrected.sorted.bam for ss3_isoform.py, I get an error:
Preprocessing on input BAM ...
[bam_sort_core] merging from 88 files and 8 in-memory blocks...
Collect informative reads per gene...
...for genes on chr1
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/data/vip55/miniconda3/envs/zUMIs-env/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/home/data/vip55/miniconda3/envs/zUMIs-env/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/home/data/vip55/software/Smart-seq3-master/ss3iso/pyModule/informative_reads.py", line 479, in _get_reads
report_gene = gobj.get_aligned_reads(n_read_limit, passed_cells)
File "/home/data/vip55/software/Smart-seq3-master/ss3iso/pyModule/informative_reads.py", line 84, in get_aligned_reads
samfile = pysam.AlignmentFile(self.in_bam_uniq, "rc")
File "pysam/libcalignmentfile.pyx", line 741, in pysam.libcalignmentfile.AlignmentFile.__cinit__
File "pysam/libcalignmentfile.pyx", line 990, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file has no sequences defined (mode='rc') - is it SAM/BAM format? Consider opening with check_sq=False
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/data/vip55/software/Smart-seq3-master/ss3iso/ss3_isoform.py", line 109, in <module>
My command is:
python /home/data/vip55/software/Smart-seq3-master/ss3iso/ss3_isoform.py -i smartseq3_mouse_fibroblast.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam -e smartseq3_mouse_fibroblast -o ss3 -p 8 -s mm10 -P -R -c ss3_isoform.conf
So, what's wrong?
Best, Anna
Yes, this issue has been solved. Thanks!
Hi Kwglam, How did you solve it?
Hi xucaoling,
I also have the same issue. Did you solve it?
Best, Shin
Hi shinichiro03, have you solved the issue?
@lamyankin, @xucaoling, and @Shinichiro03, I forgot what exactly the problems were because I have not used it for quite a long time. My recollection is that you have to stick with an old version of bedtools (v2.26 or older) and that you have to change umi_file_prefix = 'UBfix.sort.bam' to umi_file_prefix = 'UBcorrected.sorted.bam' on line 67 of the ss3_isoform.py script. Hope it works....
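For anyone wanting to verify the bedtools part of this suggestion, the sketch below compares a version string against the v2.26 limit recalled above. It assumes that bedtools --version prints something like "bedtools v2.26.0"; the function names are hypothetical, and the v2.26 cutoff is only the recollection from the comment above, not a documented requirement.

```python
import re
import shutil
import subprocess


def parse_bedtools_version(version_output):
    """Extract (major, minor, patch) from text such as 'bedtools v2.26.0'."""
    match = re.search(r"v(\d+)\.(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        return None
    return tuple(int(part) if part else 0 for part in match.groups())


def bedtools_is_old_enough(version_output, newest_allowed=(2, 26)):
    """True if the reported version is v2.26.x or older (assumed cutoff)."""
    version = parse_bedtools_version(version_output)
    return version is not None and version[:2] <= newest_allowed


def installed_bedtools_version():
    """Return the output of `bedtools --version`, or None if not on PATH."""
    if shutil.which("bedtools") is None:
        return None
    result = subprocess.run(
        ["bedtools", "--version"], capture_output=True, text=True
    )
    return result.stdout.strip()
```

For example, bedtools_is_old_enough(installed_bedtools_version() or "") returns False both when bedtools is missing and when a newer release is installed.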
After fixing the umi_file_prefix = 'UBcorrected.sorted.bam' problem, I get the following error! Does this look familiar? I couldn't quite figure out what the problem is!
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/opt/conda/lib/python3.9/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/opt/conda/lib/python3.9/multiprocessing/pool.py", line 48, in mapstar
return list(map(*args))
File "/project/ss3iso/pyModule/informative_reads.py", line 468, in _get_reads
gobj.get_exon_coordinates(gene)
File "/project/ss3iso/pyModule/informative_reads.py", line 64, in get_exon_coordinates
gene_id = fds[-1].split(';')[0].split('=')[1]
IndexError: list index out of range
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/project/ss3iso/ss3_isoform.py", line 112, in <module>
main()
File "/project/ss3iso/ss3_isoform.py", line 102, in main
fetch_gene_reads(in_bam_uniq, in_bam_multi, conf_data, op.species, out_path)
File "/project/ss3iso/pyModule/informative_reads.py", line 550, in fetch_gene_reads
report_genes = pool.map(func, genes, chunksize=1)
File "/opt/conda/lib/python3.9/multiprocessing/pool.py", line 364, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/opt/conda/lib/python3.9/multiprocessing/pool.py", line 771, in get
raise self._value
IndexError: list index out of range
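The failing expression in this traceback, fds[-1].split(';')[0].split('=')[1], only works when the first attribute in the last field is written as key=value. A possible (unconfirmed) cause is therefore an annotation line whose attributes use a different style, for example GTF-style key "value", or a line with an empty last field. The sketch below reproduces the expression on two made-up lines; the tab-splitting into fds, the helper name, and the gene IDs are assumptions for illustration only.

```python
def gene_id_from_line(line):
    """Apply the expression from the traceback to one annotation line.

    Assumption: `fds` holds the tab-separated fields of the line.
    """
    fds = line.rstrip("\n").split("\t")
    return fds[-1].split(';')[0].split('=')[1]


# Made-up example lines; only the last (attributes) column matters here.
key_equals_value = "chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id=GENE0001;name=demo"
gtf_style = 'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "GENE0001"; name "demo";'

if __name__ == "__main__":
    print(gene_id_from_line(key_equals_value))
    try:
        gene_id_from_line(gtf_style)
    except IndexError as err:
        # No '=' in the first attribute, so split('=') yields one element.
        print("IndexError:", err)
```

If this is what is happening, checking the attribute column of the annotation file used for the run against the format the script expects would be a reasonable next step.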
Hi Angela,
I tried to do the isoform reconstruction by running your ss3_isoform.py script. However, the program halted with the following error messages. Would you please kindly advise what the potential problem is? Thanks!!
Preprocessing on input BAM ...
[bam_sort_core] merging from 104 files and 8 in-memory blocks...
[main_samview] fail to read the header from "/home/xxx/projects/Smart-seq3/ss3iso_210629/hsa/ss3iso_210629/preprocess/210624_Smartseq3.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam".
[main_samview] fail to read the header from "/home/xxx/projects/Smart-seq3/ss3iso_210629/hsa/ss3iso_210629/preprocess/210624_Smartseq3.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam".
[main_samview] fail to read the header from "-".
samtools index: "/home/xxx/projects/Smart-seq3/ss3iso_210629/hsa/ss3iso_210629/preprocess/210624_Smartseq3.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam" is in a format that cannot be usefully indexed
samtools index: "/home/xxx/projects/Smart-seq3/ss3iso_210629/hsa/ss3iso_210629/preprocess/210624_Smartseq3.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam" is in a format that cannot be usefully indexed
Collect informative reads per gene...
samtools index: "/home/xxx/projects/Smart-seq3/ss3iso_210629/hsa/ss3iso_210629/expression_ensembl/ex_210624_Smartseq3.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam" is in a format that cannot be usefully indexed
samtools index: "/home/xxx/projects/Smart-seq3/ss3iso_210629/hsa/ss3iso_210629/expression_ensembl/ex_210624_Smartseq3.filtered.Aligned.GeneTagged.UBcorrected.sorted.bam" is in a format that cannot be usefully indexed
...for genes on 1
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/xxx/anaconda3/lib/python3.8/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/home/xxx/anaconda3/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
return list(map(*args))
File "/home/xxx/projects/Smart-seq3/ss3iso/Smart-seq3/ss3iso/pyModule/informative_reads.py", line 479, in _get_reads
report_gene = gobj.get_aligned_reads(n_read_limit, passed_cells)
File "/home/xxx/projects/Smart-seq3/ss3iso/Smart-seq3/ss3iso/pyModule/informative_reads.py", line 84, in get_aligned_reads
samfile = pysam.AlignmentFile(self.in_bam_uniq, "rc")
File "pysam/libcalignmentfile.pyx", line 742, in pysam.libcalignmentfile.AlignmentFile.__cinit__
File "pysam/libcalignmentfile.pyx", line 947, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file does not contain alignment data
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/xxx/projects/Smart-seq3/ss3iso/Smart-seq3/ss3iso/ss3_isoform.py", line 109, in <module>
main()
File "/home/xxx/projects/Smart-seq3/ss3iso/Smart-seq3/ss3iso/ss3_isoform.py", line 99, in main
fetch_gene_reads(in_bam_uniq, in_bam_multi, conf_data, op.species, out_path)
File "/home/xxx/projects/Smart-seq3/ss3iso/Smart-seq3/ss3iso/pyModule/informative_reads.py", line 550, in fetch_gene_reads
report_genes = pool.map(func, genes, chunksize=1)
File "/home/xxx/anaconda3/lib/python3.8/multiprocessing/pool.py", line 364, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/home/xxx/anaconda3/lib/python3.8/multiprocessing/pool.py", line 771, in get
raise self._value
ValueError: file does not contain alignment data