Closed amdqiao1 closed 1 year ago
Dear user, Thanks for bringing up this issue. It is the first time in my life that I have seen such an error. Could you please check a couple of things?
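- Do you find the coordinates individually with samtools view bamfile | grep 248849282 and samtools view bamfile | grep 138479?
- When working with pysam, we usually work with 0-based coordinates. So could you look for the coordinates 248849283 or 138480?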
Thanks, Fran
- Do you find the coordinates individually with samtools view bamfile | grep 248849282 and samtools view bamfile | grep 138479?
Yes! I found 248849282, but not 138479. 138479 only appeared as a substring of some greater coordinate.
VH00411:140:AAAYLMTHV:1:2406:24556:16903 1024 chr1 248849282 255 90M * 0 0 CCAGGGCAGATCAAGGGGCCTCTCAGAACCATGTTCCCCAGCCAGGTGAGGACCATTTTCACTGGGACCCAGGCCAAAACCATGTGGGTG CCCCCCCC;CCCCCCCCCCCCCCCCCCC-CCCCC-CCCCCCCCCCCCCCCCCC-CCCCCCCCCCCCCCCCCCCCCCC;CCCCCCCCCCCC NH:i:1 HI:i:1 AS:i:88 nM:i:0 RG:Z:results:0:1:AAAYLMTHV:1 TX:Z:ENST00000306562,+2809,90M GX:Z:ENSG00000171161 GN:Z:ZNF672 fx:Z:ENSG00000171161 RE:A:E xf:i:17 CR:Z:TTGCAAGGTCATGCCC CY:Z:CCCCCCCCCC-CCCCC CB:Z:TTGCAAGGTCATGCCC-1 UR:Z:ATGCATTCTATA UY:Z:CCCCCCCCCCCC UB:Z:ATGCATTCTATA
VH00411:140:AAAYLMTHV:2:1507:10562:34983 1024 chr1 248849282 255 90M * 0 0 CCAGGGCAGATCAAGGGGCCTCTCAGAACCATGTTCCCCAGCCAGGTGAGGACCATTTTCACTGGGACCCAGGCCAAAACCATGTGGGTG CCCCCCCCCCCCCCCCCCCCCCCCCC;CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC NH:i:1 HI:i:1 AS:i:88 nM:i:0 RG:Z:results:0:1:AAAYLMTHV:2 TX:Z:ENST00000306562,+2809,90M GX:Z:ENSG00000171161 GN:Z:ZNF672 fx:Z:ENSG00000171161 RE:A:E xf:i:17 CR:Z:GTAAAGCCAATTTGGT CY:Z:CCCCCCCCCCCCCCCC CB:Z:GTAAAGCCAATTTGGT-1 UR:Z:TGCAAATAAACA UY:Z:CCCCCCCCC;CC UB:Z:TGCAAATAAACA
- When working with pysam, we usually work with 0-based coordinates. So could you look for the coordinates 248849283 or 138480?
Yes for 248849283, but not 138480. Similarly, 138480 only appeared as a substring.
VH00411:140:AAAYLMTHV:1:1603:33058:30666 1024 chr1 248849283 255 90M * 0 0 CAGGGCAGATCAAGGGGCCTCTCAGAACCATGTTCCCCAGCCAGGTGATGACCATTTTCACTGGGACCCAGGCCAAAACCATGTGGGTGC CCCCCCCCCCCCCCCCCCCCCCCCC;CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC-CCCCCCCCCCCCCCCCCCCC NH:i:1 HI:i:1 AS:i:86 nM:i:1 RG:Z:results:0:1:AAAYLMTHV:1 TX:Z:ENST00000306562,+2810,90M GX:Z:ENSG00000171161 GN:Z:ZNF672 fx:Z:ENSG00000171161 RE:A:E xf:i:17 CR:Z:ATGTAACGTCGTTATC CY:Z:;CCCC;CCCCCCC;CC CB:Z:ATGTAACGTCGTTATC-1 UR:Z:ACCACTTTTTTG UY:Z:CCCCCCCCCCCC UB:Z:ACCACTTTTTTG
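Because a plain grep also matches the number when it is only a substring of a larger coordinate or of another field, an exact comparison on the POS column may be easier to read. Below is a minimal Python sketch of that check, assuming the coordinate in hand is 0-based (as pysam reports it) and that the input is SAM text such as the output of samtools view. The function name sam_records_at and the demo records are made up for illustration.

```python
def sam_records_at(sam_lines, chrom, pos0):
    """Return SAM records on `chrom` whose POS equals the 0-based `pos0`."""
    pos1 = pos0 + 1  # SAM text is 1-based, pysam positions are 0-based
    hits = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # not a complete alignment record
        # Column 3 is RNAME and column 4 is POS in the SAM format.
        if fields[2] == chrom and int(fields[3]) == pos1:
            hits.append(line)
    return hits


# Made-up records: r2 contains "138479" only as a substring of its POS.
demo = [
    "@HD\tVN:1.6",
    "r1\t1024\tchr1\t248849283\t255\t4M\t*\t0\t0\tACGT\tCCCC",
    "r2\t0\tchr2\t11384790\t255\t4M\t*\t0\t0\tACGT\tCCCC",
]
print(len(sam_records_at(demo, "chr1", 248849282)))  # 1
print(len(sam_records_at(demo, "chr2", 138478)))     # 0
```

To run it on real data, the lines could be read from sys.stdin with samtools view piped into the script.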
Could you please provide some more info about the next points?
- Could you send the command you used to run SitesPerCell.py? As well as the complete error that you get.
The command is as suggested in the tutorial:
python $SCOMATIC/scripts/SitesPerCell/SitesPerCell.py --bam $bam \
--infile $infile/${sample}.calling.step1.tsv \
--ref $ref/genome.fa \
--out_folder $outfile --tmp_dir $temp --nprocs 1
The complete error:
Traceback (most recent call last):
  File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 369, in <module>
    main()
  File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 358, in main
    collect_result(run_interval(row, DICT_sites[row], BAM, FASTA, MIN_COV, MIN_CC, tmp_dir, MIN_BQ, MIN_MQ))
  File "/PHShome/yq038/SComatic-main/scripts/SitesPerCell/SitesPerCell.py", line 131, in run_interval
    i = bam.pileup(CHROM, START, END, min_base_quality = BQ, min_mapping_quality = MQ, ignore_overlaps = False)
  File "pysam/libcalignmentfile.pyx", line 1326, in pysam.libcalignmentfile.AlignmentFile.pileup
  File "pysam/libchtslib.pyx", line 688, in pysam.libchtslib.HTSFile.parse_region
ValueError: invalid coordinates: start (248849282) > stop (138479)
- Could you check if the problematic coordinates are found in the input file you passed to the parameter --infile?
Here are the matching results for the problematic start and stop coordinates in ${sample}.calling.step1.tsv. There are other matches of 138479 as substrings in this file.
chr1 248849282 248849282 C . . . TGTCC CAGGG . . . 0;39;1 0;29;1 . . DP|NC|CC|BC|BQ|BCf|BCr NA 39|29|0:29:0:0:0:0|0:39:0:0:0:0|0:1326:0:0:0:0|0:28:0:0:0:0|0:11:0:0:0:0 NA NA NA NA NA NA NA NA NA NA
KI270733.1 138479 138479 C . . . CCTCC TCCCT . . . . 0;7;1 0;5;1 . . DP|NC|CC|BC|BQ|BCf|BCr NA 7|5|0:5:0:0:0:0|0:7:0:0:0:0|0:238:0:0:0:0|0:1:0:0:0:0|0:6:0:0:0:0 NA NA NA NA NA NA NA NA NA NA
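The two matches sit on different contigs (chr1 and KI270733.1). One possible reading, which is an assumption and not something confirmed in this thread, is that a start taken from chr1 ended up paired with an end taken from KI270733.1. The Python sketch below shows one way to check the sites file for this: it computes the smallest start and largest end separately for each contig, so coordinates from different contigs are never compared. It assumes the first three tab-separated columns are chromosome, start and end, and that header or comment lines begin with "#". The function name intervals_per_contig is hypothetical and is not part of SComatic.

```python
def intervals_per_contig(tsv_lines):
    """Map each contig to the (min_start, max_end) of the sites listed for it."""
    bounds = {}
    for line in tsv_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        chrom, start, end = line.rstrip("\n").split("\t")[:3]
        start, end = int(start), int(end)
        lo, hi = bounds.get(chrom, (start, end))
        bounds[chrom] = (min(lo, start), max(hi, end))
    return bounds


# First three columns of the two rows reported above.
rows = [
    "chr1\t248849282\t248849282",
    "KI270733.1\t138479\t138479",
]
for chrom, (start, end) in intervals_per_contig(rows).items():
    # Within a single contig, start can never exceed end.
    assert start <= end
    print(chrom, start, end)
```

If every per-contig interval looks valid here, the mix-up would have to happen later, when the intervals are built or passed to pileup.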
I have also faced this issue. Strangely enough, I have been able to avoid it by increasing the number of cell types in my metadata.
In my first approach I had only 3 groups: immune, stroma and tumor cells. I have now subdivided the barcode metadata into the actual cell types within the immune and stroma populations, and subdivided the tumor cells by custom-defined transcriptional programs, and the script seems to work perfectly fine. For some of the cell types (those with an extremely small number of cells) it doesn't give results (no barcodes or sites), but I suppose I can work without those cell types.
It's interesting that it only fails with a low number of cell types. The error itself is raised by the pysam package. I will take a deeper look at it.
Cell types represented by very few cells are not expected to provide results with SComatic. This is because a site must reach the minimum required coverage to be considered callable (5 cells per site by default).
Thanks, Fran
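As a rough pre-check for the point above, the cells per cell type can be counted in the metadata before running the pipeline: a cell type with fewer cells in total than the per-site minimum can never reach it at any site. The Python sketch below assumes a tab-separated metadata file with a Cell_type column. That column name, the Index column in the demo, and the function name small_cell_types are assumptions for illustration, so adjust them to the actual file.

```python
import csv
import io
from collections import Counter


def small_cell_types(metadata_text, min_cells=5, type_column="Cell_type"):
    """Return {cell_type: n_cells} for types with fewer than `min_cells` cells."""
    reader = csv.DictReader(io.StringIO(metadata_text), delimiter="\t")
    counts = Counter(row[type_column] for row in reader)
    return {ct: n for ct, n in counts.items() if n < min_cells}


# Made-up metadata: six T cells and two mast cells.
demo_rows = [f"BC{i}-1\tT_cell" for i in range(6)]
demo_rows += [f"BC{i}-1\tMast" for i in range(6, 8)]
demo_metadata = "Index\tCell_type\n" + "\n".join(demo_rows) + "\n"

print(small_cell_types(demo_metadata))  # {'Mast': 2}
```

Passing this check does not guarantee results, because the threshold applies to the cells covering each individual site and not to the total number of cells in the type.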
Hi developers,
I got a pysam ValueError: invalid coordinates: start (248849282) > stop (138479) when running SitesPerCell.py. Is this expected behavior for 10x multiome data? I do have sequences aligned to the reverse strand (FLAG=16) in my BAM file. However, the start and stop don't seem to belong to the same entry, because 1) it would be too long a sequence to be captured and 2) there is no match after running samtools view bamfile | grep 248849282\t138479.
Can someone help me out on this? Are there any possible solutions? My environment is as follows. Many thanks!!