velocyto-team / velocyto.py

RNA velocity estimation in Python
http://velocyto.org/velocyto.py/
BSD 2-Clause "Simplified" License

Could not find index file but still generated the loom from bam -- run10x error? #321

Open denvercal1234GitHub opened 2 years ago

denvercal1234GitHub commented 2 years ago

After many attempts (solved in #320), I finally generated a loom file from my cellranger count (10x) output. However, when I examined the error log, it said "Could not retrieve index file for the cellsorted_possorted_genome_bam.bam."

Could anyone help me understand how the loom file was still generated even though the index file (bam.bai) could not be found?

Note that I have the original bam.bai file of the possorted_.bam file (from cellranger count) in the same folder as the cellsorted_possorted...bam file (which I sorted manually with samtools). I did not generate a new index file for my manually sorted bam file.

denvercal1234GitHub commented 2 years ago

According to https://github.com/pysam-developers/pysam/issues/939:

"This is a diagnostic that is new (in these circumstances) in HTSlib 1.10. In the context in which pysam is calling it it is just a warning and can be ignored — the pysam.AlignmentFile is constructed appropriately regardless."

But can anyone confirm?
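
One way to see why the message appears for the cellsorted file only is to check which BAM files actually have an index sitting next to them. The sketch below is standard-library Python and does not call pysam or HTSlib. It assumes that an index is located purely by filename (the `.bai`/`.csi` suffix either appended to the BAM name or replacing its `.bam` extension); the function names are hypothetical and only for illustration.

```python
import os
from typing import List, Optional


def index_candidates(bam_path: str) -> List[str]:
    """Return index filenames that could sit next to a BAM file.

    Assumption: the index is looked up by name only, either by appending
    .bai/.csi to the full filename or by replacing the .bam suffix.
    """
    stem, _ = os.path.splitext(bam_path)
    return [bam_path + ".bai", bam_path + ".csi", stem + ".bai", stem + ".csi"]


def find_index(bam_path: str) -> Optional[str]:
    """Return the first candidate index path that exists, or None."""
    for candidate in index_candidates(bam_path):
        if os.path.exists(candidate):
            return candidate
    return None


def index_is_stale(bam_path: str, index_path: str) -> bool:
    """True when the index was last modified before the BAM itself."""
    return os.path.getmtime(index_path) < os.path.getmtime(bam_path)
```

Under that assumption, `find_index` returns `None` for a `cellsorted_` BAM that was sorted manually and never indexed, while the original `possorted_` BAM still resolves to its `.bam.bai`.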

JerryZhang-1222 commented 2 years ago

Hello, I'm facing exactly the same warning.

[E::idx_find_and_load] Could not retrieve index file for '/home/***/FC_IL_1.bam'
2022-03-16 07:11:40,926 - DEBUG - End of file. Reset index: start scanning from initial position.

I think it may be time-consuming without the index file, because velocyto has to scan the whole bam file from the beginning. However, my bam file is around 40G while the loom file is just 200M, so I don't know whether this is the right result.

denvercal1234GitHub commented 2 years ago

No idea either lol. Please would you mind letting us know here if you find out something about this? Thanks @JerryZhang-1222

JFanbio commented 2 years ago

same error too...

JerryZhang-1222 commented 2 years ago

> No idea either lol. Please would you mind letting us know here if you find out something about this? Thanks @JerryZhang-1222

Hello, I ignored this warning and finished the whole pipeline with these loom files, lol. The rest of the process worked fine for me. I will see if it also works with other samples.

kiddo18 commented 1 year ago

Hi, I also have the same error! Any follow-up on this?

Could not retrieve index file for '/path/to/cellranger/outs/cellsorted_possorted_genome_bam.bam'

denvercal1234GitHub commented 1 year ago

@gioelelm -- Do you have any insight/advice on this?

Summary: I was running

velocyto run10x -m /ceph/project/....../mm10_rmsk_chrMTGTF.gtf /ceph/project/........../10x_scRNA_704-187_CD8P_GEX /ceph/project/borrowlab/shared/scRNASeq_blood_220210/xenon/Blood_CD8T_TranscriptomeTCRseq/Blood_CD8T_scTranscriptome/run_cellranger_count_chrMTGTF/refdata-gex-GRCh38-2020-A_chrMTGTF/genes/genes.gtf

The path /ceph/project/........../10x_scRNA_704-187_CD8P_GEX points to the folder 10x_scRNA_704-187_CD8P_GEX, which contains my outs folder. That folder holds the possorted_genome_bam.bam, which I had sorted by cell barcode with samtools:

samtools sort -t CB -O BAM -o cellsorted_possorted_genome_bam.bam /ceph/project/......./10x_scRNA_704-187_CD8P_GEX/outs/possorted_genome_bam.bam

So my outs folder now has 3 files:

  1. possorted_genome_bam.bam.bai (original from cellranger count)
  2. possorted_genome_bam.bam (original from cellranger count)
  3. cellsorted_possorted_genome_bam.bam (result of samtools)
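
The manual sorting step can be scripted so that the output name always follows the `cellsorted_` convention used in the listing above. This is a minimal sketch: it only builds the argument list for the samtools command quoted in the post and does not run it. The function names are hypothetical, and writing the output into the same folder as the input is an assumption based on the three files listed.

```python
import os
from typing import List


def cellsorted_path(bam_path: str) -> str:
    """Path of the barcode-sorted BAM: same folder as the input,
    with 'cellsorted_' prefixed to the filename."""
    folder, name = os.path.split(bam_path)
    return os.path.join(folder, "cellsorted_" + name)


def samtools_sort_by_barcode(bam_path: str) -> List[str]:
    """Build (but do not run) the samtools sort command from the post."""
    return ["samtools", "sort", "-t", "CB", "-O", "BAM",
            "-o", cellsorted_path(bam_path), bam_path]
```

The returned list can be handed to `subprocess.run(cmd, check=True)` on a machine where samtools is installed. The log further down reports that an existing `cellsorted_` file is reused and the sorting step is skipped.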

The command ran successfully, but when I examined the error log, it showed a warning and an error:

 @jit
/package/python-cbrg/current/3.11/lib/python3.11/site-packages/loompy/bus_file.py:101: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  @jit
[W::hts_idx_load3] The index file is older than the data file: /ceph/project........./10x_scRNA_713_CD8_GEX/outs/possorted_genome_bam.bam.bai
**[W::hts_idx_load3] The index file is older than the data file**: /ceph/project/............/10x_scRNA_713_CD8_GEX/outs/possorted_genome_bam.bam.bai
**[E::idx_find_and_load] Could not retrieve index file** for '/ceph/project/..../10x_scRNA_713_CD8_GEX/outs/cellsorted_possorted_genome_bam.bam'
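
To separate these messages from the rest of a long error log, the bracketed prefix can be parsed. The sketch below assumes, based only on the lines quoted above, that the prefix has the form `[LETTER::function]` with `W` for a warning and `E` for an error; any other letter is returned unchanged. The function name is hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches prefixes such as "[W::hts_idx_load3]" or "[E::idx_find_and_load]".
_PREFIX = re.compile(r"\[([A-Z])::(\w+)\]\s*(.*)")

# Assumption: only the two letters seen in the log above are mapped.
_LEVELS = {"E": "error", "W": "warning"}


def parse_htslib_line(line: str) -> Optional[Tuple[str, str, str]]:
    """Return (level, function, message), or None when no prefix is found."""
    match = _PREFIX.search(line)
    if match is None:
        return None
    letter, function, message = match.groups()
    # Drop the "**" emphasis markers that were added when the log was posted.
    cleaned = message.replace("**", "").strip()
    return _LEVELS.get(letter, letter), function, cleaned
```

Lines without such a prefix, for example the `@jit` lines from the Numba deprecation notice, yield `None` and can be skipped.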

The output log shows:

Loading python-cbrg/current
  Loading requirement: python-base/3.11.3
2023-09-20 16:50:46,388 - DEBUG - Using logic: Default
2023-09-20 16:50:46,411 - INFO - Read 2810 cell barcodes from /ceph/project/borrowlab/shared/HIVBNAb_TotalCD8_scRNAseq/raw_03_12_2021HIVTodd_scRNAseq_10X_HIV_704707713_TotalCD8/HIVTodd_10x_scRNA_102105702704707713_TotalCD8_GEX/10x_scRNA_713_CD8_GEX/outs/filtered_feature_bc_matrix/barcodes.tsv.gz
2023-09-20 16:50:46,411 - DEBUG - Example of barcode: AAACCTGCACCGAATT and cell_id: 10x_scRNA_713_CD8_GEX:AAACCTGCACCGAATT-1
2023-09-20 16:50:46,414 - DEBUG - Peeking into /ceph/project/borrowlab/shared/HIVBNAb_TotalCD8_scRNAseq/raw_03_12_2021HIVTodd_scRNAseq_10X_HIV_704707713_TotalCD8/HIVTodd_10x_scRNA_102105702704707713_TotalCD8_GEX/10x_scRNA_713_CD8_GEX/outs/possorted_genome_bam.bam
2023-09-20 16:50:46,575 - WARNING - Not found cell and umi barcode in entry 30 of the bam file
2023-09-20 16:50:46,576 - WARNING - Not found cell and umi barcode in entry 32 of the bam file
2023-09-20 16:50:46,576 - WARNING - Not found cell and umi barcode in entry 33 of the bam file
2023-09-20 16:50:46,576 - WARNING - Not found cell and umi barcode in entry 34 of the bam file
2023-09-20 16:50:46,576 - WARNING - Not found cell and umi barcode in entry 35 of the bam file
2023-09-20 16:50:46,576 - WARNING - Not found cell and umi barcode in entry 36 of the bam file
......
2023-09-20 16:50:46,591 - WARNING - The file /ceph/project/borrowlab/shared/HIVBNAb_TotalCD8_scRNAseq/raw_03_12_2021HIVTodd_scRNAseq_10X_HIV_704707713_TotalCD8/HIVTodd_10x_scRNA_102105702704707713_TotalCD8_GEX/10x_scRNA_713_CD8_GEX/outs/cellsorted_possorted_genome_bam.bam already exists. The sorting step will be skipped and the existing file will be used.
2023-09-20 16:50:46,591 - INFO - Load the annotation from /ceph/project/borrowlab/shared/scRNASeq_blood_220210/xenon/Blood_CD8T_TranscriptomeTCRseq/Blood_CD8T_scTranscriptome/run_cellranger_count_chrMTGTF/refdata-gex-GRCh38-2020-A_chrMTGTF/genes/genes.gtf
2023-09-20 16:50:57,128 - DEBUG - Parsing Chromosome GL000009.2 strand - [line 0]
2023-09-20 16:50:57,128 - DEBUG - Done with GL000009.2- [line 7]
2023-09-20 16:50:57,128 - DEBUG - Assigning indexes to genes
2023-09-20 16:50:57,128 - DEBUG - Seen 1 genes until now
2023-09-20 16:50:57,128 - DEBUG - Parsing Chromosome GL000194.1 strand - [line 8]
2023-09-20 16:50:57,128 - DEBUG - Done with GL000194.1- [line 33]
2023-09-20 16:50:57,128 - DEBUG - Assigning indexes to genes
2023-09-20 16:50:57,128 - DEBUG - Seen 3 genes until now
2023-09-20 16:50:57,129 - DEBUG - Parsing Chromosome GL000195.1 strand - [line 34]
2023-09-20 16:50:57,129 - DEBUG - Done with GL000195.1- [line 43]
2023-09-20 16:50:57,129 - DEBUG - Assigning indexes to genes

2023-09-20 16:51:02,890 - DEBUG - Done with 18+ [line 1269821]
2023-09-20 16:51:02,890 - DEBUG - Assigning indexes to genes
2023-09-20 16:51:02,891 - DEBUG - Seen 16777 genes until now

2023-09-20 16:51:04,620 - DEBUG - Done with 20+ [line 1707817]
2023-09-20 16:51:04,620 - DEBUG - Assigning indexes to genes
2023-09-20 16:51:04,623 - DEBUG - Seen 22309 genes until now
2023-09-20 16:51:04,623 - DEBUG - Parsing Chromosome 21 strand - [line 1707818]
2023-09-20 16:51:04,664 - DEBUG - Done with 21- [line 1722798]
2023-09-20 16:51:04,664 - DEBUG - Assigning indexes to genes
.......
2023-09-20 16:51:06,608 - DEBUG - Seen 29002 genes until now
2023-09-20 16:51:06,608 - DEBUG - Parsing Chromosome 6 strand - [line 2218658]
2023-09-20 16:51:06,781 - DEBUG - Done with 6- [line 2284937]
2023-09-20 16:51:06,781 - DEBUG - Assigning indexes to genes
2023-09-20 16:51:06,784 - DEBUG - Seen 29928 genes until now
2023-09-20 16:51:06,784 - DEBUG - Parsing Chromosome 6 strand + [line 2284938]
2023-09-20 16:51:06,939 - DEBUG - Done with 6+ [line 2345240]
2023-09-20 16:51:06,939 - DEBUG - Assigning indexes to genes
2023-09-20 16:51:06,942 - DEBUG - Seen 30829 genes until now
2023-09-20 16:51:06,942 - DEBUG - Parsing Chromosome 7 strand - [line 2345241]
.......
2023-09-20 16:51:08,200 - DEBUG - Done with 8+ [line 2577021]
2023-09-20 16:51:08,200 - DEBUG - Assigning indexes to genes
2023-09-20 16:51:08,202 - DEBUG - Seen 34010 genes until now
2023-09-20 16:51:08,202 - DEBUG - Parsing Chromosome 9 strand - [line 2577022]
2023-09-20 16:51:08,321 - DEBUG - Done with 9- [line 2622811]
2023-09-20 16:51:08,321 - DEBUG - Assigning indexes to genes
2023-09-20 16:51:08,323 - DEBUG - Seen 34681 genes until now
2023-09-20 16:51:08,323 - DEBUG - Parsing Chromosome 9 strand + [line 2622812]
2023-09-20 16:51:08,451 - DEBUG - Done with 9+ [line 2672374]
2023-09-20 16:51:08,452 - DEBUG - Assigning indexes to genes
2023-09-20 16:51:08,454 - DEBUG - Seen 35329 genes until now
2023-09-20 16:51:08,454 - DEBUG - Parsing Chromosome MT strand - [line 2672375]
2023-09-20 16:51:08,454 - DEBUG - Done with MT- [line 2672379]
.........
2023-09-20 16:51:08,677 - DEBUG - Assigning indexes to genes
2023-09-20 16:51:08,678 - DEBUG - Seen 36490 genes until now
2023-09-20 16:51:08,678 - DEBUG - Parsing Chromosome Y strand - [line 2760501]
2023-09-20 16:51:08,685 - DEBUG - Done with Y- [line 2763031]
2023-09-20 16:51:08,685 - DEBUG - Assigning indexes to genes
2023-09-20 16:51:08,685 - DEBUG - Seen 36543 genes until now
2023-09-20 16:51:08,685 - DEBUG - Parsing Chromosome Y strand + [line 2763032]
2023-09-20 16:51:08,693 - DEBUG - Assigning indexes to genes
2023-09-20 16:51:08,693 - DEBUG - Done with Y+ [line 2765967]
2023-09-20 16:51:08,693 - DEBUG - Fixing corner cases of transcript models containg intron longer than 1000Kbp
2023-09-20 16:51:10,220 - DEBUG - Generated 2411536 features corresponding to 199138 transcript models from /ceph/project/borrowlab/shared/scRNASeq_blood_220210/xenon/Blood_CD8T_TranscriptomeTCRseq/Blood_CD8T_scTranscriptome/run_cellranger_count_chrMTGTF/refdata-gex-GRCh38-2020-A_chrMTGTF/genes/genes.gtf
2023-09-20 16:51:10,248 - INFO - Load the repeat masking annotation from /ceph/project/borrowlab/shared/Velocyto_Loom_RefMaterials_2022Sep19/mm10_rmsk_chrMTGTF.gtf
2023-09-20 16:51:10,248 - DEBUG - Reading /ceph/project/borrowlab/shared/Velocyto_Loom_RefMaterials_2022Sep19/mm10_rmsk_chrMTGTF.gtf, the file will be sorted in memory
2023-09-20 16:51:33,204 - DEBUG - Processed masked annotation .gtf and generated 4545598 intervals to mask!
2023-09-20 16:51:33,533 - INFO - Scan /ceph/project/borrowlab/shared/HIVBNAb_TotalCD8_scRNAseq/raw_03_12_2021HIVTodd_scRNAseq_10X_HIV_704707713_TotalCD8/HIVTodd_10x_scRNA_102105702704707713_TotalCD8_GEX/10x_scRNA_713_CD8_GEX/outs/possorted_genome_bam.bam to validate intron intervals
2023-09-20 16:51:34,985 - DEBUG - Reading /ceph/project/borrowlab/shared/HIVBNAb_TotalCD8_scRNAseq/raw_03_12_2021HIVTodd_scRNAseq_10X_HIV_704707713_TotalCD8/HIVTodd_10x_scRNA_102105702704707713_TotalCD8_GEX/10x_scRNA_713_CD8_GEX/outs/possorted_genome_bam.bam
2023-09-20 16:51:35,155 - DEBUG - Read first 0 million reads
2023-09-20 16:51:35,156 - DEBUG - Marking up chromosome 1

2023-09-20 16:58:45,322 - DEBUG - Marking up chromosome 14
2023-09-20 16:59:34,414 - DEBUG - Marking up chromosome 15
2023-09-20 16:59:34,774 - DEBUG - Read first 100 million reads
2023-09-20 17:00:27,209 - DEBUG - Read first 110 million reads
2023-09-20 17:00:47,138 - DEBUG - Marking up chromosome 16
2023-09-20 17:01:16,534 - DEBUG - Read first 120 million reads
2023-09-20 17:02:05,895 - DEBUG - Marking up chromosome 17
2023-09-20 17:02:09,278 - DEBUG - Read first 130 million reads
.......
2023-09-20 17:22:15,306 - DEBUG - Marking up chromosome Y
2023-09-20 17:22:17,658 - DEBUG - Marking up chromosome KI270728.1
2023-09-20 17:22:17,659 - DEBUG - Marking up chromosome KI270727.1
2023-09-20 17:22:17,660 - DEBUG - Marking up chromosome KI270442.1
2023-09-20 17:22:17,660 - WARNING - The .bam file refers to a chromosome 'KI270442.1+' not present in the annotation (.gtf) file
2023-09-20 17:22:17,660 - WARNING - The .bam file refers to a chromosome 'KI270442.1-' not present in the annotation (.gtf) file
2023-09-20 17:22:17,661 - DEBUG - Marking up chromosome KI270729.1
2023-09-20 17:22:17,661 - WARNING - The .bam file refers to a chromosome 'KI270729.1+' not present in the annotation (.gtf) file
2023-09-20 17:22:17,661 - WARNING - The .bam file refers to a chromosome 'KI270729.1-' not present in the annotation (.gtf) file
2023-09-20 17:22:17,662 - DEBUG - Marking up chromosome KI270743.1
2023-09-20 17:22:17,662 - WARNING - The .bam file refers to a chromosome 'KI270743.1+' not present in the annotation (.gtf) file
2023-09-20 17:22:17,662 - WARNING - The .bam file refers to a chromosome 'KI270743.1-' not present in the annotation (.gtf) file
2023-09-20 17:22:17,663 - DEBUG - Marking up chromosome GL000009.2
2023-09-20 17:22:17,663 - WARNING - The .bam file refers to a chromosome 'GL000009.2+' not present in the annotation (.gtf) file
2023-09-20 17:22:17,665 - DEBUG - Marking up chromosome KI270722.1
2023-09-20 17:22:17,665 - WARNING - The .bam file refers to a chromosome 'KI270722.1+' not present in the annotation (.gtf) file
2023-09-20 17:22:17,665 - WARNING - The .bam file refers to a chromosome 'KI270722.1-' not present in the annotation (.gtf) file
2023-09-20 17:22:17,665 - DEBUG - Marking up chromosome GL000194.1
2023-09-20 17:22:17,665 - WARNING - The .bam file refers to a chromosome 'GL000194.1+' not present in the annotation (.gtf) file
2023-09-20 17:22:17,668 - DEBUG - Marking up chromosome KI270742.1
2023-09-20 17:22:17,669 - WARNING - The .bam file refers to a chromosome 'KI270742.1+' not present in the annotation (.gtf) file
2023-09-20 17:22:17,669 - WARNING - The .bam file refers to a chromosome 'KI270742.1-' not present in the annotation (.gtf) file
2023-09-20 17:22:17,670 - DEBUG - Marking up chromosome GL000205.2
2023-09-20 17:22:17,670 - WARNING - The .bam file refers to a chromosome 'GL000205.2+' not present in the annotation (.gtf) file
2023-09-20 17:22:17,685 - DEBUG - Marking up chromosome GL000195.1
2023-09-20 17:22:17,689 - DEBUG - Marking up chromosome KI270736.1
2023-09-20 17:22:17,689 - WARNING - The .bam file refers to a chromosome 'KI270736.1+' not present in the annotation (.gtf) file
2023-09-20 17:22:17,689 - WARNING - The .bam file refers to a chromosome 'KI270736.1-' not present in the annotation (.gtf) file
2023-09-20 17:22:17,690 - DEBUG - Marking up chromosome KI270733.1
2023-09-20 17:22:17,690 - WARNING - The .bam file refers to a chromosome 'KI270733.1+' not present in the annotation (.gtf) file
2023-09-20 17:22:17,690 - WARNING - The .bam file refers to a chromosome 'KI270733.1-' not present in the annotation (.gtf) file
2023-09-20 17:22:25,426 - DEBUG - Read first 440 million reads
2023-09-20 17:22:38,414 - DEBUG - Read first 450 million reads
2023-09-20 17:22:51,413 - DEBUG - Read first 460 million reads
2023-09-20 17:23:04,579 - DEBUG - Read first 470 million reads
2023-09-20 17:23:10,803 - DEBUG - Marking up chromosome GL000224.1
2023-09-20 17:23:10,803 - WARNING - The .bam file refers to a chromosome 'GL000224.1+' not present in the annotation (.gtf) file
2023-09-20 17:23:10,803 - WARNING - The .bam file refers to a chromosome 'GL000224.1-' not present in the annotation (.gtf) file
2023-09-20 17:23:10,804 - DEBUG - Marking up chromosome GL000219.1
2023-09-20 17:23:10,804 - WARNING - The .bam file refers to a chromosome 'GL000219.1+' not present in the annotation (.gtf) file
2023-09-20 17:23:11,055 - DEBUG - Marking up chromosome KI270719.1
2023-09-20 17:23:11,055 - WARNING - The .bam file refers to a chromosome 'KI270719.1+' not present in the annotation (.gtf) file
2023-09-20 17:23:11,055 - WARNING - The .bam file refers to a chromosome 'KI270719.1-' not present in the annotation (.gtf) file
2023-09-20 17:23:11,061 - DEBUG - Marking up chromosome GL000216.2
2023-09-20 17:23:11,061 - WARNING - The .bam file refers to a chromosome 'GL000216.2+' not present in the annotation (.gtf) file
2023-09-20 17:23:11,061 - WARNING - The .bam file refers to a chromosome 'GL000216.2-' not present in the annotation (.gtf) file
2023-09-20 17:23:11,061 - DEBUG - Marking up chromosome KI270712.1
2023-09-20 17:23:11,061 - WARNING - The .bam file refers to a chromosome 'KI270712.1+' not present in the annotation (.gtf) file
2023-09-20 17:23:11,061 - WARNING - The .bam file refers to a chromosome 'KI270712.1-' not present in the annotation (.gtf) file
2023-09-20 17:23:11,063 - DEBUG - Marking up chromosome KI270706.1
2023-09-20 17:23:11,063 - WARNING - The .bam file refers to a chromosome 'KI270706.1+' not present in the annotation (.gtf) file
2023-09-20 17:23:11,063 - WARNING - The .bam file refers to a chromosome 'KI270706.1-' not present in the annotation (.gtf) file
2023-09-20 17:23:11,064 - DEBUG - Marking up chromosome KI270725.1
2023-09-20 17:23:11,064 - WARNING - The .bam file refers to a chromosome 'KI270725.1+' not present in the annotation (.gtf) file
2023-09-20 17:23:11,065 - WARNING - The .bam file refers to a chromosome 'KI270725.1-' not present in the annotation (.gtf) file
2023-09-20 17:23:11,066 - DEBUG - Marking up chromosome KI270734.1
2023-09-20 17:23:11,075 - DEBUG - Marking up chromosome GL000220.1
2023-09-20 17:23:11,075 - WARNING - The .bam file refers to a chromosome 'GL000220.1+' not present in the annotation (.gtf) file
2023-09-20 17:23:11,075 - WARNING - The .bam file refers to a chromosome 'GL000220.1-' not present in the annotation (.gtf) file
2023-09-20 17:23:17,853 - DEBUG - Read first 480 million reads
2023-09-20 17:23:30,813 - DEBUG - Read first 490 million reads
2023-09-20 17:23:46,195 - DEBUG - Read first 500 million reads
2023-09-20 17:23:59,147 - DEBUG - Read first 510 million reads
2023-09-20 17:24:08,399 - DEBUG - Marking up chromosome GL000218.1
2023-09-20 17:24:08,399 - WARNING - The .bam file refers to a chromosome 'GL000218.1+' not present in the annotation (.gtf) file
2023-09-20 17:24:08,402 - DEBUG - Marking up chromosome KI270749.1
2023-09-20 17:24:08,402 - WARNING - The .bam file refers to a chromosome 'KI270749.1+' not present in the annotation (.gtf) file
2023-09-20 17:24:08,402 - WARNING - The .bam file refers to a chromosome 'KI270749.1-' not present in the annotation (.gtf) file
2023-09-20 17:24:08,403 - DEBUG - Marking up chromosome KI270741.1
2023-09-20 17:24:08,403 - WARNING - The .bam file refers to a chromosome 'KI270741.1+' not present in the annotation (.gtf) file
2023-09-20 17:24:08,403 - WARNING - The .bam file refers to a chromosome 'KI270741.1-' not present in the annotation (.gtf) file
2023-09-20 17:24:08,403 - DEBUG - Marking up chromosome GL000221.1
2023-09-20 17:24:08,403 - WARNING - The .bam file refers to a chromosome 'GL000221.1+' not present in the annotation (.gtf) file
2023-09-20 17:24:08,404 - WARNING - The .bam file refers to a chromosome 'GL000221.1-' not present in the annotation (.gtf) file
2023-09-20 17:24:08,405 - DEBUG - Marking up chromosome KI270519.1
2023-09-20 17:24:08,405 - WARNING - The .bam file refers to a chromosome 'KI270519.1+' not present in the annotation (.gtf) file
2023-09-20 17:24:08,405 - WARNING - The .bam file refers to a chromosome 'KI270519.1-' not present in the annotation (.gtf) file
2023-09-20 17:24:08,406 - DEBUG - Marking up chromosome KI270708.1
2023-09-20 17:24:08,406 - WARNING - The .bam file refers to a chromosome 'KI270708.1+' not present in the annotation (.gtf) file
2023-09-20 17:24:08,406 - WARNING - The .bam file refers to a chromosome 'KI270708.1-' not present in the annotation (.gtf) file
2023-09-20 17:24:08,406 - DEBUG - Marking up chromosome KI270438.1
2023-09-20 17:24:08,406 - WARNING - The .bam file refers to a chromosome 'KI270438.1+' not present in the annotation (.gtf) file
2023-09-20 17:24:08,406 - WARNING - The .bam file refers to a chromosome 'KI270438.1-' not present in the annotation (.gtf) file
2023-09-20 17:24:08,461 - DEBUG - Marking up chromosome KI270738.1
2023-09-20 17:24:08,461 - WARNING - The .bam file refers to a chromosome 'KI270738.1+' not present in the annotation (.gtf) file
2023-09-20 17:24:08,461 - WARNING - The .bam file refers to a chromosome 'KI270738.1-' not present in the annotation (.gtf) file
2023-09-20 17:24:08,465 - DEBUG - Marking up chromosome KI270711.1
2023-09-20 17:24:08,465 - WARNING - The .bam file refers to a chromosome 'KI270711.1+' not present in the annotation (.gtf) file
2023-09-20 17:24:08,468 - DEBUG - Marking up chromosome KI270745.1
2023-09-20 17:24:08,468 - WARNING - The .bam file refers to a chromosome 'KI270745.1+' not present in the annotation (.gtf) file
2023-09-20 17:24:08,468 - WARNING - The .bam file refers to a chromosome 'KI270745.1-' not present in the annotation (.gtf) file
2023-09-20 17:24:08,468 - DEBUG - Marking up chromosome KI270713.1
2023-09-20 17:24:08,477 - DEBUG - Marking up chromosome KI270718.1
2023-09-20 17:24:08,477 - WARNING - The .bam file refers to a chromosome 'KI270718.1+' not present in the annotation (.gtf) file
2023-09-20 17:24:08,477 - WARNING - The .bam file refers to a chromosome 'KI270718.1-' not present in the annotation (.gtf) file
2023-09-20 17:24:08,478 - DEBUG - Marking up chromosome KI270707.1
2023-09-20 17:24:08,478 - WARNING - The .bam file refers to a chromosome 'KI270707.1+' not present in the annotation (.gtf) file
2023-09-20 17:24:08,478 - WARNING - The .bam file refers to a chromosome 'KI270707.1-' not present in the annotation (.gtf) file
2023-09-20 17:24:08,478 - DEBUG - Marking up chromosome KI270315.1
2023-09-20 17:24:08,478 - WARNING - The .bam file refers to a chromosome 'KI270315.1+' not present in the annotation (.gtf) file
2023-09-20 17:24:08,478 - WARNING - The .bam file refers to a chromosome 'KI270315.1-' not present in the annotation (.gtf) file
2023-09-20 17:24:08,478 - DEBUG - Marking up chromosome KI270539.1
2023-09-20 17:24:08,478 - WARNING - The .bam file refers to a chromosome 'KI270539.1+' not present in the annotation (.gtf) file
2023-09-20 17:24:08,478 - WARNING - The .bam file refers to a chromosome 'KI270539.1-' not present in the annotation (.gtf) file
2023-09-20 17:24:12,873 - DEBUG - Read first 520 million reads
2023-09-20 17:24:24,338 - DEBUG - Read first 530 million reads
2023-09-20 17:24:35,867 - DEBUG - Read first 540 million reads
2023-09-20 17:24:37,739 - DEBUG - End of file. Reset index: start scanning from initial position.
2023-09-20 17:24:37,740 - DEBUG - 11480191 reads were skipped because no apropiate cell or umi barcode was found
2023-09-20 17:24:37,740 - DEBUG - Start molecule counting!
2023-09-20 17:24:39,808 - DEBUG - Features available for chromosomes : ['GL000009.2-', 'GL000194.1-', 'GL000195.1-', 'GL000195.1+', 'GL000205.2-', 'GL000213.1-', 'GL000218.1-', 'GL000219.1-', 'KI270711.1-', 'KI270713.1-', 'KI270713.1+', 'KI270721.1+', 'KI270726.1+', 'KI270727.1-', 'KI270727.1+', 'KI270728.1-', 'KI270728.1+', 'KI270731.1-', 'KI270734.1-', 'KI270734.1+', '1-', '1+', '10-', '10+', '11-', '11+', '12-', '12+', '13-', '13+', '14-', '14+', '15-', '15+', '16-', '16+', '17-', '17+', '18-', '18+', '19-', '19+', '2-', '2+', '20-', '20+', '21-', '21+', '22-', '22+', '3-', '3+', '4-', '4+', '5-', '5+', '6-', '6+', '7-', '7+', '8-', '8+', '9-', '9+', 'MT-', 'MT+', 'X-', 'X+', 'Y-', 'Y+']
2023-09-20 17:24:39,808 - DEBUG - Mask available for chromosomes : ['1-', '1+', '10-', '10+', '10_GL383545v1_alt-', '10_GL383545v1_alt+', '10_GL383546v1_alt-', '10_GL383546v1_alt+', '10_KI270824v1_alt-', '10_KI270824v1_alt+', '10_KI270825v1_alt-', '10_KI270825v1_alt+', '10_KN196480v1_fix-', '10_KN196480v1_fix+', '10_KN538365v1_fix-', '10_KN538365v1_fix+', .............................'11_KI270927v1_alt-', '11_KI270927v1_alt+', '11_KN196481v1_fix-', '11_KN196481v1_fix+', '19_KI270882v1_alt-', '19_KI270882v1_alt+', '19_KI270883v1_alt-', '19_KI270883v1_alt+', '19_KV575246v1_alt-', '19_KV575246v1_alt+', '19_KV575247v1_alt-', '19_KV575247v1_alt+', '19_KV575248v1_alt-', '19_KV575248v1_alt+', '19_KV575249v1_alt-', '19_KV575249v1_alt+', '19_KV575250v1_alt-', '19_KV575250v1_alt+', '19_KV575251v1_alt-', '19_KV575251v1_alt+', '19_KV575252v1_alt-', '19_KV575252v1_alt+', '19_KV575253v1_alt-', '19_KV575253v1_alt+', '19_KV575254v1_alt-', '19_KV575254v1_alt+', '19_KV575255v1_alt-', '19_KV575255v1_alt+',  'X_ML143383v1_fix+', 'X_ML143384v1_fix-', 'X_ML143384v1_fix+', 'X_ML143385v1_fix-', 'X_ML143385v1_fix+', 'Y-', 'Y+', 'Y_KI270740v1_random-', 'Y_KN196487v1_fix-', 'Y_KN196487v1_fix+', 'Y_KZ208923v1_fix-', 'Y_KZ208923v1_fix+', 'Y_KZ208924v1_fix-']
2023-09-20 17:24:39,808 - DEBUG - Summarizing the results of intron validation.
2023-09-20 17:24:40,066 - DEBUG - Validated 140066 introns (of which unique intervals 47154) out of 1106199 total possible introns (considering each possible transcript models).
2023-09-20 17:24:40,067 - DEBUG - Reading /ceph/project/borrowlab/shared/HIVBNAb_TotalCD8_scRNAseq/raw_03_12_2021HIVTodd_scRNAseq_10X_HIV_704707713_TotalCD8/HIVTodd_10x_scRNA_102105702704707713_TotalCD8_GEX/10x_scRNA_713_CD8_GEX/outs/cellsorted_possorted_genome_bam.bam
2023-09-20 17:24:40,152 - DEBUG - Read first 0 million reads
2023-09-20 17:25:07,665 - DEBUG - Read first 10 million reads
2023-09-20 17:25:20,505 - DEBUG - Read first 20 million reads
2023-09-20 17:25:33,001 - DEBUG - Read first 30 million reads
2023-09-20 17:25:55,337 - DEBUG - Read first 40 million reads
2023-09-20 17:26:07,797 - DEBUG - Read first 50 million reads
2023-09-20 17:26:20,033 - DEBUG - Read first 60 million reads
2023-09-20 17:26:31,893 - DEBUG - Read first 70 million reads
2023-09-20 17:27:28,172 - DEBUG - Read first 80 million reads
2023-09-20 17:28:17,924 - DEBUG - Counting for batch 1, containing 100 cells and 10173278 reads
2023-09-20 17:29:17,390 - DEBUG - 77070 reads not considered because fully enclosed in repeat masked regions
............
2023-09-20 18:39:27,542 - DEBUG - 11480191 reads were skipped because no apropiate cell or umi barcode was found
2023-09-20 18:39:27,542 - DEBUG - Counting done!
2023-09-20 18:39:27,553 - DEBUG - Generating output file /ceph/project/borrowlab/shared/HIVBNAb_TotalCD8_scRNAseq/raw_03_12_2021HIVTodd_scRNAseq_10X_HIV_704707713_TotalCD8/HIVTodd_10x_scRNA_102105702704707713_TotalCD8_GEX/10x_scRNA_713_CD8_GEX/velocyto/10x_scRNA_713_CD8_GEX.loom
2023-09-20 18:39:27,553 - DEBUG - Collecting row attributes
2023-09-20 18:39:27,631 - DEBUG - Generating data table
2023-09-20 18:39:28,064 - DEBUG - Writing loom file
2023-09-20 18:39:28,109 - DEBUG - Creating converter from 5 to 3
2023-09-20 18:39:28,110 - DEBUG - Creating converter from 3 to 5
2023-09-20 18:39:30,757 - DEBUG - Terminated Succesfully!
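
A log of this length is easier to judge from a summary than by reading it line by line. The following sketch counts messages per level and checks whether the final "Terminated" message was reached. It assumes the `timestamp - LEVEL - message` layout seen in the lines above; the function name is hypothetical and this is not part of velocyto.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Example: "2023-09-20 16:50:46,388 - DEBUG - Using logic: Default"
_LINE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} - (\w+) - (.*)$")


def summarize_log(lines: Iterable[str]) -> Tuple[Counter, bool]:
    """Count messages per level and report whether the run's final
    'Terminated' message was seen."""
    levels: Counter = Counter()
    finished = False
    for line in lines:
        match = _LINE.match(line.strip())
        if match is None:
            continue  # elisions such as "......" or module-loading lines
        level, message = match.groups()
        levels[level] += 1
        if message.startswith("Terminated"):
            finished = True
    return levels, finished
```

Called on an open log file, for example `summarize_log(open(path))` with `path` pointing at the saved output, it returns the per-level counts together with a flag indicating that the run reached its last message.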

greyson9 commented 11 months ago

Do you need to rename either the .bam or .bam.bai file?

I got this to work, after some struggling, by 1) running samtools sort on my .bam file, then 2) replacing my original .bam file with the sorted one while keeping the original filename.

Maybe either make a renamed copy of your .bai (cellsorted_....bam.bai) or rename the .bai file to match the cellsorted .bam and move the old .bam file out of the directory? Hth!