velocyto-team / velocyto.py

RNA velocity estimation in Python
http://velocyto.org/velocyto.py/
BSD 2-Clause "Simplified" License
159 stars, 84 forks

Multi bam files and cell barcodes #89

Closed vagnec closed 6 years ago

vagnec commented 6 years ago

Hi,

I'm sorry because I think my question is very very naive...

I got an error while running velocyto: logging.debug(f"Example of barcode: {valid_bcs_list[0]} and cell_id: {valid_cellid_list[0]}")

I don't understand what is wrong. I didn't add any cell barcode, as I have one bam file per cell. There is no barcode in the reads of my bam files.

Thank you in advance for your help

gioelelm commented 6 years ago

To be able to answer I need:

vagnec commented 6 years ago

Hi Gioele,

...

2018-07-05 14:14:20,632 - DEBUG - Parsing Chromosome 9 strand + [line 1609847]
2018-07-05 14:14:21,318 - DEBUG - Done with 9+ [line 1664785]
2018-07-05 14:14:21,319 - DEBUG - Assigning indexes to genes
2018-07-05 14:14:21,335 - DEBUG - Seen 48329 genes until now
2018-07-05 14:14:21,335 - DEBUG - Parsing Chromosome M strand - [line 1664786]
2018-07-05 14:14:21,335 - DEBUG - Done with M- [line 1664815]
2018-07-05 14:14:21,335 - DEBUG - Assigning indexes to genes
2018-07-05 14:14:21,335 - DEBUG - Seen 48338 genes until now
2018-07-05 14:14:21,336 - DEBUG - Parsing Chromosome M strand + [line 1664816]
2018-07-05 14:14:21,337 - DEBUG - Done with M+ [line 1664932]
2018-07-05 14:14:21,337 - DEBUG - Assigning indexes to genes
2018-07-05 14:14:21,337 - DEBUG - Seen 48366 genes until now
2018-07-05 14:14:21,337 - DEBUG - Parsing Chromosome X strand - [line 1664933]
2018-07-05 14:14:24,932 - DEBUG - Done with X- [line 1694755]
2018-07-05 14:14:24,933 - DEBUG - Assigning indexes to genes
2018-07-05 14:14:24,947 - DEBUG - Seen 49635 genes until now
2018-07-05 14:14:24,947 - DEBUG - Parsing Chromosome X strand + [line 1694756]
2018-07-05 14:14:25,575 - DEBUG - Done with X+ [line 1727994]
2018-07-05 14:14:25,576 - DEBUG - Assigning indexes to genes
2018-07-05 14:14:25,589 - DEBUG - Seen 50980 genes until now
2018-07-05 14:14:25,589 - DEBUG - Parsing Chromosome Y strand - [line 1727995]
2018-07-05 14:14:25,729 - DEBUG - Done with Y- [line 1734926]
2018-07-05 14:14:25,729 - DEBUG - Assigning indexes to genes
2018-07-05 14:14:25,736 - DEBUG - Seen 51860 genes until now
2018-07-05 14:14:25,736 - DEBUG - Parsing Chromosome Y strand + [line 1734927]
2018-07-05 14:14:25,868 - DEBUG - Assigning indexes to genes
2018-07-05 14:14:25,873 - DEBUG - Done with Y+ [line 1741516]
2018-07-05 14:14:25,874 - DEBUG - Fixing corner cases of transcript models containg intron longer than 1000Kbp
2018-07-05 14:14:31,470 - DEBUG - Generated 1433300 features corresponding to 131100 transcript models from /path/Mus_musculus.GRCm38.90_UCSConlychr.gtf
2018-07-05 14:14:31,507 - INFO - Scan /path_analyses/test_velocyto/LNMA01.dedump.bam.pos.bam.name.bam /path_analyses/test_velocyto/LNMA02.dedump.bam.pos.bam.name.bam to validate intron intervals
2018-07-05 14:14:49,986 - DEBUG - Reading /path_analyses/test_velocyto/LNMA01.dedump.bam.pos.bam.name.bam
2018-07-05 14:14:49,997 - DEBUG - Read first 0 million reads
2018-07-05 14:15:24,374 - DEBUG - End of file. Reset index: start scanning from initial position.
2018-07-05 14:15:24,375 - DEBUG - Reading /path_analyses/test_velocyto/LNMA02.dedump.bam.pos.bam.name.bam
2018-07-05 14:15:24,400 - DEBUG - Read first 0 million reads
2018-07-05 14:15:31,443 - DEBUG - End of file. Reset index: start scanning from initial position.
2018-07-05 14:15:31,443 - DEBUG - 912579 reads were skipped because no apropiate cell or umi barcode was found
2018-07-05 14:15:31,444 - DEBUG - Start molecule counting!
2018-07-05 14:15:37,984 - DEBUG - Features available for chromosomes : ['1-', '1+', '10-', '10+', '11-', '11+', '12-', '12+', '13-', '13+', '14-', '14+', '15-', '15+', '16-', '16+', '17-', '17+', '18-', '18+', '19-', '19+', '2-', '2+', '3-', '3+', '4-', '4+', '5-', '5+', '6-', '6+', '7-', '7+', '8-', '8+', '9-', '9+', 'M-', 'M+', 'X-', 'X+', 'Y-', 'Y+']
2018-07-05 14:15:37,985 - DEBUG - Mask available for chromosomes : []
2018-07-05 14:15:37,985 - DEBUG - Summarizing the results of intron validation.
2018-07-05 14:15:38,795 - DEBUG - Validated 0 introns (of which unique intervals 0) out of 651100 total possible introns (considering each possible transcript models).
2018-07-05 14:15:38,795 - DEBUG - Reading /path_analyses/test_velocyto/LNMA01.dedump.bam.pos.bam.name.bam
2018-07-05 14:15:38,798 - DEBUG - Read first 0 million reads
2018-07-05 14:15:55,566 - DEBUG - Counting for batch 1, containing 0 cells and 0 reads
2018-07-05 14:15:55,566 - DEBUG - 0 reads not considered because fully enclosed in repeat masked regions
2018-07-05 14:15:55,566 - WARNING - The barcode selection mode is off, no cell events will be identified by <80 counts
2018-07-05 14:15:55,567 - WARNING - 0 of the barcodes where without cell
2018-07-05 14:15:55,567 - DEBUG - Reading /path_analyses/test_velocyto/LNMA02.dedump.bam.pos.bam.name.bam
2018-07-05 14:15:55,570 - DEBUG - Read first 0 million reads
2018-07-05 14:16:04,264 - DEBUG - Counting for batch 2, containing 0 cells and 0 reads
2018-07-05 14:16:04,267 - DEBUG - 0 reads not considered because fully enclosed in repeat masked regions
2018-07-05 14:16:04,267 - WARNING - The barcode selection mode is off, no cell events will be identified by <80 counts
2018-07-05 14:16:04,268 - WARNING - 0 of the barcodes where without cell
2018-07-05 14:16:04,268 - DEBUG - 912579 reads were skipped because no apropiate cell or umi barcode was found
2018-07-05 14:16:04,268 - DEBUG - Counting done!
/path_analyses/software/miniconda/envs/velocyto/lib/python3.6/site-packages/matplotlib/font_manager.py:278: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
  'Matplotlib is building the font cache using fc-list. '
Traceback (most recent call last):
  File "/path_analyses/software/miniconda/envs/velocyto/bin/velocyto", line 11, in <module>
    load_entry_point('velocyto==0.17.8', 'console_scripts', 'velocyto')()
  File "/path_analyses/software/miniconda/envs/velocyto/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/path_analyses/software/miniconda/envs/velocyto/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/path_analyses/software/miniconda/envs/velocyto/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/path_analyses/software/miniconda/envs/velocyto/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/path_analyses/software/miniconda/envs/velocyto/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/path_analyses/software/miniconda/envs/velocyto/lib/python3.6/site-packages/velocyto/commands/run.py", line 113, in run
    samtools_memory=samtools_memory, dump=dump, verbose=verbose, additional_ca=additional_ca)
  File "/path_analyses/software/miniconda/envs/velocyto/lib/python3.6/site-packages/velocyto/commands/_run.py", line 236, in _run
    logging.debug(f"Example of barcode: {valid_bcs_list[0]} and cell_id: {valid_cellid_list[0]}")
IndexError: list index out of range



Thank you in advance for your help

gioelelm commented 6 years ago

Thank you for the extra information.

The problem seems to be related to the content and format of the BAM file, as highlighted by the following lines in the log:

Counting for batch 1, containing 0 cells and 0 reads
Counting for batch 2, containing 0 cells and 0 reads
....
912579 reads were skipped because no appropriate cell or umi barcode was found

Can I ask which technique this is? It looks like it might be SmartSeq2 (or another technique that does not have UMIs). If so, you should either use the run_smartseq2 command or specify the --without-umi option.

Hope this helps. Let me know if this does not solve the issue, and don't hesitate to get in contact again if you encounter other problems.
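
For context on why "0 cells" ends in that traceback: the debug line at `_run.py` line 236 indexes element 0 of `valid_bcs_list`, and indexing an empty list raises `IndexError`. Below is a minimal sketch of the failure and of one possible guard. The helper name `log_barcode_example` is hypothetical; this is not the velocyto source and not necessarily how any release handles the case.

```python
import logging


def log_barcode_example(valid_bcs_list, valid_cellid_list):
    """Hypothetical helper (not velocyto code): log one example barcode.

    Returns False instead of raising when no barcode survived filtering.
    """
    if not valid_bcs_list or not valid_cellid_list:
        logging.warning("No valid cell barcodes found; check the BAM tags and run options")
        return False
    logging.debug(f"Example of barcode: {valid_bcs_list[0]} and cell_id: {valid_cellid_list[0]}")
    return True


# Reproduce the failure mode from the traceback: indexing an empty list.
empty = []
try:
    empty[0]
except IndexError as err:
    print(f"IndexError: {err}")
```

The point of the sketch is only that the crash is a symptom: the real problem is upstream, where every read was skipped and no cell was counted.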

vagnec commented 6 years ago

The technique used was "BD™ Precise Whole-Transcriptome Analysis", which is no longer marketed. It has UMIs.

Actually, I think I found my mistake. I used UMI-tools, which appends the UMI to the read name rather than storing it in a tag.

Example: K00201:191:HNVFNBBXX:1:1225:27600:17034_GTGAGTGA 272 chr1 3329451 1 48M12S * 0 0 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAATCACTCACAGCACGGTTA A<-FF<FAAA7-7--J<F-JFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFFFAA NH:i:3 HI:i:3 NM:i:1 MD:Z:45G2 AS:i:45 nM:i:1 jM:B:c,-1 jI:B:i,-1

I just saw in the tutorial:

[The bam file will have to] contain an error corrected molecular barcodes as a TAG named UB or XM.

So I will correct that and retry.
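
For anyone in the same situation, here is a minimal sketch of that correction, assuming the UMI is the last `_`-separated suffix of the read name as in the example record above. It works on SAM text lines with the standard library only; the function names `add_umi_tag` and `rewrite_stream` are hypothetical, and the sketch copies the UMI as-is without any error correction.

```python
def add_umi_tag(sam_line, tag="XM", sep="_"):
    """Copy the UMI suffix of the read name into a SAM tag, e.g. XM:Z:GTGAGTGA.

    Assumption: the read name ends with '<sep><UMI>'. Header lines (starting
    with '@') are returned unchanged, minus the trailing newline.
    """
    line = sam_line.rstrip("\n")
    if line.startswith("@"):
        return line
    fields = line.split("\t")
    qname = fields[0]
    _, found, umi = qname.rpartition(sep)
    if not found or not umi:
        raise ValueError(f"no UMI suffix in read name: {qname!r}")
    # Drop any existing copy of the tag so it is not duplicated.
    optional = [f for f in fields[11:] if not f.startswith(tag + ":")]
    return "\t".join(fields[:11] + optional + [f"{tag}:Z:{umi}"])


def rewrite_stream(lines, tag="XM", sep="_"):
    """Yield every line of a SAM stream with the UMI tag added to the records."""
    for line in lines:
        yield add_umi_tag(line, tag=tag, sep=sep)


# Example with the read name from this thread (sequence and quality shortened).
record = "\t".join([
    "K00201:191:HNVFNBBXX:1:1225:27600:17034_GTGAGTGA", "272", "chr1", "3329451",
    "1", "48M12S", "*", "0", "0", "ACGT", "FFFF", "NH:i:3",
])
print(add_umi_tag(record).split("\t")[-1])
```

One way to use it would be to feed `rewrite_stream` the text output of `samtools view -h` and convert the result back to BAM with `samtools view -b`. If a cell barcode is also encoded in the read name, it would need its own handling, which this sketch does not attempt.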

Thank you very much for your help!

vagnec commented 6 years ago

Hi,

I put the UMI in the XM tag. Here is how my bam files look:

K00201:191:HNVFNBBXX:1:1225:27600:17034_GTGAGTGA        272     chr1    3329451 1       48M12S  *       0       0     AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAATCACTCACAGCACGGTTA     A<-FF<FAAA7-7--J<F-JFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFFFAA   NH:i:3  HI:i:3  NM:i:1  MD:Z:45G2       AS:i:45 nM:i:1  jM:B:c,-1       jI:B:i,-1       UG:i:0  XM:Z:GTGAGTGA
K00201:191:HNVFNBBXX:3:1215:10612:16155_TCTGGTTG        0       chr1    3736437 255     79M21S  *       0       0     TTCCACGAATCCAGCCCTTCAAAGGATAACACCAGAAAAAAAAAAATACAAGGACAGAAACCACGACCTAGAAAAAGCAGTGGTATCAACGCAGAGTACA     <<<-FF-FJ<J<JFJFJAAFJJJJJJJJJJJFFJJJFFFFJJJJJJFJJJF<JFJFJJFJJFFFJJJFJFJFJJJFAFJF7FJJJJJJJA7A<FJ<7FFF   NH:i:1  HI:i:1  NM:i:0  MD:Z:79  AS:i:77 nM:i:0  jM:B:c,-1       jI:B:i,-1       UG:i:1  XM:Z:TCTGGTTG
K00201:191:HNVFNBBXX:3:2222:19441:34653_TCTGGTTG        0       chr1    3736437 255     79M21S  *       0       0     TTCCACGAATCCAGCCCTTCAAAGGATAACACCAGAAAAAAAAAAATACAAGGACAGAAACCACGACCTAGAAAAAGCAGTGGTATCAACGCAGAGTACA     A<77FF<JJJJFJJJFJFFFFJFFJJJJJJJJFJJJJJJJJJJJJJFJJJJFJJJJJJJJJJJF7FJJFJJJJJJJAJFFFJFJJJJJJJJJJJJAAFFF   NH:i:1  HI:i:1  NM:i:0  MD:Z:79  AS:i:77 nM:i:0  jM:B:c,-1       jI:B:i,-1       UG:i:1  XM:Z:TCTGGTTG

Unfortunately, the UMIs are still not recognized:

2018-07-25 09:59:17,265 - DEBUG - Reading /ifs/illumina/vagnec/Analyses/test_velocyto/bamfiles_test_XM/LNMA01_XM.pos.bam
2018-07-25 09:59:17,267 - DEBUG - Read first 0 million reads
2018-07-25 09:59:17,459 - DEBUG - Counting for batch 1, containing 0 cells and 0 reads
2018-07-25 09:59:17,459 - DEBUG - 0 reads not considered because fully enclosed in repeat masked regions
2018-07-25 09:59:17,459 - WARNING - The barcode selection mode is off, no cell events will be identified by <80 counts
2018-07-25 09:59:17,460 - WARNING - 0 of the barcodes where without cell
2018-07-25 09:59:17,460 - DEBUG - Reading /ifs/illumina/vagnec/Analyses/test_velocyto/bamfiles_test_XM/LNMA02_XM.pos.bam
2018-07-25 09:59:17,461 - DEBUG - Read first 0 million reads
2018-07-25 09:59:17,465 - DEBUG - Counting for batch 2, containing 0 cells and 0 reads
2018-07-25 09:59:17,465 - DEBUG - 0 reads not considered because fully enclosed in repeat masked regions
2018-07-25 09:59:17,466 - WARNING - The barcode selection mode is off, no cell events will be identified by <80 counts
2018-07-25 09:59:17,466 - WARNING - 0 of the barcodes where without cell
2018-07-25 09:59:17,466 - DEBUG - 13753 reads were skipped because no apropiate cell or umi barcode was found
2018-07-25 09:59:17,466 - DEBUG - Counting done!
Traceback (most recent call last):
  File "/ifs/illumina/vagnec/Analyses/software/miniconda/envs/velocyto/bin/velocyto", line 11, in <module>
    load_entry_point('velocyto==0.17.8', 'console_scripts', 'velocyto')()
  File "/ifs/illumina/vagnec/Analyses/software/miniconda/envs/velocyto/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/ifs/illumina/vagnec/Analyses/software/miniconda/envs/velocyto/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/ifs/illumina/vagnec/Analyses/software/miniconda/envs/velocyto/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/ifs/illumina/vagnec/Analyses/software/miniconda/envs/velocyto/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/ifs/illumina/vagnec/Analyses/software/miniconda/envs/velocyto/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/ifs/illumina/vagnec/Analyses/software/miniconda/envs/velocyto/lib/python3.6/site-packages/velocyto/commands/run.py", line 113, in run
    samtools_memory=samtools_memory, dump=dump, verbose=verbose, additional_ca=additional_ca)
  File "/ifs/illumina/vagnec/Analyses/software/miniconda/envs/velocyto/lib/python3.6/site-packages/velocyto/commands/_run.py", line 236, in _run
    logging.debug(f"Example of barcode: {valid_bcs_list[0]} and cell_id: {valid_cellid_list[0]}")
IndexError: list index out of range

I launched the same command with the option "--without-umi" and the job terminated successfully. So I really think the problem is with the recognition of UMIs.

Thank you in advance for your help

PS : my command: velocyto run --onefilepercell -o output2 bamfiles_test_XM/*XM.pos.bam /ifs/illumina/share/Utilities/Genomes/Mus_musculus/mm10/Annotations/Ensembl/Mus_musculus.GRCm38.90_UCSConlychr.gtf
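
A quick way to double-check what the reads actually carry is to count the optional tags on the SAM records. The sketch below is one way to do that with the standard library, assuming tab-separated SAM text as input (for instance from `samtools view`). The function names `tag_names` and `count_tags` are hypothetical, and the default tag list only covers the UB/XM names quoted from the tutorial.

```python
from collections import Counter


def tag_names(sam_line):
    """Return the set of optional tag names (e.g. {'NH', 'XM'}) on one SAM record."""
    fields = sam_line.rstrip("\n").split("\t")
    # Optional fields start at column 12 and look like TAG:TYPE:VALUE.
    return {f.split(":", 1)[0] for f in fields[11:] if f}


def count_tags(sam_lines, wanted=("UB", "XM")):
    """Count the records seen and how many of them carry each wanted tag."""
    counts = Counter()
    total = 0
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and blank lines
        total += 1
        present = tag_names(line)
        for tag in wanted:
            if tag in present:
                counts[tag] += 1
    return total, counts


# Toy records (sequence and quality shortened for illustration).
records = [
    "\t".join(["read1_GTGAGTGA", "272", "chr1", "3329451", "1", "48M12S", "*",
               "0", "0", "ACGT", "FFFF", "NH:i:3", "XM:Z:GTGAGTGA"]),
    "\t".join(["read2_TCTGGTTG", "0", "chr1", "3736437", "255", "79M21S", "*",
               "0", "0", "ACGT", "FFFF", "NH:i:1"]),
]
total, counts = count_tags(records)
print(total, dict(counts))
```

If every record reports the tag and velocyto still skips all reads, as in the log above, the input files are probably not the cause.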

gioelelm commented 6 years ago

I think you stumbled upon an actual bug. I fixed it in 0.17.9; however, I had some problems uploading it to PyPI. I will solve that within 24h, but for now you will have to install from source.

gioelelm commented 6 years ago

Please close the issue if the bug is solved; otherwise, let me know.

vagnec commented 6 years ago

Hi gioelelm, yes, it worked with the new version. Thank you!

annajbott commented 5 years ago

Hi, I am using UMI-tools (/Alevin with no quant output), and my UMIs and CBs are located in the read name too; I need them as tags for velocyto. Did you use a tool to add the barcodes as tags? I've been reading a lot of documentation but can't seem to find anything that does this efficiently (I might not have been looking very well, though, haha). I'm looking for an approach that is generalisable and not too slow. Thanks!

Anna

massonix commented 4 years ago

Hi,

I'm getting exactly the same error but with 10X data

Genki-YAN commented 3 years ago

Hi, I have the same error with 10X data too. Have you solved the problem yet? Can you give me some advice?