timoast / sinto

Tools for single-cell data processing
https://timoast.github.io/sinto/
MIT License

error parsing barcode from middle of read name with nametotag #46

Closed jkgrenier closed 2 years ago

jkgrenier commented 2 years ago

Hi @timoast,

Thanks for adding nametotag; this is very helpful for my dataset (linked reads sharing the same molecular barcode). I currently have mapped reads in a BAM file with the barcode embedded in the read header, in the same format as issue #32, where the barcode is not at the beginning of the read name. Example (barcode = CATTTGGCCTCGAATCGCGTCGGTGCGGTAACACTC):

A00564:478:HG5NJDSX3:1:2556:3992:34867_CATTTGGCCTCGAATCGCGTCGGTGCGGTAACACTC_GAACGACTACCACAG

You provided the regex to use: --barcoderegex "(?<=)(.*)(?=_)" which works for me with sinto fragments, however I get an error with sinto nametotag:

Traceback (most recent call last):
  File "/programs/sinto-0.8.0/bin/sinto", line 8, in <module>
    sys.exit(main())
  File "/programs/sinto-0.8.0/lib/python3.9/site-packages/sinto/arguments.py", line 457, in main
    options.func(options)
  File "/programs/sinto-0.8.0/lib/python3.9/site-packages/sinto/utils.py", line 23, in wrapper
    func(args)
  File "/programs/sinto-0.8.0/lib/python3.9/site-packages/sinto/cli.py", line 109, in run_nametotag
    tagtoname.move(
  File "/programs/sinto-0.8.0/lib/python3.9/site-packages/sinto/tagtoname.py", line 51, in move
    cell_barcode = re_match.group()
AttributeError: 'NoneType' object has no attribute 'group'
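
For readers following along, here is a minimal Python sketch of how a regex can pull the barcode out of the middle of the example read name, and of one general way an error like "'NoneType' object has no attribute 'group'" can arise. It makes two assumptions: the pattern is written with an underscore inside the lookbehind, `(?<=_)(.*)(?=_)`, which may differ from the regex as quoted above, and the contrast between `re.search` and `re.match` is shown purely as an illustration of Python's `re` module. It is not a statement about what sinto does internally.

```python
import re

# Read name taken from the example above.
read_name = (
    "A00564:478:HG5NJDSX3:1:2556:3992:34867_"
    "CATTTGGCCTCGAATCGCGTCGGTGCGGTAACACTC_GAACGACTACCACAG"
)

# Assumed pattern: text preceded by "_" and followed by "_".
pattern = re.compile(r"(?<=_)(.*)(?=_)")

# re.search scans the whole string, so it can start right after the
# first underscore and stop before the last one.
searched = pattern.search(read_name)
barcode = searched.group() if searched is not None else None

# re.match only tries position 0. Nothing precedes position 0, so the
# lookbehind fails and the result is None; calling .group() on that
# None would raise AttributeError.
matched = pattern.match(read_name)

print(barcode)
print(matched)
```

Whichever function is used, checking the result for None before calling `.group()` turns a crash into a case that can be reported or skipped.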

I get a similar error for filterbarcodes using the same regex and input bam file.

I'm running sinto v0.8.0.

Thanks for your help!

timoast commented 2 years ago

Hi @jkgrenier, I think I know what the issue was. Can you try installing from the develop branch and see if that fixes the issue?

jkgrenier commented 2 years ago

Will do (tomorrow)! Thanks for the quick response!

timoast commented 2 years ago

Hi @jkgrenier, any update on this?

jkgrenier commented 2 years ago

Yes! Sorry for the slow reply. After installing from the development branch, 'nametotag' ran with no errors. That provided the solution I needed! However, I also tried 'filterbarcodes' which still threw an error. With the 'nametotag' solution, I can now filter on the tag instead of on readname(+regex), but you may want to look into that.
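
The workaround described here, filtering on a tag instead of on the read name, can be illustrated on plain SAM text records. This is only a rough sketch of the idea, not sinto's implementation: the helper names `name_to_tag` and `has_barcode` are hypothetical, the `CB` tag name is an assumption, and the regex assumes the barcode sits between the two underscores of the read name.

```python
import re

# Assumed pattern: the barcode sits between the underscores in the read name.
BARCODE_RE = re.compile(r"(?<=_)(.*)(?=_)")

def name_to_tag(sam_line, tag="CB"):
    """Append the barcode found in the read name as a TAG:Z:value field (hypothetical helper)."""
    fields = sam_line.rstrip("\n").split("\t")
    found = BARCODE_RE.search(fields[0])
    if found is None:
        # No barcode in the name: return the record unchanged.
        return "\t".join(fields)
    return "\t".join(fields + [f"{tag}:Z:{found.group()}"])

def has_barcode(sam_line, wanted, tag="CB"):
    """Return True if the record carries the tag and its value is in `wanted` (hypothetical helper)."""
    prefix = f"{tag}:Z:"
    # SAM records have 11 mandatory fields; optional tags follow them.
    for field in sam_line.rstrip("\n").split("\t")[11:]:
        if field.startswith(prefix):
            return field[len(prefix):] in wanted
    return False

# A made-up SAM record that reuses the read name from this issue.
record = "\t".join([
    "A00564:478:HG5NJDSX3:1:2556:3992:34867_"
    "CATTTGGCCTCGAATCGCGTCGGTGCGGTAACACTC_GAACGACTACCACAG",
    "0", "chr1", "100", "60", "4M", "*", "0", "0", "ACGT", "FFFF",
])
wanted = {"CATTTGGCCTCGAATCGCGTCGGTGCGGTAACACTC"}

tagged = name_to_tag(record)
keep = has_barcode(tagged, wanted)
```

Once the barcode lives in a tag, the filter is a plain string comparison and no longer depends on the regex matching the read name.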

Thank you!

timoast commented 2 years ago

Got it, thanks. Should be fixed now.