shimlab / BLAZE

SingleCell Nanopore sequencing data analysis
GNU General Public License v3.0

Support for 5' 10X kits #13

Open EdGreen21 opened 7 months ago

EdGreen21 commented 7 months ago

Interesting project! We were looking at available long-read tools that might be applicable to 10X 5' VDJ data. BLAZE looks like it could be a good starting point: a modified version of the find_adaptor code could be reused to identify the primer sequences used to enrich the VDJ sequences. This would not be a general solution for 5' 10X data, of course.
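As a rough illustration of the primer-identification idea, here is a minimal sketch that scans a read for the best approximate match to a primer using a sliding-window Hamming distance. It is not BLAZE's find_adaptor code: the function name `find_primer`, the primer sequence, and the mismatch threshold are all made up for illustration. Nanopore reads also contain insertions and deletions, so a real implementation would likely need an edit-distance aligner rather than plain mismatch counting.

```python
def find_primer(read: str, primer: str, max_mismatches: int = 2):
    """Return (start, mismatches) for the best primer hit in `read`, or None.

    Hypothetical helper: compares the primer against every window of the
    read and keeps the window with the fewest substitutions.
    """
    best = None
    for start in range(len(read) - len(primer) + 1):
        window = read[start:start + len(primer)]
        # Count positions where the window and the primer disagree.
        mismatches = sum(a != b for a, b in zip(window, primer))
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (start, mismatches)
    return best


# Hypothetical primer and read, invented for this example only.
PRIMER = "ACGTTGCA"
read = "TTTT" + "ACGTAGCA" + "GGGG"  # primer copy carrying one substitution

hit = find_primer(read, PRIMER)
print(hit)
```

In this toy read the primer copy starts at offset 4 and differs from the reference primer at one base, so the hit is reported there; a read with no window within the threshold returns `None`.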

atrull314 commented 4 months ago

Hi @youyupei, thanks so much for the tool! We are currently using BLAZE as part of our nf-core single-cell nanopore pipeline here: https://github.com/nf-core/scnanoseq/tree/dev

We are also looking for 5' 10X support to broaden the range of kits we support. We've forked this repo and begun adding 5' support in that fork. Would you be interested in us submitting a PR here for review? Alternatively, if you would prefer to review in the forked repo itself, I can add you.

Thanks!

youyupei commented 4 months ago

Hi @atrull314, thanks for your interest in BLAZE. It would be great if you could submit a PR for this.

youyupei commented 3 months ago

Thanks, @atrull314. I have merged your PR into a new branch, support-5prim-kit.

avilella commented 1 month ago

I tried this with data from a run a few days ago. It seems to work. Below are the results for 2 different basecalling methods, using this command:

blaze --kit-version 5v3 --expect-cells 10000 --threads 32 $PWD

A) Basecalling from MinKNOW: 61.25% of all reads had unambiguous polyT and adapter positions found.
B) Basecalling with the latest dorado sup: 74.44% of all reads had unambiguous polyT and adapter positions found.

So basecalling with dorado sup yields about 13 percentage points more reads with unambiguous polyT and adapter positions on this 5' GEX dataset.
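For reference, the comparison can be worked out in a few lines. This is just arithmetic on the two percentages reported above; the function and variable names are my own.

```python
def basecaller_gain(baseline_pct: float, improved_pct: float):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    points = improved_pct - baseline_pct
    relative = 100.0 * points / baseline_pct
    return points, relative


minknow_pct = 61.25  # reads with unambiguous polyT and adapter, MinKNOW basecalls
dorado_pct = 74.44   # same metric, dorado sup basecalls

points, relative = basecaller_gain(minknow_pct, dorado_pct)
print(f"absolute gain: {points:.2f} percentage points")
print(f"relative gain: {relative:.1f}% more usable reads")
```

The absolute difference is 13.19 percentage points, which corresponds to roughly 21.5% more usable reads relative to the MinKNOW basecalls.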

Thank you all involved in BLAZE and supporting the 5' 10X kits.