DavidsonGroup / flexiplex

The Flexible Demultiplexer
https://davidsongroup.github.io/flexiplex/
MIT License

UMI before barcode #20

Closed yxsee closed 8 months ago

yxsee commented 10 months ago

Hi @nadiadavidson, in my single-cell long-read dataset (Split-seq), the UMI sequence comes before the barcode (as shown below). Is it possible to specify the orientation of the barcode and UMI between the flanking sequences? I tried using the reverse complements of the flanking and barcode sequences and was able to extract the UMI and barcode sequences, but with -i true my read is trimmed instead of the barcode. [image]

nadiadavidson commented 10 months ago

Hi @yxsee,

Many thanks for your question. We're quite keen to get flexiplex to work with split-seq, so this is very useful. I think the simplest solution will be for us to build in an option to trim off the sequence in the opposite direction, so you can run it with the reverse complements of the barcode and flanking sequences like you tried. If you have any test data you would like to share to help us develop this option (and update our documentation for split-seq), please feel free to post it here or email it to me (address in the paper pre-print).

Cheers, Nadia.
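For anyone trying the reverse-complement workaround discussed above, here is a minimal Python sketch of how the flanking and barcode sequences could be reverse-complemented before passing them to flexiplex. The helper name `reverse_complement` and the example sequence are illustrative assumptions and are not part of flexiplex.

```python
# Hypothetical helper (not part of flexiplex): reverse-complement a DNA sequence.
# Handles A, C, G, T and N in upper or lower case; other characters are left unchanged.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Placeholder sequence, standing in for a real flank or barcode.
    print(reverse_complement("AACG"))
```

The same function would be applied to each flank and to every barcode in the known-barcode list, assuming the list is used.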

nadiadavidson commented 8 months ago

Just an update about this issue, @yxsee.

We are working on a new version of flexiplex (thanks to @ChangqingW) which should allow more complex structures of the barcode region. This code is now in the repository (main) if you would like to clone it and test it out. The code is still in development, so please report any issues or bugs you come across.

With the new version you can pass the pattern like this: -x "[left flank]" -u ??? -b ???? -x "[right flank]", where you repeat "?" once for each base you expect in the UMI and in the barcode. Note that the order of these input parameters now matters!
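As a rough illustration of the pattern described above, the following Python sketch assembles the -x / -u / -b / -x arguments in UMI-before-barcode order. Only the flag names and the "?" placeholder convention come from this thread; the function name `build_pattern_args`, the flank sequences, and the UMI and barcode lengths are made-up placeholders, and any other arguments flexiplex needs (such as the input reads) are left out.

```python
import shlex


def build_pattern_args(left_flank: str, umi_len: int, barcode_len: int,
                       right_flank: str) -> list[str]:
    """Build pattern arguments for a read layout of flank, UMI, barcode, flank.

    Hypothetical helper: the order of the returned arguments mirrors the
    order of the elements in the read, since argument order matters.
    """
    if umi_len <= 0 or barcode_len <= 0:
        raise ValueError("UMI and barcode lengths must be positive")
    return [
        "-x", left_flank,          # left flanking sequence
        "-u", "?" * umi_len,       # one "?" per expected UMI base
        "-b", "?" * barcode_len,   # one "?" per expected barcode base
        "-x", right_flank,         # right flanking sequence
    ]


# Placeholder flanks and lengths, chosen only for illustration.
args = build_pattern_args("ACGTACGT", 10, 8, "TTTTGGGG")

if __name__ == "__main__":
    # Print a shell-quoted command fragment; nothing is executed here.
    print(shlex.join(["flexiplex"] + args))
```

Swapping the "-u" and "-b" entries in the list would describe a barcode-before-UMI layout instead.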