alexdobin / STAR

RNA-seq aligner
MIT License

STARsolo-BD Rhapsody compatibility? #1111

Open wajm opened 3 years ago

wajm commented 3 years ago

Hi, is it possible to use BD Rhapsody UMIs with STARsolo? I'm confused about how to use the --soloType CB_UMI_Complex option. Or is there another option for this? https://teichlab.github.io/scg_lib_structs/methods_html/BD_Rhapsody.html

fderop commented 3 years ago

Hi Wajm, I have no experience with BD Rhapsody libraries, but I do have experience with CB_UMI_Complex. For BD Rhapsody, I would suggest these options:

        --soloCBwhitelist "${whitelist_part1_filename}" "${whitelist_part2_filename}" "${whitelist_part3_filename}" \
        --soloType CB_UMI_Complex \
        --soloUMIlen 8 \
        --soloCBposition 0_0_0_8 0_21_0_29 0_43_0_51 \
        --soloUMIposition 0_52_0_59 \

I would highly recommend reading the STAR manual, especially page 56, where the complex CB position format is explained. It has worked well for me so far. You can supply the 3 whitelists as text files.
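
The position strings above follow STARsolo's `startAnchor_startPosition_endAnchor_endPosition` format, where anchor 0 is the read start and both positions are 0-based and inclusive. Below is a minimal Python sketch of that arithmetic. It assumes the segment lengths implied by the suggested positions (9-base barcode segments, linkers of 12 and 13 bases, an 8-base UMI); the names `LAYOUT` and `solo_positions` are hypothetical helpers, not part of STAR.

```python
# Segment layout of the barcode read, in order, as (name, length in bases).
# ASSUMPTION: the linker lengths (12 and 13) are inferred from the gaps
# between the positions suggested above, not taken from BD documentation.
LAYOUT = [
    ("CB1", 9),
    ("linker1", 12),
    ("CB2", 9),
    ("linker2", 13),
    ("CB3", 9),
    ("UMI", 8),
]


def solo_positions(layout):
    """Map each segment name to a '0_start_0_end' string (anchor 0 = read start)."""
    positions = {}
    start = 0
    for name, length in layout:
        end = start + length - 1  # the end position is inclusive
        positions[name] = f"0_{start}_0_{end}"
        start += length
    return positions


pos = solo_positions(LAYOUT)
print("--soloCBposition", " ".join(pos[n] for n in ("CB1", "CB2", "CB3")))
print("--soloUMIposition", pos["UMI"])
```

Changing the lengths in `LAYOUT` is enough to recompute the strings for a different bead design, as long as every segment has a fixed length.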

wajm commented 3 years ago

Thank you so much. Using multiple whitelists and multiple position values was the key.

GlancerZ commented 2 years ago

Is CBposition different for the paired BD Rhapsody?

curtisd0886 commented 2 years ago

Hello, I have been mapping Rhapsody WTA libraries using STARsolo with great success using the suggestions above. However, BD recently changed the structure of the bead barcodes, which complicates things a little. They have added staggered nucleotides (see below) at the start of the first barcode. This shifts the reads by 0-4 nucleotides to increase diversity and reduce the amount of PhiX needed, but it also makes it harder to find the barcodes by base counting alone. Any ideas on how we could use STARsolo to continue mapping these new types of CBs?

[Image: Screen Shot 2022-07-22 at 9 25 20 AM]

wajm commented 2 years ago

Dear curtisd0886, I contacted BD Genomics, but they still have not shared any information about the enhanced capture beads. Where did you get the image?

curtisd0886 commented 2 years ago

Hey @wajm,

BD shared the file with me. I made it clear that I need to do custom mapping.

wajm commented 2 years ago

@curtisd0886, I'm not familiar with the TCR/BCR handle or the diversity insert. If the inserts vary in length, we cannot specify fixed positions with --soloCBposition.

alexdobin commented 2 years ago

Hi @wajm and @curtisd0886

I think there are a couple of approaches here:

  1. If CLS1 barcode does not start with A, GT, or TCA (the random bases), then these sequences can be trimmed before mapping with STARsolo.
  2. The constant linkers GTGA and GACA, separated by 9 bases, can be used as an anchor to find the CB/UMI start/end positions. However, in principle, this 8-base combination may arise at other locations by chance. If you want to try this option, I can suggest the parameters.

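
The second approach can be sketched outside STAR as a small pre-processing step. The Python below is a minimal illustration, not a STARsolo feature. It searches for GTGA, any 9 bases, then GACA, and rejects chance matches by requiring the first barcode segment to start within the first few bases of the read. The 9-base segment length and 8-base UMI length are assumptions taken from the bead layout described elsewhere in this thread, and `extract_cb_umi` is a hypothetical name.

```python
import re

# Lookahead so that overlapping candidate anchors are all examined.
ANCHOR = re.compile(r"(?=GTGA([ACGTN]{9})GACA)")
ANCHOR_LEN = 4 + 9 + 4  # linker + middle barcode segment + linker

CB_LEN = 9   # ASSUMPTION: length of each barcode segment
UMI_LEN = 8  # ASSUMPTION: UMI length


def extract_cb_umi(read, max_offset=3):
    """Return (cb1, cb2, cb3, umi), or None if no plausible anchor is found.

    max_offset is the longest leading insert tolerated before the first segment.
    """
    for m in ANCHOR.finditer(read):
        linker1_start = m.start()
        cb1_start = linker1_start - CB_LEN
        if not 0 <= cb1_start <= max_offset:
            continue  # anchor-like sequence in the wrong place
        cb3_start = linker1_start + ANCHOR_LEN
        umi_end = cb3_start + CB_LEN + UMI_LEN
        if umi_end > len(read):
            continue  # read too short to hold the last segment and the UMI
        return (
            read[cb1_start:linker1_start],
            m.group(1),
            read[cb3_start:cb3_start + CB_LEN],
            read[cb3_start + CB_LEN:umi_end],
        )
    return None
```

This requires exact linker matches, so reads with a sequencing error in either linker are dropped; whether that loss is acceptable would need checking on real data.
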
wajm commented 2 years ago

Hi @alexdobin and @curtisd0886,

I got the information about the BD enhanced capture beads. Following curtisd0886's image, the structure is:

bead | universal oligo (20bp) + none/A/GT/TCA | CLS1 (9bp) | Linker1 (4bp) | CLS2 (9bp) | Linker2 (4bp) | CLS3 (9bp) | UMI (8bp) | 25dT

I'm still curious how to deal with the four different oligo combinations in front of CLS1 (universal oligo (20bp) + none/A/GT/TCA). Is there a recommended trimming tool or STARsolo option?

Thanks a lot.
JungMo Kim
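
One way to handle the four possible prefixes is to trim them before mapping, so that every read has the same fixed layout. The Python sketch below is only an illustration under the layout described above, and it assumes the barcode read begins with the diversity insert. It tries each known insert and accepts it only if both linkers then sit at the expected offsets. `trim_diversity_insert` is a hypothetical helper, and FASTQ reading and writing are left out.

```python
# ASSUMPTION: layout after the diversity insert, as described above:
# CLS1 (9) | GTGA | CLS2 (9) | GACA | CLS3 (9) | UMI (8)
INSERTS = ("", "A", "GT", "TCA")
LINKER1, LINKER2 = "GTGA", "GACA"
SEG_LEN = 9  # length of each CLS segment


def trim_diversity_insert(seq, qual):
    """Return (seq, qual) without the leading insert, or None if no insert fits.

    An insert fits when the read starts with it and both linkers are found
    exactly where the layout predicts for that insert length.
    """
    for insert in INSERTS:
        offset = len(insert)
        l1 = offset + SEG_LEN               # expected start of linker 1
        l2 = l1 + len(LINKER1) + SEG_LEN    # expected start of linker 2
        if (
            seq.startswith(insert)
            and seq[l1:l1 + len(LINKER1)] == LINKER1
            and seq[l2:l2 + len(LINKER2)] == LINKER2
        ):
            # Trim bases and quality values together so they stay in sync.
            return seq[offset:], qual[offset:]
    return None
```

If this layout is right, trimmed reads would have the barcode segments at 0_0_0_8, 0_13_0_21 and 0_26_0_34 and the UMI at 0_35_0_42, which could then be passed to --soloCBposition and --soloUMIposition. Checking the linker positions also sidesteps the case where CLS1 itself starts with A, GT, or TCA.
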

alexdobin commented 2 years ago

Hi JungMo,

in addition to my suggestions above, you can try the following approach: https://github.com/alexdobin/STAR/issues/1607#issuecomment-1195617236

Cheers Alex

NormaW commented 1 year ago

Hello @curtisd0886, hello all,

Since you were successful with the v1 beads, could you give me a hint about the whitelists? How did you make them? Thank you very much in advance!

Cheers Norma