Low diversity - Githubissues

NikBak123 commented 4 months ago

Good morning,

We have been using the Clonmapper protocol and we are at the stage of testing our barcode diversity after electroporation. We did a 138M reads sequencing and unfortunately our diversity was very low (~500,000). Our transformation efficiency after the electroporation was low too, which gave us an indication that we'd have low diversity, but we further confirmed this with Illumina sequencing. We have tried multiple times electroporating and have never got the efficiency over 10^6 and we have been using the ElectroMAX Stbl4 competent cells.

First, what transformation efficiency do you consider good after the electroporation? Do you have any advice on what might be going wrong with our experiment's efficiency? I saw that you recommended the Endura electrocompetent cells here, so we might try changing to these, however I'm not sure this would be the sole change to increase our efficiency.

Additionally, we seem to be losing a lot of DNA during the gel extraction (~60% loss). Is this something you encountered? Any ideas on how we can improve this would be very helpful.

We really appreciate any help and advice :)

Thank you, Nikolina

daylinmorgan commented 3 months ago

Sorry to hear you are struggling with library diversity! See below for some possible suggestions and follow-ups.

We have been using the Clonmapper protocol and we are at the stage of testing our barcode diversity after electroporation. We did a 138M reads sequencing and unfortunately our diversity was very low (~500,000).

This sequencing was performed prior to electroporation? Was the backbone fully digested and gel-extracted?

We have tried multiple times electroporating and have never got the efficiency over 10^6 and we have been using the ElectroMAX Stbl4 competent cells.

Have you tested your electroporation with a control plasmid to rule out this being an issue?

First, what transformation efficiency do you consider good after the electroporation?

A typical transformation efficiency we have seen is ~0.2e9 cfu/ug.
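For comparing your own plates against that figure, here is a minimal sketch of the usual cfu/ug calculation. The function name, parameters and example numbers are hypothetical, not taken from the ClonMapper protocol. It assumes you count colonies on a dilution plate and know how much of the recovery culture was plated.

```python
def transformation_efficiency(colonies, dna_ug, dilution_factor=1.0, fraction_plated=1.0):
    """Return colony-forming units per microgram of DNA.

    colonies        -- colonies counted on the plate
    dna_ug          -- micrograms of DNA added to the electroporation
    dilution_factor -- fold dilution of the recovery culture before plating
    fraction_plated -- fraction of the (diluted) culture volume that was plated
    """
    if dna_ug <= 0 or fraction_plated <= 0:
        raise ValueError("dna_ug and fraction_plated must be positive")
    # Scale the plate count back up to the whole recovery culture.
    total_cfu = colonies * dilution_factor / fraction_plated
    return total_cfu / dna_ug


# Hypothetical plate: 200 colonies from a 1:10,000 dilution,
# plating one tenth of the volume, with 100 ng (0.1 ug) of DNA.
efficiency = transformation_efficiency(200, 0.1, dilution_factor=1e4, fraction_plated=0.1)
print(f"{efficiency:.1e} cfu/ug")
```

With these made-up numbers the result is about 2e8 cfu/ug, the same order as the value quoted above.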

Additionally, we seem to be losing a lot of DNA during the gel extraction (~60% loss). Is this something you encountered?

We usually see loss of DNA in the gel extraction as well. I would ensure the gel is fully dissolved, pre-heat the elution buffer to 50 °C, and incubate the elution buffer on the column for 5-10 minutes.

Have you performed digestion following the golden gate assembly to confirm insertion of the barcodes?

NikBak123 commented 3 months ago

Hi Daylin,

Thank you so much for taking the time to reply : ) Please find the answers to your questions below:

This sequencing was performed prior to electroporation? Was the backbone fully digested and gel-extracted?

The sequencing was performed after electroporation and after the backbone was digested and gel extracted. Assuming that after electroporation only undigested or barcoded plasmids can transform the bacteria, before sending the pool for sequencing, we used the MfeI restriction enzyme to get an idea of whether the barcode-plasmid assembly had been successful. The MfeI enzyme recognises 3 restriction sites on the Crop-seq plasmid, but one of them is on the small fragment that gets discarded during gel extraction, meaning that if we had a successful assembly we’d get 2 bands during gel electrophoresis and 3 bands if the plasmid had not been digested. This showed that our plasmid and barcodes had been assembled.

Have you tested your electroporation with a control plasmid to rule out this being an issue?

We have not and this is the first step I was going to test as I believe this is where the problem is for us. Not necessarily with the electroporation settings but maybe with the amount of plasmid we are adding to our bacteria.

Could I please ask how much plasmid DNA do you add to your bacteria for electroporation? We added the 5ul stated in the protocol and tried different amounts of plasmid (100ng, 150ng and 815ng) in 100ul of bacteria and found that the lowest performed better. However, we didn’t have enough plasmid for more runs so I will be restarting the process, but it would be very helpful to know an estimate while we’re optimising.

Lastly, when running the data analysis using Pycashier we were getting an error saying: ‘no barcodes passed the final length and abundance filters’. We haven’t found a way around it yet or worked out what might be causing it, so any insight would be much appreciated. I believe my colleague who was doing the analysis disregarded the error and we did get some results, but the error was there so I’m unsure how accurate they are. Also the representation of the barcodes we got was not good as we had half of them at 7-10k copies and the other half at 1-10 copies. I don’t know if this is something that could improve when our diversity improves.

Apologies for the really long message. Any tips or insight would be much appreciated. Thank you in advance!

Best wishes, Nikolina

daylinmorgan commented 3 months ago

Could I please ask how much plasmid DNA do you add to your bacteria for electroporation?

I'd like to offer more information here but I have not quantified the final amount of DNA in a typical reaction that is then used in the follow up transfection. I agree it is a good idea to try to optimize this for your cells/electroporator, though.

Lastly, when running the data analysis using Pycashier we were getting an error saying: ‘no barcodes passed the final length and abundance filters’.

This happens when sequencing highly diverse plasmid libraries. There are two final filter steps before counts are written to the outs/ (or --output) directory:

The length of sequences must be the provided length (default: 20) +/- the offset (default: 1)
They must be above a minimum abundance, by default 0.005% of the total number of reads.

So if you sequenced at a depth of 138M reads and most of those passed initial quality filters then barcodes would need to be found in more than 6900 reads, but with a diverse plasmid library we don't expect this to be the case.
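The arithmetic behind that threshold can be sketched as follows. The function name is hypothetical and this is not pycashier code; it simply turns the default 0.005% cutoff into a read count.

```python
def min_reads_for_abundance(total_reads, min_percent=0.005):
    """Reads a barcode needs to reach min_percent (a percentage) of total_reads."""
    return total_reads * min_percent / 100


total_reads = 138_000_000  # sequencing depth quoted in this thread
threshold = min_reads_for_abundance(total_reads)
print(f"threshold: {threshold:,.0f} reads per barcode")

# However the reads are distributed, only a limited number of barcodes
# can each hold at least min_percent of them: 100 / 0.005 = 20,000.
max_passing = int(round(total_reads / threshold))
print(f"at most {max_passing:,} barcodes can pass")
```

Because no more than about 20,000 barcodes can clear a 0.005% cutoff at any depth, a library expected to hold hundreds of thousands of barcodes or more will lose nearly all of them to this filter.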

Typically, when we sequence a high diversity library we perform no abundance filtering as we are far from sequencing saturation of the overall diversity. This means we pass --filter-count 0 so that no abundance filtering is performed on the raw barcode totals.
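To illustrate the effect of the two filters, here is a toy sketch in plain Python. It is an assumption-laden illustration of the logic described above, not pycashier's actual implementation: the function name, the `min_count` parameter and the exact comparison used are hypothetical.

```python
from collections import Counter


def filter_barcodes(counts, length=20, offset=1, min_count=0):
    """Keep barcodes within length +/- offset that have at least min_count reads."""
    return {
        barcode: n
        for barcode, n in counts.items()
        if abs(len(barcode) - length) <= offset and n >= min_count
    }


# Toy counts: one abundant barcode, two rare ones, one of the wrong length.
counts = Counter({"A" * 20: 9000, "C" * 20: 3, "G" * 19: 1, "T" * 15: 500})

strict = filter_barcodes(counts, min_count=6900)  # abundance cutoff applied
relaxed = filter_barcodes(counts, min_count=0)    # no abundance filtering

print(len(strict), "barcode(s) pass with the cutoff")
print(len(relaxed), "barcode(s) pass with a minimum count of 0")
```

In this toy case the abundance cutoff keeps only the single abundant barcode, while a minimum count of 0 keeps every barcode of acceptable length, which is the behaviour you want for a diverse plasmid library.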

Also the representation of the barcodes we got was not good as we had half of them at 7-10k copies and the other half at 1-10 copies.

You can see panels C and D from this supplementary figure for an idea of the typical results we get when sequencing a plasmid library. While some barcodes are detected at higher frequency (on the order of 1x10^4), most are detected in < 10 reads.

What is the total number of reads from your library after pycashier (with --filter-count 0)? If half of the barcodes are at ~7-10K reads and the other half at 1-10, I would expect you to have detected more than 500,000 barcodes, unless your sequencing was low quality and most reads were removed by the Phred 30 cutoff.
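One way to answer that question from a table of counts is sketched below. It assumes the counts have already been loaded into a plain mapping of barcode to read count; the output file layout is not shown in this thread, so loading is left out, and the function name and `low_cutoff` parameter are hypothetical.

```python
def summarize_counts(counts, low_cutoff=10):
    """Summarize a barcode -> read count mapping."""
    total_reads = sum(counts.values())
    unique_barcodes = len(counts)
    # Barcodes seen in fewer than low_cutoff reads.
    low = sum(1 for n in counts.values() if n < low_cutoff)
    return {
        "total_reads": total_reads,
        "unique_barcodes": unique_barcodes,
        "fraction_low": low / unique_barcodes if unique_barcodes else 0.0,
    }


# Toy data standing in for real counts.
toy_counts = {"bc1": 8000, "bc2": 7500, "bc3": 4, "bc4": 1}
summary = summarize_counts(toy_counts)
print(summary)
```

Comparing `total_reads` with the raw sequencing depth shows how many reads were lost before counting, and `fraction_low` shows how skewed the representation is.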

I plan to update the documentation and the warnings emitted by pycashier to help users catch the scenario where they are sequencing a plasmid library and should not be abundance filtering.

NikBak123 commented 3 months ago

Hi Daylin,

Thank you for your reply once again. The pycashier analysis gave us ~550,000 unique barcodes (length 20bp (+/-) and abundance filter 0), which is too low to continue with the viral transfection. The library was high quality so we didn't lose many reads because of that. At this point we are pretty sure that the main issue is the electroporation and the need for it to be optimised.

However, we are also interested in discussing with you any options of purchasing the assembled library from your lab, as it would really help us with meeting PhD and project deadlines. Please let me know if this is possible and if so could you please email me and my PI, Professor of Medical Oncology, Michelle Lockley, to discuss this further?

My email: n.bakali@qmul.ac.uk Prof. Michelle Lockley's email: m.lockley@qmul.ac.uk

Looking forward to your reply!

Kind regards, Nikolina

brocklab / clonmapper

Low diversity #3