NoahHenrikKleinschmidt / qpcr

A python package to analyse qPCR data for single-use or high-throughput application
GNU General Public License v3.0

General questions and suggestions #1

Closed csamuel11 closed 2 years ago

csamuel11 commented 2 years ago

Nice project! I have a few questions and suggestions that I hope you may find useful.

  1. Why is the title of your y-axis "ddCT", rather than something like "Normalized fold expression"? Are you plotting ddCT or 2^-ddCT?
  2. Could you change the bar graphs to dot plots (i.e., univariate scatterplots), as described here? Especially for small sample sizes (n<10 biological replicates per group), dot plots are nice since they show the data distribution. Alternatively, could you make an option for users to choose between bar graphs and dot plots?
  3. How are you calculating ddCT exactly? Are you using the Livak method, as described here? I am just wondering since there is a recent paper that gives a step-by-step explanation (see Figure 5) that new qPCR users might find helpful. Also, this method accommodates multiple reference genes and accounts for the different measurement scales between ddCT (continuous) and 2^-ddCT (ratio).
  4. Do you plan to add statistics to compare the user's groups?
NoahHenrikKleinschmidt commented 2 years ago

Hi Samuel,

Thanks a lot for your suggestions, that's really great :-)

About the Delta-Delta-Ct computation: yes, the module uses the Livak method for computing Delta-Delta-Cts, with the slight exception that the efficiency is incorporated via 2 x efficiency instead of ( 1 + efficiency ) as Livak et al. describe in their paper. It is important to note that the module by default incorporates the exponential term already at Delta-Ct computation, hence both dCt and ddCt values that are stored by qpcr are exponential terms. This is mentioned in the documentation, but I agree this should probably take more prominence.

About the labelling, you are totally right, a label such as “normalised fold change” would be way more appropriate. This has actually already been adapted for the next planned release.

The plot you suggest seems like a very useful visualisation. In fact, it is the kind of plot (dots overlaid on top of the bars) which the people from our lab usually produce using Graphpad Prism, ultimately. Given the plots were designed to be just previews, I so far didn’t really plan on trying to implement this, but since I now got feedback from you that this would be nice to have, I’ll gladly work on it. I’m not yet sure if it’ll be a stand-alone plot or an option to add to the current PreviewResults, probably the former.

Yes, some statistics are planned. I have thought of adding some evaluations like t-tests, but there are a few issues with pairing relevant / interesting groups (and/or assays for that matter). So far, I haven’t found a nice / automatable way to handle pairing groups that I’m happy with (I mean, letting users write dictionaries is always an option, but I’d prefer something smoother; do you have an idea?). So, I think it will be a while until stats are implemented, but they are on the to-do list.

Cheers, Noah
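To make the "exponential terms" point concrete, here is a minimal pure-Python sketch of a Livak-style calculation where the exponential is applied already at the Delta-Ct step and the base is 2 x efficiency. It does not use the qpcr API: the function names, the choice of the control-group mean as anchor, and the Ct values are all assumptions made up for illustration.

```python
from statistics import mean


def exp_delta_ct(cts, anchor, efficiency=1.0):
    """Exponential Delta-Ct: (2 * efficiency) ** -(Ct - anchor).

    efficiency = 1.0 is assumed to mean perfect doubling per cycle.
    """
    base = 2.0 * efficiency
    return [base ** -(ct - anchor) for ct in cts]


def fold_changes(assay, normaliser, control, efficiency=1.0):
    """Divide the assay's exponential dCt by the normaliser's, per group."""
    assay_anchor = mean(assay[control])
    norm_anchor = mean(normaliser[control])
    result = {}
    for group in assay:
        assay_dct = mean(exp_delta_ct(assay[group], assay_anchor, efficiency))
        norm_dct = mean(exp_delta_ct(normaliser[group], norm_anchor, efficiency))
        result[group] = assay_dct / norm_dct
    return result


# Hypothetical Ct values (not real measurements)
target = {"WT-": [24.0, 24.0], "WT+": [22.0, 22.0]}
reference = {"WT-": [18.0, 18.0], "WT+": [18.0, 18.0]}

fold = fold_changes(target, reference, control="WT-")
print(fold)
```

Because the values are already exponential, the final step is a division rather than a subtraction, and the control group ends up at a fold change of 1.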

daisysotero commented 1 year ago

Hello, this is an excellent tool for qPCR analysis. I have a few questions that I would like to clarify, if possible:

  1. Is it feasible to incorporate technical replicates into my dataset and have the software handle the calculation of averages, or does it exclusively work with biological replicates?

  2. The y-axis label mentioned in the previous discussion is "Normalized Fold Change". In your graph, all conditions (WT-, WT+, KO-, KO+) show a normalized fold change, so which of these conditions is the reference (control)? Would it be the one with a normalized fold change of 1?

  3. Is it possible to produce boxplot instead of bars?

  4. I can do the ANOVA analysis, see if there is any difference. However, is it possible to apply a post-hoc test and visualize the graph with the differences from this post-hoc test?

Thank you and I look forward to your responses.

NoahHenrikKleinschmidt commented 1 year ago

Heyhow, Thanks, glad you like it 🌼

I thought about your questions and added two small features that may make it easier for you to achieve your goals - maybe go install the newest version v.4.1.3 to make use of them ;-)

So, regarding your questions:

  1. The software is designed to work "out-of-the-box" with one type of replicate; whether these are biological or technical depends on the user's experimental setup. Even though I did not implement anything tailored to a double-replicate setup, qpcr is capable of handling one with a bit of creativity. Things I can think of to do this:

    • I think the easiest solution to such a thing might be to load the biological replicates into separate Assays, process them normally (i.e. on a technical-replicate basis), then get their dataframes, average them, and make a new Assay from the average. The code may look something like this (using v.4.1.3):

      # get two Assay objects for the biological replicates
      bio_replicate_1 = qpcr.read(...)
      bio_replicate_2 = qpcr.read(...)
      
      # do pre-processing such as Delta-Ct 
      # (on the technical replicates within each bio-replicate) 
      bio_replicate_1 = qpcr.delta_ct(bio_replicate_1, additional_arguments...)
      bio_replicate_2 = qpcr.delta_ct(bio_replicate_2, additional_arguments...)
      
      # now average the two assays'  data
      # make a new assay that has the same data as the original two
      merged = bio_replicate_1.copy()
      # and set the dCt to the average of the two bio-replicates
      merged.dCt = (bio_replicate_1.dCt + bio_replicate_2.dCt) / 2
      
      # now perform the rest of analysis
      results = qpcr.normalise(merged, some_normaliser, additional_arguments...)
    • Alternatively, if you have the data in one single Assay table setup for which you know how to specify rules regarding Delta-Ct computation, you may make a custom Analyser instance (the object behind the qpcr.delta_ct function) and set its anchor or func attributes to your custom function(s) to adjust its behavior accordingly (the same thing goes for the Normaliser object as well by the way). Using this fairly technical approach, you can actually make qpcr do whatever you want pretty much, but it requires some technical know-how around your problem and respective data setup, so I cannot make a simple example snippet for this one 😅
    • Of course, you can always use get to get the pure pandas dataframes, manipulate all you like and then use adopt to re-integrate the data into your Assay.
      
      bio_replicate_1 = qpcr.read(...)

      # get the pandas dataframe
      df_1 = bio_replicate_1.get()

      # manipulate freely
      manipulated_df_1 = ...

      # reintegrate the data
      bio_replicate_1.adopt(manipulated_df_1)

      # continue analysis ...

  2. The answer to the next one is: yes, the reference is WT- where all bars are at 1.

  3. The short answer to the boxplot question is: nope, only bar plots and violin plots are pre-implemented. If violin plots are alright for you, you can use your_results.preview(kind="dots", violin=True) to make violin plots (this just uses sns.violinplot underneath the surface). If it needs to be boxplots, then you can use get to get the pandas dataframe and use seaborn directly 😄 - I'm afraid the plotters do not (yet?) support customization when it comes to the actual visualization functions that are used, sorry 😅

  4. Finally, regarding post-hoc tests, qpcr.stats itself only forwards to statsmodels and scipy, so there isn't a lot of actual code implemented by qpcr itself here. You are likely quicker using these libraries directly once you have the results of your comparisons. I know visualization can be a pain when it comes to statistical test results - I would recommend the statannotations library, which has some nice extensions to help integrate statistical annotations into matplotlib figures. However, I'm curious what you were thinking of, and whether it may be interesting to add such features in a future version. Can you tell me more?
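For the boxplot route in item 3, a minimal sketch could look like the following. It uses plain matplotlib rather than seaborn to keep dependencies small, and it starts from a hand-written dictionary instead of the dataframe returned by get, whose column layout is not assumed here. The group names follow the thread; the values are made up for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical normalised fold-change values per group (illustrative only)
values = {
    "WT-": [1.00, 0.95, 1.05],
    "WT+": [2.10, 1.90, 2.00],
    "KO-": [0.50, 0.55, 0.45],
    "KO+": [0.60, 0.70, 0.65],
}

fig, ax = plt.subplots(figsize=(4, 3))
positions = list(range(1, len(values) + 1))

# One box per group; each inner list is treated as one dataset
boxes = ax.boxplot(list(values.values()), positions=positions, showfliers=False)

# Overlay the individual replicates as dots on top of each box
for pos, vals in zip(positions, values.values()):
    ax.scatter([pos] * len(vals), vals, color="black", s=12, zorder=3)

ax.set_xticks(positions)
ax.set_xticklabels(list(values))
ax.set_ylabel("Normalised fold change")
fig.tight_layout()
fig.savefig("boxplot_preview.png", dpi=150)
```

Overlaying the dots keeps the individual replicates visible, which matters for small sample sizes.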

I hope that was helpful 😅

Cheers, Noah ☀️

daisysotero commented 1 year ago

Hello, I appreciate your prompt response to questions 2 and 3, as well as the code you provided for question 1.

With regard to question 4, my intention was to create bar graphs that visually mark the differences between groups, as is commonly done for a t-test, but showing the results of a post-hoc test such as Tukey's (for example). This feature would be valuable for anyone seeking to compare more than two groups.

Once more, thank you very much.
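A minimal sketch of the Tukey comparison mentioned above, calling scipy directly as suggested earlier in the thread rather than going through qpcr.stats. It assumes SciPy 1.8 or newer, where scipy.stats.tukey_hsd is available. The group names follow the thread, and the values are made up for illustration.

```python
from itertools import combinations

from scipy.stats import tukey_hsd

# Made-up fold-change values per group, for illustration only
groups = {
    "WT-": [1.00, 1.10, 0.90],
    "WT+": [2.00, 2.20, 1.90],
    "KO-": [1.00, 0.95, 1.05],
}
names = list(groups)

# Tukey's HSD compares every pair of groups in a single call
result = tukey_hsd(*groups.values())

# Keep the pairs whose adjusted p-value falls below the chosen threshold
alpha = 0.05
significant = [
    (names[i], names[j], float(result.pvalue[i, j]))
    for i, j in combinations(range(len(names)), 2)
    if result.pvalue[i, j] < alpha
]

for first, second, p_value in significant:
    print(f"{first} vs {second}: p = {p_value:.4f}")
```

The resulting list of pairs and p-values is the kind of input an annotation tool such as statannotations would need to draw significance brackets on a bar graph.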