jpquast / protti

Picotti lab data analysis package.
https://jpquast.github.io/protti/

Query regarding the drc_4p_plot #204

Closed BaylorSci closed 5 months ago

BaylorSci commented 1 year ago

First, thank you for this package. I have been using it to look at proteomics data in some experiments, in particular using the drc_4p_plot function after running the fit_drc_4p function (replicate_completeness = 0.7, condition_completeness = 0.5, correlation_cutoff = 0.7).

When I plot the data, I noticed that the confidence interval does not run over all the samples.

Exploring this further by looking at the plot_curve and plot_points output, I noticed that plot_points contains only 13 points, but the graph shows 17 points (attached plot). I had assumed that plot_points held the plotted points, while plot_curve held the curve and confidence intervals, so I'm confused about where the extra points came from. Is this a bug, or are some of the plotted points predictions/extrapolations?

I am running version 0.6.0.

jpquast commented 1 year ago

Thanks for posting the issue!

Regarding the confidence interval: I have also seen that a few times and I might have a closer look at it. I am not sure if it is a real bug or just related to there being too few data points for a proper confidence interval calculation.

Regarding the points, you are correct. plot_points contains the blue points you see in the figure. plot_curve contains the points of the curve and the confidence interval. There is no prediction going on, so it may be a bug.

If you fit a curve only for this specific protein, is it correct in that case? How exactly did you obtain the plot_points data frame in which you saw only 13 points?

There is a bug in RStudio that could explain the discrepancy. If you filter a data frame in RStudio and then click on the plot_points data frame within your filtered data frame, it will run something like this: View(drc_data[[24]][[1]]). This selects column 24 (plot_points) and row 1. However, because you filtered beforehand, the row 1 you currently see is not the actual row 1 of the drc_data data frame, so you get the wrong sub data frame. The best way is to manually filter the data frame for your protein of interest, e.g. with filter(), and then check plot_points again.
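
A minimal sketch of that check, using a toy data frame rather than real fit_drc_4p() output. The column name `protein`, the IDs and the point counts are made up for illustration; in real output the grouping column is whatever was passed to fit_drc_4p(), and plot_points may not be column 24.

```r
library(dplyr)
library(tibble)

# Toy stand-in for the fit_drc_4p() result: one row per protein,
# with plot_points stored as a list column of data frames.
# All names and values here are hypothetical.
drc_data <- tibble(
  protein = c("P1", "P2", "P3"),
  plot_points = list(
    tibble(dose = 1:13, response = (1:13) / 10),
    tibble(dose = 1:17, response = (1:17) / 10),
    tibble(dose = 1:15, response = (1:15) / 10)
  )
)

# Positional access, similar to what the viewer generates after a click:
# always row 1 of the unfiltered data frame, here protein P1.
positional <- drc_data[["plot_points"]][[1]]
nrow(positional) # 13

# Explicit filtering: select the protein of interest first,
# then take plot_points from the filtered result.
p2 <- drc_data %>% filter(protein == "P2")
by_filter <- p2$plot_points[[1]]
nrow(by_filter) # 17
```

If the two row counts differ for your protein, the 13 points most likely came from a different row than the one that was plotted.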

Please let me know if this was the problem or if there is indeed a bug.

jpquast commented 5 months ago

I will close this due to inactivity.