Closed by nicbarker 6 years ago
It really depends on how many points you have. When you have one peak, you should provide one of sd.threshold, upper or percentile. The default won't work if the first point is the peak. Again, I will look into it when I get back.
No problem, take your time.
I've made some changes; could you check whether they work for you? The w parameter is there for my own exploration of different options. As you can see, w = 1 and it can't be changed.
Hi, Just to make sure, this is the commit you're referring to: https://github.com/mehrnoushmalek/flowDensity/commit/175b84fc89a870ba5594cbd282430269eb79d428
While this may solve the problem when there's a single peak at the first position, there are still going to be edge cases if there are multiple peaks. For example, consider this very rudimentary density example:
position | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|---|
density | 5 | 4 | 1 | 1 | 1 | 4 | 1 | 1 |
In this case, the getPeaks function will discover the peak at position 5 (the second 4), but not the peak at position 0. The distribution will then be incorrectly labelled as a single-peak distribution, when in reality there are two peaks (one at position 0, the other at position 5).
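A minimal sketch that reproduces this on the table above. The helper `interior_peaks` is hypothetical and is not flowDensity's getPeaks; it only assumes that a point counts as a peak when it is strictly greater than both neighbours, so the two endpoints are never candidates.

```r
# Hypothetical helper (not flowDensity's getPeaks): a point is a peak only if
# it is strictly greater than both neighbours, so endpoints are never tested.
interior_peaks <- function(y) {
  n <- length(y)
  if (n < 3) return(integer(0))
  idx <- 2:(n - 1)
  idx[y[idx] > y[idx - 1] & y[idx] > y[idx + 1]]
}

dens_y <- c(5, 4, 1, 1, 1, 4, 1, 1)     # density row for positions 0..7
peak_pos <- interior_peaks(dens_y) - 1  # convert R's 1-based index to position
print(peak_pos)
```

Under that assumption only position 5 is reported; the larger value at position 0 is skipped because it has no left neighbour to compare against.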
The changes that you made will also mean that now if there is a distribution that legitimately has no peaks in it:
position | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|---|
density | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
The max of density will be returned, which is 1 in this case.
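A minimal sketch of the flat case, assuming the new behaviour is a fallback to `max()` whenever no strict interior maximum exists. The fallback line is a hypothetical reading of the change described above, not a copy of the commit.

```r
flat <- rep(1, 8)  # density row from the flat table above

# Strict interior maxima: the slope sign goes from rising (+1) to falling (-1).
strict_peaks <- which(diff(sign(diff(flat))) == -2) + 1

# Hypothetical fallback mirroring the behaviour described above.
reported <- if (length(strict_peaks) == 0) max(flat) else flat[strict_peaks]

print(length(strict_peaks))
print(reported)
```

No strict maximum is found, yet a value is still reported, so a caller cannot tell "no peak" apart from "one peak".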
I wouldn't consider position 0 in your first example to be a peak anyway (I have never had this in flow data), but with the current changes it would be considered one. Also, could you just send me the vector or your flowFrame? That would be easier than me trying to replicate your data from these examples.
If you look at this example, you'll see that the second peak is recorded, but the first point wouldn't be considered a peak.
getPeaks(dens.obj)
$Peaks
[1] -1.038334
$P.ind
[1] 18
$P.h
[1] 0.00185
To give you a more concrete example of where this is happening, have a look at this case.
This is a flowjo biexp histogram of ~300,000 events using the Live / Dead Near IR marker.
The R density() function returns 512 density values by default:
> length(dens$y)
[1] 512
Even though the peak is obvious when you graph the data in flowjo, once the density function is reduced to only 512 points of resolution, the peak between -60 and about 140 gets compressed into position [1] of the density function. Given 512 equally spaced points over a range of ~262k, each point covers about 500 units, which is much larger than the width of the peak itself.
So even though there is a real peak here that humans can see, it gets crushed into the first element of the probability array and as a result ignored completely. This causes flowdensity to make the cutoff call in the wrong place, because it guesses after the second peak rather than between the first and second peaks.
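A rough sketch of the resolution problem using base R's `density()`. The data are synthetic and the range of about 262k and the peak width of about 200 are taken from the description above; nothing here comes from the actual file.

```r
set.seed(1)
# Synthetic stand-in for the real events: a narrow peak near zero plus
# events spread over the full ~262k range.
x <- c(rnorm(5000, mean = 40, sd = 30),
       runif(5000, min = 0, max = 262144))

dens <- density(x)        # default n = 512 evaluation points
step <- diff(dens$x)[1]   # spacing between adjacent grid points
peak_width <- 200         # roughly -60 to 140, as described above

print(length(dens$y))
print(step)
print(step > peak_width)
```

With the grid spacing larger than the peak, at most one grid point can fall inside it, which is consistent with the peak collapsing into a single element.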
My apologies if I'm not quite understanding how flowdensity is supposed to work - I'll see if I can upload some data today for you to have a look yourself. It's also possible that I'm not aware of some options that I can specify to give the density function better resolution closer to zero on the x axis.
Have you encountered problems with the resolution of the density function before? i.e. having one or several peaks very close to zero that users are visualising using the flowjo biexp scale?
Hi, I've been doing some more experimentation on our end and I realised that I can solve this problem by doing a biexponential transform (it's built into flowjo) on the data before running it through flowdensity. This will spread the data out across 1024 bins and give me the resolution close to zero that I need in this case. Thanks for all the help, feel free to close the issue if you think the root cause isn't that important.
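A sketch of why a transform helps. It uses base R's `asinh()` as a simple stand-in, which is neither FlowJo's biexponential nor flowCore's logicle transform. The `cofactor` value and the synthetic data are assumptions made for illustration.

```r
set.seed(1)
# Synthetic data: a narrow population near zero and a bright population far away.
x <- c(rnorm(5000, mean = 40, sd = 30),
       rnorm(5000, mean = 100000, sd = 20000))

cofactor <- 150             # assumed scaling constant for the stand-in transform
x_t <- asinh(x / cofactor)  # compresses large values, keeps detail near zero

dens_raw <- density(x)      # 512 grid points on the linear scale
dens_t   <- density(x_t)    # 512 grid points on the transformed scale

# Count grid points that land inside the near-zero peak (-60 to 140).
n_raw <- sum(dens_raw$x >= -60 & dens_raw$x <= 140)
n_t   <- sum(dens_t$x >= asinh(-60 / cofactor) & dens_t$x <= asinh(140 / cofactor))

print(c(raw = n_raw, transformed = n_t))
```

On the linear scale the near-zero population gets at most one grid point; after the transform it spans many, so a peak finder has a shape to work with.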
Hello Nic,
Sorry for not replying sooner. I was about to tell you that it's probably an issue with the transformation. When you open the file in flowJo, there's the T custom scale, which you can adjust until you get the right scaling for the peaks. In R you can also try flowJoTrans() or estimateLogicle on your raw data. I will work on the getPeaks function for those cases anyway, but please let me know if you have any other issues.
Cheers,
Hi, If you take a look at the comparison done on `/R/helper_functions.R:414`, I believe it will always return null if the single peak is the first element of the `dens` array.
For example, given the following data in `dens`, noting that element `[1]` is the largest:
Plotted, it looks like this:
The comparison is done as follows:
Solved for the first iteration of the loop, `i = 1`, and the hardcoded value of `w = 1`:
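A minimal sketch of that first iteration. Only the left-hand comparison `d.y[i+w] > d.y[i:(i+w-1)]` is quoted in this issue; the function name `scan_peaks`, the loop bound, the right-hand comparison and the data are assumptions made for illustration and are not copied from flowDensity.

```r
# Hypothetical scan, assuming a loop over 1:(length - 2w) and a symmetric
# right-hand comparison. Only the left-hand comparison is quoted in the issue.
scan_peaks <- function(d.y, w = 1) {
  peaks <- NULL
  for (i in seq_len(max(0, length(d.y) - 2 * w))) {
    centre <- d.y[i + w]
    left   <- d.y[i:(i + w - 1)]
    right  <- d.y[(i + w + 1):(i + 2 * w)]
    if (all(centre > left) && all(centre > right)) peaks <- c(peaks, i + w)
  }
  peaks
}

d.y <- c(5, 4, 3, 2, 1)  # hypothetical density with its maximum at element [1]
w <- 1
i <- 1
first_left <- d.y[i + w] > d.y[i:(i + w - 1)]  # d.y[2] > d.y[1:1]
result <- scan_peaks(d.y, w)

print(first_left)
print(is.null(result))
```

Element `[1]` is never the centre of a comparison, and element `[2]` fails against it, so under these assumptions the scan returns NULL for a density that decreases from its first point.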
Note that because of the default value of `w <- 1`, while element `[1]` is never compared directly through iteration, the comparison of `d.y[i+w] > d.y[i:(i+w-1)]` -> `d.y[2] > d.y[1:(1)]` will prevent element `[2]` from being selected as the peak.
Also, because of the way the loop is designed, the last two elements in `dens` will also not be checked:
I'm not sure if there is some reason that the first element is being ignored on purpose, but in my particular case it means that the following error gets thrown when running `flowDensity`:
`traceback()`:
You can see here that `.getIntersect` is throwing an error because `peaks` is null, and because of how R handles comparisons to NULL.
Try this tiny example:
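A minimal sketch of that R behaviour, which may differ from the example originally posted here. Comparing NULL to a number yields a zero-length logical rather than FALSE, and `if` raises an error on a zero-length condition. The variable names are illustrative.

```r
peaks <- NULL

cmp <- peaks > 1     # zero-length logical, not FALSE
print(length(cmp))

# `if` cannot branch on a zero-length condition, so it raises an error.
res <- tryCatch(
  if (cmp) "peak found" else "no peak",
  error = function(e) paste("error:", conditionMessage(e))
)
print(res)
```

Guarding with `is.null(peaks)` or `length(peaks) == 0` before the comparison avoids the error.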
If there isn't any particular reason that the first element is being skipped, it can be fixed by replacing the if statement at `414` with the following:
Furthermore, what is the purpose of the `w` parameter? It seems like it may have been intended to protect against lots of small peaks being returned when there is some jitter across the density distribution, but the implementation is very strange. The two ranges that it compares:
and
clip off all the values that are `< w` and `> length - 2 * w` (i.e. you lose twice as much data off the end of `dens` as you do off the front). Was this intended to be implemented this way?