whitews / FlowUtils

FlowUtils is a Python package containing various utility functions related to flow cytometry analysis, primarily focused on compensation and transformation tasks commonly used within the flow community.
https://flowutils.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Transformation differs from R flowCore logicle Transform #9

Closed JcGKitten closed 2 years ago

JcGKitten commented 2 years ago

Hi,
I wanted to use the flowutils logicle transformation instead of the one provided by flowCore. In both cases I use the default parameters (m=4.5, t=262144, w=0.5, a=0), but the results differ. I expected the same output because both use the same transformation function.

Python Code:

import re

import flowio
import flowutils
import numpy
import pandas

flow_data = flowio.FlowData(str(file_path), ignore_offset_error=True)
channels = flow_data.channels  # channel metadata, keyed by channel number
columns = [channels[key]["PnN"] for key in sorted(channels, key=int)]
events = numpy.reshape(flow_data.events, (-1, flow_data.channel_count))

# transform every channel except the time channel
markers_to_transform = [marker["PnN"] for marker in channels.values() if not re.match(r"^time$", marker["PnN"], re.IGNORECASE)]

# zero-based column indices of the channels to transform
fluoro_indices = []
for channel in channels:
    if channels[channel]['PnN'] in markers_to_transform:
        fluoro_indices.append(int(channel) - 1)
fluoro_indices.sort()

# logicle with its default parameters
transformed_events = flowutils.transforms.logicle(events, fluoro_indices)

df = pandas.DataFrame(transformed_events, columns=columns)

Result(df.head(5)):

transformed_py

R Code:

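library(flowCore)
path <- "/home/max/Cytolytics/Samples/samples/raw/Sample_C-066C_006.fcs"
flowframe <- read.FCS(path)

# channel names (assumed definition of cols)
cols <- colnames(flowframe)
# logicleTransform() with its default parameters, applied to the first 19 channels
tf <- transformList(cols[1:19], logicleTransform(), transformationId = "logicle")
out_frame <- transform(flowframe, tf)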

Results:

transformed_r

Is there something I'm missing, or why is the scaling different?

Thanks and best wishes Max

whitews commented 2 years ago

Hi Max,

FlowIO returns the events exactly as they are saved in the FCS file. This does not take into account the channel gain values typically present in the FCS metadata. For analysis, these gain factors should be applied prior to any pre-processing (such as transformation).
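As a rough illustration of the gain step described above, here is a minimal pure-Python sketch. It assumes the usual FCS convention that stored linear values are divided by the channel gain ($PnG) to recover the original scale. The helper name `apply_gain` and the sample numbers are hypothetical and are not part of FlowIO or FlowKit.

```python
def apply_gain(events, gains):
    """Divide each channel (column) of every event (row) by that channel's gain.

    events: list of rows, one list of channel values per event
    gains:  one gain factor per channel (hypothetical values below)
    """
    if any(gain <= 0 for gain in gains):
        raise ValueError("gain factors must be positive")
    return [[value / gain for value, gain in zip(row, gains)] for row in events]


# two events, three channels (made-up numbers for illustration)
raw_events = [[100.0, 200.0, 50.0],
              [400.0, 800.0, 10.0]]
gains = [1.0, 2.0, 0.5]

scaled_events = apply_gain(raw_events, gains)
print(scaled_events)
```

A channel with a gain of 1 is left unchanged, so for files whose gains are all 1 this step has no effect on the transformed values.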

The FlowKit Sample class will do this automatically for you. Can you install FlowKit and try applying the transform to see if you get equivalent results?

The FlowKit code to do this would look something like:

import flowkit as fk

# create Sample from file path
fcs_path = 'path/to/fcs/file/example.fcs'
sample = fk.Sample(fcs_path)

# define a transform
xform = fk.transforms.LogicleTransform(
    'some_xform_id', 
    param_t=262144, 
    param_w=0.5, 
    param_m=4.5, 
    param_a=0
)

sample.apply_transform(xform)

xform_events = sample.get_events(source='xform')

By default, the apply_transform method will apply the given transform to only the fluorescent channels. The scatter channels are usually not transformed using a "bi-ex" style transform, but are sometimes linearly transformed.

You can find more info in the FlowKit docs about this, but for now let's see if the fluoro channel values are reasonably close to flowCore using this above code.

-Scott

JcGKitten commented 2 years ago

Hey Scott, thanks for the fast reply. I used your code above and got the following result: flowkit_transform

As you can see, the events for the fluorescent channels are the same as with the flowio + flowutils approach. I'm not sure whether the flowCore function applies something else before the transformation, but the input events are the same in R and Python. So far I haven't been able to find out which code flowCore uses for the bi-ex transformation, and I didn't find a graph of this transformation. I thought a graph would show which values the events should have after the transformation.

whitews commented 2 years ago

Ok, we can use the LUT from the GatingML 2.0 specification. I've taken a screenshot of those input / output values and pasted below for reference, but we can test the first set of logicle parameters. What does the flowCore logicle function output for the following input values & the parameter values T = 1000, W = 1, M = 4 and A = 0 ?

Input values and the expected output for parameter values T = 1000, W = 1, M = 4 and A = 0:

Input    Output
−10      ≈ 0.067574
−5       ≈ 0.147986
−1       ≈ 0.228752
0        0.25
0.3      ≈ 0.256384
1        ≈ 0.271248
3        ≈ 0.312897
10       ≈ 0.432426
100      ≈ 0.739548
1000     1

[image: screenshot of the logicle look-up table from the GatingML 2.0 specification]
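The reference values above can be checked without either library. The following is a minimal pure-Python sketch of the logicle function, assuming the T, W, M, A parameterization published with the GatingML 2.0 specification and using plain bisection in place of the faster root finders that real implementations use. The names `logicle_params`, `biexponential` and `logicle` are illustrative only and are not the FlowUtils or flowCore API.

```python
import math


def logicle_params(t, w, m, a):
    """Derive the biexponential coefficients from the parameters T, W, M, A."""
    w_norm = w / (m + a)
    x2 = a / (m + a)
    x1 = x2 + w_norm          # scale position of data value 0
    x0 = x2 + 2 * w_norm
    b = (m + a) * math.log(10)
    # solve 2*(ln d - ln b) + w_norm*(b + d) = 0 for d in (0, b] by bisection
    lo, hi = 1e-12, b
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if 2 * (math.log(mid) - math.log(b)) + w_norm * (b + mid) < 0:
            lo = mid
        else:
            hi = mid
    d = 0.5 * (lo + hi)
    c_a = math.exp(x0 * (b + d))
    mf_a = math.exp(b * x1) - c_a / math.exp(d * x1)
    coef_a = t / ((math.exp(b) - mf_a) - c_a / math.exp(d))
    return {"a": coef_a, "b": b, "c": c_a * coef_a, "d": d,
            "f": -mf_a * coef_a, "x1": x1}


def biexponential(y, p):
    """Map a scale position y >= x1 back to a data value (the inverse logicle)."""
    return p["a"] * math.exp(p["b"] * y) - p["c"] * math.exp(-p["d"] * y) + p["f"]


def logicle(x, t, w, m, a):
    """Logicle scale position of data value x, on the unit scale where T maps to 1."""
    p = logicle_params(t, w, m, a)
    target = abs(x)
    lo, hi = p["x1"], p["x1"] + 1.0
    while biexponential(hi, p) < target:   # widen the bracket if needed
        hi += 1.0
    for _ in range(100):                   # bisection on the monotonic inverse
        mid = 0.5 * (lo + hi)
        if biexponential(mid, p) < target:
            lo = mid
        else:
            hi = mid
    y = 0.5 * (lo + hi)
    # negative inputs are the reflection of positive ones around x1
    return y if x >= 0 else 2 * p["x1"] - y


# (input, expected output) pairs for T = 1000, W = 1, M = 4, A = 0
REFERENCE = [(-10, 0.067574), (-5, 0.147986), (-1, 0.228752), (0, 0.25),
             (0.3, 0.256384), (1, 0.271248), (3, 0.312897), (10, 0.432426),
             (100, 0.739548), (1000, 1.0)]

max_error = max(abs(logicle(x, 1000, 1, 4, 0) - expected)
                for x, expected in REFERENCE)
print("largest deviation from the reference values:", max_error)
```

An implementation that agrees with these values returns results on a unit scale, where the top of scale T maps to 1.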

whitews commented 2 years ago

Max,

I see what flowCore is doing! They are giving you the results in units of decades. Since you specify 4.5 as the number of decades in your example, if you multiply the FlowKit values by 4.5 you get the same values as flowCore. Just my opinion, but they should be more transparent in their documentation about that.
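The conversion described above amounts to a single multiplication. A minimal sketch, assuming unit-scale values (as FlowKit and FlowUtils return) and m = 4.5 decades; the helper names and sample values are hypothetical:

```python
def to_decades_scale(unit_values, m=4.5):
    """Convert unit-scale logicle values to a 0..m scale (m = number of decades)."""
    return [value * m for value in unit_values]


def to_unit_scale(decade_values, m=4.5):
    """Convert values on the 0..m decades scale back to the unit scale."""
    return [value / m for value in decade_values]


unit_scaled = [0.0, 0.25, 0.5, 1.0]            # made-up unit-scale values
decade_scaled = to_decades_scale(unit_scaled)  # [0.0, 1.125, 2.25, 4.5]
print(decade_scaled)
```

For a NumPy array or pandas DataFrame of transformed events, multiplying the whole object by m does the same thing.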

Well, at least that mystery is solved!

-Scott

JcGKitten commented 2 years ago

Hey Scott, thank you very much, I would have needed much more time to find that. Yes, the documentation of flowCore isn't exactly great ... I hope I didn't take too much of your time with this bug report, but maybe others will wonder about it as well and will find the explanation documented here. Best wishes and have a great weekend, Max

SamGG commented 2 years ago

If you feel flowCore's documentation needs to be improved, don't hesitate to provide a PR. Best.

JcGKitten commented 2 years ago

@SamGG sorry, I didn't mean to offend the authors. I can imagine how hard it is to write documentation, especially for such a scientific topic. Next time I will try to be more precise about what could be improved, or even provide a pull request.