ebecht / infinityFlow

Data transformation #6

Open GeorgeAlehandro opened 2 years ago

GeorgeAlehandro commented 2 years ago

Hello Mr. Etienne, I hope you are doing well.

I have a question concerning data transformation for the input FCS files. I read in the Q&A (https://www.biolegend.com/en-us/videos/webinar-high-dimensional-flow-cytometric-characterization-of-complex-tissues-with-infinity-flow) that the input channels should be the non-compensated ones. That is fine and easily done with flowCore::decompensate(), so it is not really an issue.

The issue is data transformation. As far as I can see, the data is logicle-transformed inside the infinityFlow pipeline. The data I am dealing with is already gated and logicle-transformed (I had the raw FCS files, transformed and then gated the data, and am left with the gated cells with transformed values). First, how much impact does this "double" transformation have on the values imputed by the algorithm? Second, is it possible to change the behavior of the pipeline so that, if the data has already been transformed, it does not run its own transformation on the data matrix?

Thanks a lot for your time.

ebecht commented 2 years ago

Hello George,

Thank you for your interest in this pipeline.

About compensation, I believe we typically ran the pipeline on compensated data. What I think is important is that there should be low spillover from the "exploratory channel" (PE in the BioLegend kit) into the backbone channels. Otherwise I believe the models may generalize poorly to other wells for events with high PE intensity.

About data transformation, right now there is no easy way to disable it. I don't know whether transforming twice would drastically affect the predictions, but I think it should usually be okay-ish.
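
To make the double-transformation question concrete, here is a minimal sketch in R. It assumes flowCore is installed and uses flowCore's default logicle parameters (w = 0.5, t = 262144, m = 4.5), which may not match what infinityFlow or the upstream software actually uses. The intensity values are made up for illustration.

```r
library(flowCore)

# Assumed parameters (flowCore defaults); the ones used inside infinityFlow may differ.
lgcl <- logicleTransform(w = 0.5, t = 262144, m = 4.5)

# Made-up linear-scale intensities, for illustration only.
raw <- c(-100, 0, 100, 1e3, 1e4, 1e5)

once  <- lgcl(raw)   # a single logicle transformation of raw values
twice <- lgcl(once)  # the same transformation applied to already-transformed values

# Compare the spread of values after one and two transformations.
print(data.frame(raw = raw, once = once, twice = twice))
print(c(spread_once = diff(range(once)), spread_twice = diff(range(twice))))
```

If the second spread comes out much smaller than the first, the already-transformed values are all landing in the near-linear region of the logicle around zero, so the second pass compresses them rather than separating them.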

If you really need to skip the logicle transformation, you could pass xp = xp.Rds to infinityFlow:::standardize_backbone_data_across_wells and remove the inverseLogicleTransform part in infinityFlow:::export_data?

I will try to add more options for transformation in a later version, but right now I have very little time to work on this, sorry!

Etienne

rubylunde commented 1 year ago

Hi Etienne,

I am also running into a similar problem. I am cleaning my data in OMIQ and then exporting it to run through infinityFlow. I also notice a similar scaling issue, where the data seems to be transformed twice. When I import the imputed FCS files into OMIQ, only the imputed markers seem to be logicle-transformed twice. They look smashed together around 0.

Have you been able to work on adding settings for this, or do you have another recommended workaround, since it has been a while since the last comment in this thread?

I would appreciate any help!

Ruby

ebecht commented 1 year ago

Hi @rubylunde,

No, this has not been implemented yet. These days I have very little time to work on adding new features; hopefully I'll get into a position with more freedom to do this soon. Hopefully there is a way for you to use the untransformed data. Otherwise I think it is still worth trying the pipeline even with the double transformation.
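
If the software that produced the transformed FCS files reports its logicle parameters, one possible way to get back to untransformed values is to apply the inverse transformation before running the pipeline. This is a minimal sketch, assuming flowCore is installed and assuming the upstream transformation was a flowCore-style logicle with the parameters shown. Both are assumptions: other tools may define or scale the logicle differently, in which case this inverse would not recover the original values.

```r
library(flowCore)

# Hypothetical parameters: replace with the ones the upstream tool actually used.
lgcl     <- logicleTransform(w = 0.5, t = 262144, m = 4.5)
inv_lgcl <- inverseLogicleTransform(trans = lgcl)

# Made-up values standing in for one already-transformed channel.
raw         <- c(-100, 0, 100, 1e3, 1e4, 1e5)
transformed <- lgcl(raw)

# Undo the transformation, then report the largest deviation from the original values.
recovered <- inv_lgcl(transformed)
print(max(abs(recovered - raw)))
```

For a whole flowFrame, the same inverse can be wrapped in flowCore's transformList() for the relevant channels and applied with transform(), as shown in the flowCore documentation for inverseLogicleTransform.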

Best, Etienne