saeyslab / CytoNorm

R library to normalize cytometry data

Normalization question #38

Closed fbenedett closed 1 year ago

fbenedett commented 1 year ago

Hello,

This is not an issue with CytoNorm; I just need to know whether what I am trying to do is possible. My colleague collected six batches, numbered 1 to 6. Batches 1, 2, 3, 4, and 6 include control A, which I can use for normalization. The problem is that batch 5 has no control A. She also collected a control B for batches 5 and 6. Is there a way to normalize all the samples? I was thinking of normalizing batches 5 and 6 with control B, then normalizing every batch with control A, but I get errors like:

Avis : 39322 cells (52.24%) seem far from their cluster centers.
Splitting Normalized/Norm_Batch05-1E18-Live cells.fcs
Error in MapDataToCodes(fsom$map$codes, fsom_new$data) : 
  NA/NaN/Inf in foreign function call (arg 1)

Is this normalization doable or doomed?

SofieVG commented 1 year ago

Hi Fabrizio,

I think the approach you are proposing should certainly be possible, although you will need to be extra careful about artefacts. I would first normalize batch 5 towards batch 6 using control B (e.g. with the argument goal = "batch6", if that is your batch label). Then you can normalize all files with control A, giving the updated batch 5 files the batch 6 label as well.
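A minimal sketch of that two-step workflow, not a tested pipeline. It assumes a hypothetical data frame `meta` with columns `Path`, `Batch` and `Type`, that `goal` is passed through `normParams` as in the package README, and that `CytoNorm.normalize()` writes each file as `<outputDir>/<prefix><original file name>`. The helper names `relabel_bridged` and `two_step_normalize` are made up for illustration, and FlowSOM settings are left at their defaults.

```r
# Sketch only. `meta` is a hypothetical data frame with one row per FCS file:
#   Path  - file path
#   Batch - "batch1" ... "batch6"
#   Type  - "controlA", "controlB" or "sample"

# Point batch 5 rows at their step-1 output and relabel them as batch 6.
# Assumes CytoNorm.normalize() writes <outputDir>/<prefix><original file name>.
relabel_bridged <- function(b5, step1_dir, prefix = "Norm_") {
  b5$Path  <- file.path(step1_dir, paste0(prefix, basename(b5$Path)))
  b5$Batch <- "batch6"
  b5
}

two_step_normalize <- function(meta, channels, transformList,
                               transformList.reverse, out_dir = "Normalized") {
  step1_dir <- file.path(out_dir, "step1")
  step2_dir <- file.path(out_dir, "step2")
  dir.create(step1_dir, recursive = TRUE, showWarnings = FALSE)
  dir.create(step2_dir, recursive = TRUE, showWarnings = FALSE)

  # Step 1: learn batch 5 -> batch 6 from control B, apply it to batch 5 samples.
  ctrl_b <- meta[meta$Type == "controlB" &
                   meta$Batch %in% c("batch5", "batch6"), ]
  model_b <- CytoNorm::CytoNorm.train(
    files = ctrl_b$Path, labels = ctrl_b$Batch, channels = channels,
    transformList = transformList,
    normMethod.train = CytoNorm::QuantileNorm.train,
    normParams = list(nQ = 101, goal = "batch6"),
    seed = 1)
  b5 <- meta[meta$Batch == "batch5" & meta$Type == "sample", ]
  CytoNorm::CytoNorm.normalize(
    model = model_b, files = b5$Path, labels = b5$Batch,
    transformList = transformList,
    transformList.reverse = transformList.reverse,
    normMethod.normalize = CytoNorm::QuantileNorm.normalize,
    outputDir = step1_dir, prefix = "Norm_")

  # Step 2: learn the batch effects from control A (batches 1, 2, 3, 4, 6),
  # then normalize everything, treating the bridged batch 5 files as batch 6.
  ctrl_a <- meta[meta$Type == "controlA", ]
  model_a <- CytoNorm::CytoNorm.train(
    files = ctrl_a$Path, labels = ctrl_a$Batch, channels = channels,
    transformList = transformList,
    normMethod.train = CytoNorm::QuantileNorm.train,
    normParams = list(nQ = 101, goal = "mean"),
    seed = 1)
  others  <- meta[meta$Batch != "batch5" & meta$Type == "sample", ]
  to_norm <- rbind(others, relabel_bridged(b5, step1_dir))
  CytoNorm::CytoNorm.normalize(
    model = model_a, files = to_norm$Path, labels = to_norm$Batch,
    transformList = transformList,
    transformList.reverse = transformList.reverse,
    normMethod.normalize = CytoNorm::QuantileNorm.normalize,
    outputDir = step2_dir, prefix = "Norm_")

  invisible(to_norm)
}
```

Writing the two steps to separate output directories keeps the intermediate batch 5 files distinct from the final results, which makes it easier to inspect them if the second step fails.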

I'm not sure about the error, though. I am concerned that about 50% of your cells seem far from their cluster centers. This seems to indicate that the cells in your control sample do not match your other samples well. In that case, it might be difficult to do a correct normalization, because your control samples might not capture the information needed. Does the NA error occur in the second step of the normalization, meaning the NAs were introduced in the normalized files? I've seen this issue before, but I thought it had been resolved by now. If it still occurs, I might need to look into it again.
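One way to answer that question is to count non-finite values per channel in the files produced by the first step before passing them to the second. The sketch below is an illustration only: `count_nonfinite` and `check_fcs_files` are hypothetical helper names, and it inspects the stored values without applying any transformation.

```r
# Count NA/NaN/Inf values per column of a numeric expression matrix.
count_nonfinite <- function(expr) {
  colSums(!is.finite(expr))
}

# Apply the check to a set of FCS files (requires the flowCore package).
# Returns a named list with one vector of per-channel counts per file.
check_fcs_files <- function(files) {
  res <- lapply(files, function(f) {
    ff <- flowCore::read.FCS(f, truncate_max_range = FALSE)
    count_nonfinite(flowCore::exprs(ff))
  })
  names(res) <- files
  res
}
```

If every count is zero in the intermediate files, the non-finite values are more likely to appear later, for example when the transformation is applied, than to come from the first normalization step.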

fbenedett commented 1 year ago

Thank you for your answer.