KrishnaswamyLab / SAUCIE

Preprocessing of input FCS files #22

Closed ashipde closed 4 years ago

ashipde commented 4 years ago

The FCS files of my CyTOF data have variables (columns) of different types.

I want to use SAUCIE for both batch correction and clustering. For clustering, I want to use only the cell lineage markers.

  1. Should I preprocess the FCS files to remove any column that is not pertinent for clustering (such as 'Time' and cell status markers)?
  2. Should the data be processed in any other way, such as log2-transformation or arcsinh transformation? Raw data values in the files range from 0 to >15,000.

Thanks.

mattamodio commented 4 years ago

Hi, thanks for using SAUCIE! If you don't want to use certain columns, leave their column indexes out of the file cols_to_use.txt, and SAUCIE will ignore them. If you want to batch correct some columns but not cluster on them, run SAUCIE twice with different values in cols_to_use.txt: once listing the columns to batch correct, and once listing only the columns to cluster on. As for preprocessing, SAUCIE.py performs an asinh transformation, as is typical for CyTOF data, so you don't need to do that ahead of time if you use SAUCIE.py.
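
For a sense of what column selection followed by an asinh transformation does to values in the 0 to >15,000 range, here is a minimal NumPy sketch. It is not SAUCIE's own code: the helper names `select_columns` and `arcsinh_transform`, the example matrix, and the cofactor of 5 (a common choice for CyTOF data) are all assumptions made for illustration. Check SAUCIE.py for the scaling it actually applies.

```python
import numpy as np


def select_columns(data, cols_to_use):
    """Keep only the listed column indexes (hypothetical helper, not SAUCIE's API)."""
    return np.asarray(data)[:, list(cols_to_use)]


def arcsinh_transform(values, cofactor=5.0):
    """Apply arcsinh(x / cofactor).

    The cofactor of 5 is an assumption here (a common choice for CyTOF);
    it is not taken from SAUCIE.py.
    """
    return np.arcsinh(np.asarray(values, dtype=float) / cofactor)


# Made-up events: three marker columns followed by a 'Time'-like column.
raw = np.array([
    [0.0, 10.0, 15000.0, 123.0],
    [5.0, 0.0, 200.0, 456.0],
])

lineage_cols = [0, 1, 2]  # illustrative indexes; the last column is left out
transformed = arcsinh_transform(select_columns(raw, lineage_cols))
print(transformed.shape)
print(transformed.max())
```

The transformation is roughly linear near zero and logarithmic for large values, so a raw value of 15,000 ends up below 10 while zeros stay at zero.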