ppreshant / flow_cytometry

analyzing .fcs2.0/3.0 files with semi-automated gating and plotting
GNU General Public License v3.0

Description

The scripts are divided into two modules, one written in python and the other in R.

Density gating, MEFL transformation (python)

The python script analyze_fcs_flowcal.py and the jupyter notebook scripts_archive/flowcal_pipeline_report.py/ipynb are wrappers for (semi-)automated processing of flow cytometry data in a standard workflow, using the FlowCal submodule from the Tabor lab : https://github.com/taborlab/FlowCal/

Castillo-Hair, Sebastian M., et al. "FlowCal: a user-friendly, open source software tool for automatically converting flow cytometry data from arbitrary to calibrated units." ACS synthetic biology 5.7 (2016): 774-780.

Briefly, the python wrapper applies density gating to the events, transforms the fluorescence values to MEFL, and saves the processed data.

Note: I save the data from FlowCal for later analysis in R. Users can use any other tool they wish. I made this choice because I wasn't satisfied with the analysis and plotting capabilities provided by FlowCal, and I prefer ggplot to python's plots. R also has a very good general-purpose flow cytometry ecosystem with many packages built upon the flowCore package; these work on .fcs files without keeping them in the RAM!
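
To give a feel for what density gating does, here is a minimal sketch in plain python. It is not FlowCal's implementation (FlowCal's gate also smooths the 2D histogram, as far as I understand); the function name density_gate, the bin count and the simulated events are all assumptions made up for illustration. It bins events on two channels and keeps the events in the most populated bins until roughly the requested fraction is retained.

```python
import random
from collections import Counter


def density_gate(events, gate_fraction=0.5, bins=20):
    """Keep events from the densest 2D-histogram bins.

    events: list of (x, y) pairs, e.g. forward and side scatter values.
    Whole bins are kept, so the retained fraction can slightly exceed
    gate_fraction. This is a simplified, unsmoothed illustration.
    """
    if not events:
        return []
    xs = [e[0] for e in events]
    ys = [e[1] for e in events]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    def bin_index(value, low, high):
        # Map a value to a bin number in [0, bins - 1]
        if high == low:
            return 0
        return min(int((value - low) / (high - low) * bins), bins - 1)

    bin_of = [(bin_index(x, x_min, x_max), bin_index(y, y_min, y_max))
              for x, y in events]
    counts = Counter(bin_of)

    # Add bins from most to least populated until the target is reached
    target = gate_fraction * len(events)
    kept_bins = set()
    retained = 0
    for bin_id, n in counts.most_common():
        if retained >= target:
            break
        kept_bins.add(bin_id)
        retained += n
    return [e for e, b in zip(events, bin_of) if b in kept_bins]


# Simulated data (hypothetical): a tight cell cluster plus scattered debris
random.seed(0)
cells = [(random.gauss(100, 5), random.gauss(100, 5)) for _ in range(900)]
debris = [(random.uniform(0, 1000), random.uniform(0, 1000)) for _ in range(100)]
events = cells + debris

gated = density_gate(events, gate_fraction=0.5)
print(len(events), "events before gating,", len(gated), "after")
```

A lower gate_fraction keeps only the core of the main population, while a value close to 1 keeps almost everything, including debris.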

Gating, counts, plotting distributions (R)

The R section is not fully automated yet, but it should work pretty well once you get the hang of the R commands in an hour or two. Do reach out to me through the issues section on GitHub if you have questions.

How to run

First time setup

  1. Set up git on your computer if you haven't already - git helper
  2. Clone this R-python hybrid code onto your computer with the command git clone https://github.com/ppreshant/flow_cytometry.git or the ssh version git clone git@github.com:ppreshant/flow_cytometry.git (the ssh version is more secure and takes a couple of extra minutes to set up, but I recommend it - here's some help).
    • The same folder will hold your flow cytometry data and the outputs so it can get large. Choose the folder location accordingly.
  3. The first time, run these steps in R to install all the required packages: install.packages('tidyverse') ; and do the same for -

    • reticulate
    • BiocManager

    Use BiocManager to install the bioconductor packages - BiocManager::install("flowCore") ; and others -

    • ggcyto
    • openCyto
  4. Use conda to set up the python requirements : mostly the standard pandas, matplotlib, numpy etc.
    1. Install miniconda : a minimal version of the package and environment manager conda. Follow the instructions from the documentation page
    2. Use the command conda env create -f flowcal_wrappers_environment.yaml. This will create an environment named flowcal and install all the python dependencies listed in the file into it
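
After creating the environment, a quick check like the one below can confirm that the main python packages are importable. This is only a sketch: the list of module names is my assumption based on the packages mentioned above (the environment file is the real source of truth), and the helper name missing_modules is made up for illustration. Run it from within the activated flowcal environment.

```python
import importlib.util

# Assumed import names; check flowcal_wrappers_environment.yaml for the full list
REQUIRED_MODULES = ["numpy", "pandas", "matplotlib", "FlowCal"]


def missing_modules(names):
    """Return the top-level module names that cannot be found by the importer."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found")
```

If anything is reported missing, the most likely cause is that the flowcal environment is not the active one.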

Data, and config

  1. Put your data into the flowcyt_data directory.
  2. Update the files for user_inputs for both python and R:
    1. ./0.5-user_inputs.R : for R steps
      1. base_directory <- 'flowcyt_data' or 'processed_data'
      2. folder_name <- '..' : the folder your individual .fcs files are in within the base_directory
      3. file.name_input <- '..' : Use this option if you have a single .fcs file holding multiple data (such as from Guava machines). After unpacking these data you will use the same name for the folder_name option
      4. template_source <- 'googlesheet' # use 'googlesheet' or 'excel' options depending on where you are providing the plate layout to name the wells.
    2. scripts_general_fns/g10_user_config.py : for python steps
      1. fcs_experiment_folder = '..' : the folder your individual .fcs files are in within the base_directory
      2. density_gating_fraction = .5 ; might need to adjust
  3. Put sample names into the excel file flowcyt_data/plate_layouts.xlsx or a google sheet. Each well containing a sample is named in the format plasmid1_positive. The value after the '_' is the sample_category, which is used to colour plots; the value before it is the assay_variable, which will be on the x/y-axis of the plots.
    • The excel option is easier, but if you prefer to use a googlesheet for naming the samples, duplicate the Flow cytometry layouts tab from this sheet into your own googlesheet, and put its url in 0-general_functions_fcs.R/sheeturls for the plate_layouts_pk option.
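
The naming convention for wells can be illustrated with a short python sketch. The function name parse_well_label is hypothetical and this is not the code the pipeline uses; it also assumes that the name is split at the last '_' when a name contains more than one, which the actual scripts may handle differently.

```python
def parse_well_label(label):
    """Split a well label such as 'plasmid1_positive' into its two parts.

    Assumption: the split happens at the last underscore.
    """
    assay_variable, separator, sample_category = label.rpartition("_")
    if not separator or not assay_variable or not sample_category:
        raise ValueError(f"Expected a label like 'plasmid1_positive', got {label!r}")
    return {"assay_variable": assay_variable, "sample_category": sample_category}


parsed = parse_well_label("plasmid1_positive")
print(parsed["assay_variable"])   # the part before the '_' : goes on the plot axis
print(parsed["sample_category"])  # the part after the '_' : used to colour plots
```

Labels without an underscore raise an error in this sketch, which makes typos in the plate layout easier to spot.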

python module : density gating, MEFL

  1. Open a terminal that works with conda and activate the flowcal environment that you created above with conda activate flowcal
  2. Launch your favorite IDE to access python. jupyter-lab should be installed in this environment, so type its name in the same terminal and a browser window will open
  3. Follow the instructions in the Data, and config section above and add your directory name etc. to the config file scripts_general_fns/g10_user_config.py
  4. Open the jupyter notebook flowcal_pipeline_report.ipynb and execute the two cells; your data should be ready in about 3 min! .. to be elaborated
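
For readers curious about the MEFL step, the idea is to fit a standard curve from calibration beads with known MEFL values and then apply it to the measured fluorescence. The sketch below is a simplified illustration in plain python and not what FlowCal actually runs: FlowCal fits its own bead model, which as far as I understand also includes a bead autofluorescence term that is omitted here. The function names and the bead numbers are hypothetical.

```python
import math


def fit_standard_curve(bead_fluorescence, bead_mefl):
    """Least-squares fit of log(MEFL) = m * log(fluorescence) + b.

    Simplification: the bead autofluorescence term is ignored.
    """
    xs = [math.log(v) for v in bead_fluorescence]
    ys = [math.log(v) for v in bead_mefl]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


def to_mefl(fluorescence, m, b):
    """Convert an arbitrary-unit fluorescence value to MEFL with the fitted curve."""
    return math.exp(m * math.log(fluorescence) + b)


# Hypothetical bead peaks (arbitrary units) and their known MEFL values
bead_fluorescence = [10.0, 100.0, 1000.0]
bead_mefl = [50.0, 500.0, 5000.0]

m, b = fit_standard_curve(bead_fluorescence, bead_mefl)
print("slope:", m, "intercept:", b)
print("MEFL for a cell measured at 200 a.u.:", to_mefl(200.0, m, b))
```

Working in log space keeps the fit stable across the several orders of magnitude that flow cytometry fluorescence typically spans.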

R : gating and visualizations

  1. Ensure that the data is in the folder and that the R-specific config file ./0.5-user_inputs.R is updated
  2. Run source('./analyze_fcs.R') to load the data into R
  3. Run 7-exploratory_data_view.R to save an overview of all the data.
  4. Run 11-manual_gating_workflow.R for gating and saving counts of populations above the gated thresholds

Do contact me if you have any questions about running this by creating an issue here

Copyleft : GPL-3.0-or-later license

wrappers for automated processing and plotting of bacterial flow cytometry data 
Copyright (C) 2023  Prashant Kalvapalle

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see <http://www.gnu.org/licenses/>.