PhasorIdentifier
Analyze FLIM files (.R64, .ref) in Google Colab, with support for masking, cell segmentation, pH correlation, nanoscale effects, and quantification across a range of research scenarios.
How to cite?
Bernardi M, Cardarelli F. Phasor identifier: A cloud-based analysis of phasor-FLIM data on Python notebooks.
Biophys Rep (N Y). 2023 Nov 7;3(4):100135. doi: 10.1016/j.bpr.2023.100135.
Notebook Setup
To begin using this project, follow these steps to run the code in Google Colab. Running the notebook in Google Colab is recommended as it provides access to powerful GPUs and eliminates concerns related to memory and disk space.
- Download the Notebook:
  - Click on the notebook file in the repository (typically ending with the .ipynb extension).
  - In the GitHub interface, click the "Download" button to save the notebook to your computer.
- Save the Notebook to Google Drive:
  - Open your Google Drive account.
  - Create a new folder or use an existing one to organize your Colab notebooks.
  - Upload the downloaded notebook to the selected folder.
- Open the Notebook in Google Colab:
  - Right-click on the uploaded notebook in Google Drive.
  - Choose "Open with" and select "Google Colaboratory."
- Run the Notebook:
  - Once the notebook is open in Google Colab, run each code cell by clicking the "Play" button or using the keyboard shortcut Shift + Enter.
  - Follow the instructions provided within the notebook for each section.
Note for Local Execution:
If you prefer to run the notebook locally, keep in mind the following considerations:
- Make sure the file import paths are correctly set to reflect your local directory structure.
- You need Python and the required dependencies installed locally.
- For optimal performance, using Google Colab is recommended due to its access to GPU resources and ease of use.
File Naming
Proper file naming is one of the most critical aspects of using this code effectively. The file names play a crucial role in understanding the experimental context and enabling cumulative analysis for replicated experiments.
Naming Convention
The naming convention for your data files should follow the format:
20230830_ Sample 1_1.R64
- 20230830: Date of the experiment in the format YYYYMMDD.
- Sample 1: Name or identifier for the sample (no underscores allowed).
- 1: Number indicating the replica of the experiment.
This naming format is essential for correctly interpreting and analyzing your data. It allows the code to associate different samples and replicas within the same experimental context.
Importance of Consistency
Consistency in naming is crucial, especially when dealing with multiple replicates of the same experiment. Cumulative analysis, which relies on matching file names, depends on the consistent use of this naming convention.
Note on Mask File Extension
If you are using a .tif or .i64 mask file alongside your data files, keep the same naming convention for the text before the file extension. For example:
20230830_ Sample 1_1.tif
This name follows the same date, sample name, and replica number format as the data file.
Caution
- Use exactly two underscores in each file name; any other number prevents the code from correctly interpreting and analyzing the data.
- Inconsistent or incorrect naming can lead to errors in the code's execution.
By adhering to this standardized file naming convention, you ensure smooth execution of the code and accurate representation of your experimental data.
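The naming rules above can be checked before uploading files. The following is a minimal sketch, not the notebook's own parser: the names `ParsedName`, `parse_flim_name` and `same_acquisition` are hypothetical, and stripping the spaces around each field is an assumption.

```python
from pathlib import Path
from typing import NamedTuple


class ParsedName(NamedTuple):
    date: str     # experiment date, YYYYMMDD
    sample: str   # sample identifier (surrounding spaces stripped, an assumption)
    replica: int  # replica number


def parse_flim_name(filename: str) -> ParsedName:
    """Split '<date>_<sample>_<replica>.<ext>' into its three fields."""
    stem = Path(filename).stem  # drop the extension (.R64, .ref, .tif, ...)
    parts = stem.split("_")
    if len(parts) != 3:
        raise ValueError(f"expected exactly two underscores in {filename!r}")
    date, sample, replica = (p.strip() for p in parts)
    if len(date) != 8 or not date.isdigit():
        raise ValueError(f"date must be YYYYMMDD, got {date!r}")
    return ParsedName(date, sample, int(replica))


def same_acquisition(data_file: str, mask_file: str) -> bool:
    """True when a mask file shares date, sample and replica with a data file."""
    return parse_flim_name(data_file) == parse_flim_name(mask_file)
```

For example, `parse_flim_name("20230830_ Sample 1_1.R64")` yields the date `20230830`, the sample `Sample 1` and the replica `1`, while a name with three underscores raises a `ValueError`.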
Features
- Dataset Creation:
  - Generate two datasets: one for collecting identified phasors and another containing detailed information about each data point imported from files.
- Phasors and ROIs Visualization:
  - Visualize phasor shifts to offer an intuitive understanding of variations and trends within the data.
- Morphological Analysis and Clustering:
  - Perform morphological analysis on the datasets, including clustering functionalities to group similar data points together.
- Statistical Analysis:
  - Conduct statistical analysis on sample distributions of lifetime, G, or S, to provide insight into the underlying data characteristics.
- Detection of Coexisting Physical States:
  - Identify molar fractions or intensity fractions of the coexisting physical states of a drug within nanoparticles, to provide insight into complex systems.
Usage
Follow these steps to effectively use the features of this project in Google Colab:
Pre-processing Operations
- Import Data:
  - The code is designed to read data from .R64 and .ref file formats.
  - Files need to be uploaded to the file section of Google Colab.
  - You can either upload the files manually or drag and drop them into the file section.
  - The code will automatically process all uploaded files.
- File Formats:
  - Supported formats: .R64, .ref and potentially .ifli.
  - Please note that the .ifli format is not advised due to its large file size.
- Data Format and Structure:
  - The uploaded .R64 and .ref files should each contain the following arrays:
    - Intensity: Array representing the intensity values.
    - 1st Harmonic Phase: Array representing the phase values of the 1st harmonic.
    - 1st Harmonic Module: Array representing the module values of the 1st harmonic.
    - 2nd Harmonic Phase: Array representing the phase values of the 2nd harmonic.
    - 2nd Harmonic Module: Array representing the module values of the 2nd harmonic.
  - Ensure that the data arrays are properly aligned and correspond to each other.
  - If your data files do not match this structure, you might need to preprocess or reformat the data to fit the required arrangement.
- Pre-Processing Operations:
  - Run the cells that define all the necessary functions before proceeding with the main code execution.
  - Review and verify the input parameters that will be used in the main code section.
  - Ensure that the parameters are correctly set according to your specific use case and requirements.
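The phase and module arrays listed above map to phasor coordinates through the standard polar-to-Cartesian relation G = m cos(phi), S = m sin(phi). The sketch below applies it to a single pixel. The function name `polar_to_phasor` is hypothetical, and it assumes the phase is stored in degrees; if your files store radians, skip the conversion.

```python
import math
from typing import Tuple


def polar_to_phasor(phase_deg: float, module: float) -> Tuple[float, float]:
    """Convert one pixel's (phase, module) pair to phasor coordinates (G, S)."""
    phi = math.radians(phase_deg)  # assumption: phase is stored in degrees
    g = module * math.cos(phi)     # G is the cosine (real) component
    s = module * math.sin(phi)     # S is the sine (imaginary) component
    return g, s
```

The same relation applies to the 2nd harmonic arrays, and it vectorizes directly if the arrays are held in an array library.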
Input Parameters
Before running the code, make sure to set the following input parameters according to your specific needs:
- Threshold Selection:
  - Set the thresholding method by commenting out the methods you're not using: Otsu, Multi-Otsu, or Custom.
- Extent of Median Filter:
  - Adjust the extent of the median filter applied to the data. Alternatively, you can use a Gaussian filter.
- Frequency of FLIM Signal:
  - Specify the frequency at which the FLIM signal was collected. The default is set to 80 MHz.
- Minimum Pixels for Phasor Detection:
  - Set the minimum number of pixels required to detect the presence of a phasor.
- Minimum Pixels for ROI Detection:
  - Specify the minimum number of pixels required to define the presence of a Region of Interest (ROI).
- ROI Contour Levels and Lower Frequency Levels:
  - Define the number of contour levels used to detect an ROI (default: 8 levels).
  - Set the number of lower frequency levels to discard (default: 3 levels).
- Signal Percentile for Contour Upper Limit:
  - Adjust the signal percentile used to set an upper boundary for the contour plot (default: 95th percentile).
- Type of Analysis:
  - Choose the type of analysis: "Cumulative" or "Single File."
  - For cumulative analysis, the code relies on the input file names to associate samples and replicas.
- Cell Segmentation:
  - By default, set to off. Set to on if you want to use the Cellpose algorithm for cell segmentation.
  - To enable cell segmentation, use the notebook cell labeled "Cell Segmentation".
- Reference Points:
  - By default, set to off. Set to on if you want to plot reference points on the phasor plot.
  - To enable reference points, use the notebook cell labeled "Insert Refplot".
- File extension:
  - By default set to .R64; can be changed to .ref.
- Tau mode:
  - By default set to phase; can be changed to module.
By carefully configuring these input parameters, you can tailor the analysis to your data and research objectives. Make sure to review and adjust these settings before running the code for accurate and meaningful results.
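To illustrate how the frequency and tau mode settings interact, the sketch below computes a lifetime from phasor coordinates with the standard single-exponential relations: tau_phase = S / (omega G) and tau_module = sqrt(1 / (G^2 + S^2) - 1) / omega, where omega = 2 pi f. The function name `phasor_lifetime` and its parameter names are hypothetical and may differ from the notebook's own code.

```python
import math


def phasor_lifetime(g, s, frequency_mhz=80.0, tau_mode="phase"):
    """Return the lifetime in nanoseconds for one phasor point (G, S)."""
    omega = 2.0 * math.pi * frequency_mhz * 1e6  # angular frequency in rad/s
    if tau_mode == "phase":
        # Phase lifetime: tan(phi) = S / G = omega * tau (undefined for G == 0)
        tau = s / (omega * g)
    elif tau_mode == "module":
        # Module lifetime: m^2 = G^2 + S^2 = 1 / (1 + (omega * tau)^2)
        m2 = g * g + s * s
        tau = math.sqrt(1.0 / m2 - 1.0) / omega  # requires 0 < m2 <= 1
    else:
        raise ValueError("tau_mode must be 'phase' or 'module'")
    return tau * 1e9  # seconds to nanoseconds
```

For a point lying exactly on the universal semicircle, both modes return the same value; they diverge for multi-exponential decays, which is why the choice of mode matters.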
Dataset Acquisition
Follow the steps below to run the analysis smoothly and accurately.
Prerequisite Operations
- Pre-processing and File Naming:
  - Prior to initiating the dataset acquisition process, confirm that you've completed all necessary pre-processing operations on your data.
  - Verify that the file naming conventions adhere to the required standards for the code's proper functioning.
- Local Execution Adjustments:
  - If you intend to run the code locally on your machine, you must modify a specific variable within the code.
  - Locate the path variable in the code and replace the existing 'content' path with the desired custom path.
# Change this line
path = 'content'
# To something like this
path = '/path/to/your/data/directory'
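When running locally, it can help to confirm that the chosen path actually contains the expected data files before starting the analysis. The helper below is a small sketch for that check. `list_flim_files` is a hypothetical name and is not part of the notebook.

```python
from pathlib import Path


def list_flim_files(path, extension=".R64"):
    """Return the files in `path` with the given extension, sorted by name."""
    folder = Path(path)
    return sorted(
        f for f in folder.iterdir()
        if f.is_file() and f.suffix.lower() == extension.lower()
    )
```

Calling `list_flim_files(path, ".ref")` with the same `path` variable lists the .ref files instead; an empty result usually means the path or the extension setting is wrong.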
Code Functionality Overview
The code identifies regions of interest through two distinct methods: contour plots and phasor analysis. It produces images and two dataframes: df and df_dataset.
df: This dataframe collects the identified phasors, with the following columns.
index | Sample | G | G std | S | S std | lifetime | lifetime std | sampled points | Contour PCA variability ratio | ContourIdx | phasor
df_dataset: This dataframe holds the details of each individual data point within the analyzed pixels, with the following columns.
Sample | Pixel | G | S | lifetime | Intensity | ROI | phasor
Morphological Analysis and Clustering
In this section, you can visualize Regions of Interest (ROIs) and perform morphological analysis on the data. The provided functionality allows you to select specific ROIs based on the sample name and ROI index.
Visualizing ROIs
- Select Sample and ROI:
  - Using the dropdown menu, choose a sample name and ROI index to visualize the corresponding region.
  - Note: ROI 0 represents the group of data discarded from any ROI.
- Intensity and Lifetime Maps:
  - The analysis generates intensity and lifetime maps, highlighting the distribution of the selected ROI.
- Cluster Analysis:
  - You have the option to perform spatial analysis by clustering. Input the desired number of clusters in the dropdown menu.
  - If the cluster parameter is set to a number below 2, the analysis uses 2 clusters; otherwise, it uses the specified input.
The morphological analysis section returns the following maps:
- Intensity Map: Visualizes the intensity distribution within the selected ROI.
- Lifetime Map: Displays the lifetime distribution within the selected ROI.
- Clustering Map: Offers spatial insight through clustering analysis.
Use these maps to gain a deeper understanding of the morphological characteristics of the selected ROIs and explore spatial relationships within your data.
Statistical Analysis of ROIs
The Statistical Analysis of ROIs section provides a comprehensive tool for comparing variables between two selected ROIs within a sample. To begin the analysis, follow these steps:
- Select Sample and ROIs:
  - Use the dropdown menu to choose the sample name and two ROI entries for comparison.
- Choose Variable:
  - In the same dropdown menu, pick a variable (G, S, or lifetime) to analyze and compare between the two ROIs.
- Generated Images:
  - The analysis generates three images for comparison:
    - Variable Distributions: Illustrates the distribution of the chosen variable in both ROIs.
    - Cumulative Distributions: Visualizes the cumulative distribution of the variable in each ROI.
    - Boxplot (and Violinplot) Visualization: Presents a boxplot comparison of the variable in the two ROIs. A violinplot view is also available.
Regarding Statistical Tests:
- The code employs the Kolmogorov-Smirnov and Mann-Whitney tests with Bonferroni correction on the distributions fitted to a Gaussian Kernel Density Estimation (KDE).
- To ensure robustness, the statistical tests are performed on automatically downsampled data, which avoids test failures caused by excessively large sample sizes.
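The testing procedure described above can be sketched with SciPy as follows. This is an illustration under stated assumptions, not the notebook's implementation: it runs the tests on the raw (downsampled) values and omits the KDE step, and the function name `compare_rois` and the parameters `max_points` and `seed` are hypothetical.

```python
import numpy as np
from scipy import stats


def compare_rois(values_a, values_b, max_points=1000, seed=0):
    """Compare two ROI distributions with KS and Mann-Whitney tests."""
    rng = np.random.default_rng(seed)
    a = np.asarray(values_a, dtype=float)
    b = np.asarray(values_b, dtype=float)
    # Downsample without replacement so that very large ROIs do not
    # dominate the outcome of the tests.
    if a.size > max_points:
        a = rng.choice(a, size=max_points, replace=False)
    if b.size > max_points:
        b = rng.choice(b, size=max_points, replace=False)
    ks = stats.ks_2samp(a, b)
    mw = stats.mannwhitneyu(a, b, alternative="two-sided")
    n_tests = 2  # Bonferroni: multiply each p-value by the number of tests
    return {
        "ks_p_adjusted": float(min(1.0, ks.pvalue * n_tests)),
        "mw_p_adjusted": float(min(1.0, mw.pvalue * n_tests)),
    }
```

The Bonferroni factor here counts only the two tests run in this function; if more comparisons are made in one session, the factor should grow accordingly.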
In addition, the code provides the following statistical metrics for both distributions:
- 25th Percentile
- 50th Percentile (Median)
- 75th Percentile
- Central Tendency
- Spread
- Skewness
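The metrics listed above can be reproduced with the Python standard library. The sketch below assumes that "Central Tendency" means the arithmetic mean and "Spread" means the population standard deviation; the source does not specify either, and the function name `describe` is hypothetical.

```python
import statistics


def describe(values):
    """Summarize one distribution with quartiles, mean, spread and skewness."""
    data = sorted(float(v) for v in values)
    # 'inclusive' interpolates between data points, treating the data as the
    # full population rather than a sample of it.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    mean = statistics.fmean(data)   # assumed measure of central tendency
    std = statistics.pstdev(data)   # assumed measure of spread
    if std > 0:
        # Fisher-Pearson moment coefficient of skewness
        skew = sum((v - mean) ** 3 for v in data) / (len(data) * std ** 3)
    else:
        skew = 0.0
    return {
        "p25": q1,
        "p50": median,
        "p75": q3,
        "central_tendency": mean,
        "spread": std,
        "skewness": skew,
    }
```

A positive skewness indicates a tail toward larger values, which is common for lifetime distributions that include a minority of long-lived pixels.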
Intensity and Molar Fraction Analysis
In this analysis, the df dataframe of phasors is expanded to match the anticipated number of physical states. For instance, if we're considering three physical states (e.g., free-in-solution, membrane-bound, crystal drug), three additional columns are appended to df, each representing a state.
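For three states, one common way to obtain such fractions relies on the linear combination rule of phasors: a measured phasor lies inside the triangle formed by the three pure-state phasors, and its barycentric coordinates are the fractional intensities of the states. The sketch below shows that calculation. It is an assumption about the approach rather than the notebook's code, and `three_state_fractions` is a hypothetical name.

```python
def three_state_fractions(point, state_a, state_b, state_c):
    """Barycentric coordinates of `point` in the triangle of three phasors.

    Each argument is a (G, S) pair. The three returned values sum to 1.
    """
    (g, s), (ga, sa), (gb, sb), (gc, sc) = point, state_a, state_b, state_c
    det = (sb - sc) * (ga - gc) + (gc - gb) * (sa - sc)
    if det == 0:
        raise ValueError("the three reference phasors are collinear")
    fa = ((sb - sc) * (g - gc) + (gc - gb) * (s - sc)) / det
    fb = ((sc - sa) * (g - gc) + (ga - gc) * (s - sc)) / det
    fc = 1.0 - fa - fb
    return fa, fb, fc
```

A fraction outside the range 0 to 1 means the measured phasor falls outside the reference triangle. Converting intensity fractions into molar fractions additionally requires the relative brightness of each state, which this sketch does not cover.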
Cellular Metabolism Analysis
- NADH Ratio Calculation:
  - The nadh_ratio function calculates the fractions of NADH in the bound and free states for each phasor data point stored in the df_common_filtered dataframe.
  - This computation relies on the semicircle_intersection function, which determines the position of bound NADH.
  - The semicircle_intersection function takes as input the lifetime of free NADH (0.37 ns) and the phasor coordinates of each phasor.
  - If needed, the same method can be applied to analyze the fractional intensity of FAD (Flavin Adenine Dinucleotide) by substituting the 0.37 ns value with the appropriate lifetime value for free FAD.
- NADH analysis visualization:
  - The intersection points, representing free and bound NADH, can be visualized on the phasor plot.
  - The distribution of free and bound NADH can also be presented as a violin plot.
  - The implementation also includes a map of metabolism.
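The geometry behind this calculation can be sketched as follows. The free NADH phasor lies on the universal semicircle (center (0.5, 0), radius 0.5); the line from it through a measured phasor meets the circle again at the bound position, and the bound fraction is the relative distance of the measured point along that segment. This is an independent illustration, not the notebook's nadh_ratio or semicircle_intersection code; `phasor_of_lifetime` and `bound_fraction` are hypothetical names, and 80 MHz is taken from the default frequency stated earlier.

```python
import math


def phasor_of_lifetime(tau_ns, frequency_mhz=80.0):
    """Phasor (G, S) of a single-exponential decay with lifetime tau_ns."""
    x = 2.0 * math.pi * frequency_mhz * 1e6 * tau_ns * 1e-9  # omega * tau
    return 1.0 / (1.0 + x * x), x / (1.0 + x * x)


def bound_fraction(g, s, tau_free_ns=0.37, frequency_mhz=80.0):
    """Return ((G, S) of the bound position, fractional intensity of bound)."""
    gf, sf = phasor_of_lifetime(tau_free_ns, frequency_mhz)
    dg, ds = g - gf, s - sf  # direction from the free phasor to the data point
    norm2 = dg * dg + ds * ds
    if norm2 == 0.0:
        raise ValueError("phasor coincides with the free phasor")
    # Points on the line are F + t * d. F is already on the circle, so the
    # second intersection is at t = -2 (F - C) . d / (d . d), with C = (0.5, 0).
    t = -2.0 * ((gf - 0.5) * dg + sf * ds) / norm2
    if t <= 0.0:
        raise ValueError("line does not cross the semicircle beyond the free phasor")
    bound = (gf + t * dg, sf + t * ds)
    # The data point sits at t = 1, so its relative distance from free is 1 / t.
    return bound, 1.0 / t
```

The free fraction is one minus the returned value. A result above 1 means the measured phasor lies outside the semicircle, which usually points to noise or a calibration problem.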
Literature Key References