We present FISHnCHIPs for highly sensitive in situ profiling of cell types and gene expression programs. FISHnCHIPs achieves this by simultaneously imaging ∼2-35 co-expressed genes that are spatially co-localized in tissues, yielding spatial information similar to that of single-gene FISH at ∼2-20-fold higher sensitivity. See https://www.nature.com/articles/s41467-024-46669-y. This software guides users through designing and evaluating their own gene panel using scRNA-seq data as input. We also provide the FISHnCHIPs data analysis pipeline that we used to process the raw FISHnCHIPs data and cluster the module-cell matrix to define cell types.
A computer that can run Python and/or R, with at least 16 GB of RAM. No non-standard hardware is required.
For Gene Panel Design:
For Image Processing:
Packages common to both gene panel design and data processing:
For Gene Panel Design:
For Image Processing:
For manuscript figures (R packages):
The tutorial for the gene panel design and evaluation workflow is provided as a Jupyter notebook file, Gene panel design tutorial.ipynb
in the FISHnCHIPs_GenePanelDesign_Tutorial folder.
The tutorial calls various functions from the FISHnCHIPS_0.1.0.py
package in the scripts folder.
get_panel
(used for both cell-centric and gene-centric panel design)
Function returns a DataFrame listing the genes selected for the panel according to the hyperparameters, together with each gene's correlation with the reference marker gene and the cluster it belongs to.
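The selection step described above can be illustrated with a minimal sketch, assuming genes are kept when their Pearson correlation with the reference marker meets a threshold. The function name select_correlated_genes, the min_corr parameter and the toy expression values are hypothetical and are not taken from FISHnCHIPS_0.1.0.py.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_correlated_genes(expression, marker, cluster, min_corr=0.7):
    """Return one row per gene whose correlation with the marker is >= min_corr."""
    rows = []
    for gene, values in expression.items():
        corr = pearson(values, expression[marker])
        if corr >= min_corr:
            rows.append({"gene": gene, "correlation": corr, "cluster": cluster})
    # Highest correlation first; the marker itself correlates perfectly.
    return sorted(rows, key=lambda r: r["correlation"], reverse=True)


# Toy gene-by-cell expression values (hypothetical).
expression = {
    "MarkerA": [5.0, 4.0, 0.0, 0.0],
    "GeneB": [4.0, 5.0, 0.0, 1.0],
    "GeneC": [0.0, 1.0, 5.0, 4.0],
}
panel = select_correlated_genes(expression, "MarkerA", cluster="type_1")
```

In this toy input GeneB tracks the marker and is kept, while GeneC is anti-correlated and is dropped.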
get_filtered_genes
(only for gene-centric panel design)
Function prefilters genes to ensure that only adequately expressed genes are retained. It removes genes with a low expression level, genes whose names match specific patterns, and genes present in fewer than the minimum or more than the maximum number of cells.
Returns the list of genes that pass the filtering requirements.
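The filtering criteria above can be sketched as follows. This is only an illustration under stated assumptions: the function name filter_genes, its parameter names, the example name patterns (mitochondrial and ribosomal prefixes) and the toy counts are all hypothetical and may differ from what get_filtered_genes actually does.

```python
import re


def filter_genes(expression, min_mean=0.1, min_cells=2, max_cells=None,
                 exclude_patterns=(r"^MT-", r"^RP[SL]")):
    """Return the names of genes that pass all prefiltering criteria."""
    kept = []
    for gene, counts in expression.items():
        # Drop genes whose names match any excluded pattern.
        if any(re.search(p, gene) for p in exclude_patterns):
            continue
        # Drop genes with low mean expression across all cells.
        if sum(counts) / len(counts) < min_mean:
            continue
        # Drop genes detected in too few or too many cells.
        n_cells = sum(1 for c in counts if c > 0)
        if n_cells < min_cells:
            continue
        if max_cells is not None and n_cells > max_cells:
            continue
        kept.append(gene)
    return kept


# Toy gene-by-cell counts (hypothetical).
counts = {
    "MT-CO1": [9, 8, 7, 9],  # excluded by name pattern
    "GeneA": [3, 0, 2, 0],   # passes every criterion
    "GeneB": [0, 0, 0, 4],   # detected in only one cell
    "GeneC": [1, 1, 1, 1],   # detected in more than max_cells cells
}
filtered = filter_genes(counts, min_mean=0.5, min_cells=2, max_cells=3)
```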
leiden_corr_matrix
(only for gene-centric panel design)
As no reference marker file is available for gene-centric panel design, the genes have to be clustered using algorithms based on the gene-gene correlation.
Function uses the Leiden algorithm to cluster the genes and removes genes whose correlation with every other gene is below the threshold. It returns a tuple containing a DataFrame of the cluster that each selected gene belongs to and the cluster network graph of the selected genes.
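The removal of weakly correlated genes can be sketched on its own. The example below covers only that pruning step and does not implement Leiden clustering; the function name drop_weakly_correlated, the min_corr parameter and the toy correlation values are hypothetical.

```python
def drop_weakly_correlated(corr, min_corr=0.5):
    """Keep genes that have at least one other gene correlated at or above min_corr."""
    kept = [
        gene for gene, row in corr.items()
        if any(value >= min_corr for other, value in row.items() if other != gene)
    ]
    # Restrict the matrix to the surviving genes.
    return {g: {h: corr[g][h] for h in kept} for g in kept}


# Toy symmetric gene-gene correlation matrix (hypothetical).
corr = {
    "GeneA": {"GeneA": 1.0, "GeneB": 0.8, "GeneC": 0.1},
    "GeneB": {"GeneA": 0.8, "GeneB": 1.0, "GeneC": 0.2},
    "GeneC": {"GeneA": 0.1, "GeneB": 0.2, "GeneC": 1.0},
}
pruned = drop_weakly_correlated(corr, min_corr=0.5)
```

The pruned matrix could then be treated as a weighted graph and passed to a Leiden implementation; which implementation the package relies on is not stated here.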
get_cumulative_signals
(only for cell-centric panel evaluation)
Function calculates the signal gain, conserved signal gain and Signal Specificity Ratio (SSR) of each gene from the gene-cell expression matrix restricted to the genes selected during panel design, and returns the results as a DataFrame.
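As a rough illustration of these metrics, the sketch below uses definitions that are assumptions rather than the package's own: signal gain is taken as the mean cumulative signal in target cells divided by the mean signal of the reference marker alone, and SSR as the mean cumulative signal in target cells divided by that in off-target cells. Conserved signal gain is not covered. The function name, the convention that the first panel gene is the reference marker, and the toy values are hypothetical.

```python
def cumulative_signal_metrics(expression, panel_genes, target_cells):
    """Per-gene cumulative signal gain and SSR under the assumed definitions.

    Assumes both the target and off-target cell groups are non-empty.
    """
    n_cells = len(next(iter(expression.values())))
    off_target = [i for i in range(n_cells) if i not in target_cells]
    marker = panel_genes[0]  # assumption: first gene is the reference marker

    def mean_over(values, cells):
        return sum(values[i] for i in cells) / len(cells)

    marker_signal = mean_over(expression[marker], target_cells)
    cumulative = [0.0] * n_cells
    rows = []
    for gene in panel_genes:
        # Add this gene's expression to the running per-cell total.
        cumulative = [c + v for c, v in zip(cumulative, expression[gene])]
        on = mean_over(cumulative, target_cells)
        off = mean_over(cumulative, off_target)
        rows.append({
            "gene": gene,
            "signal_gain": on / marker_signal,
            "ssr": on / off if off > 0 else float("inf"),
        })
    return rows


# Toy gene-by-cell expression; cells 0 and 1 are the target cell type.
expression = {"MarkerA": [4.0, 4.0, 0.0, 0.0], "GeneB": [4.0, 4.0, 1.0, 1.0]}
metrics = cumulative_signal_metrics(expression, ["MarkerA", "GeneB"], {0, 1})
```

In this toy case, adding GeneB doubles the signal in target cells while introducing some off-target signal, which is the trade-off the two metrics are meant to expose.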
get_cell_bit_matrix
(only for gene-centric panel evaluation)
Function returns a DataFrame that contains the expression level of each cluster of genes in each cell (cell-bit matrix).
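A cell-bit matrix of this kind can be sketched by aggregating the expression of all genes in a cluster for each cell. Summation is assumed here as the aggregation; the function name cell_bit_matrix, the cluster labels and the toy values are hypothetical.

```python
def cell_bit_matrix(expression, gene_clusters):
    """Sum the expression of all genes in each cluster, cell by cell."""
    matrix = {}
    for gene, cluster in gene_clusters.items():
        values = expression[gene]
        if cluster not in matrix:
            matrix[cluster] = [0.0] * len(values)
        matrix[cluster] = [m + v for m, v in zip(matrix[cluster], values)]
    return matrix


# Toy gene-by-cell expression and gene-to-cluster assignment (hypothetical).
expression = {
    "GeneA": [1.0, 0.0, 2.0],
    "GeneB": [3.0, 0.0, 0.0],
    "GeneC": [0.0, 5.0, 1.0],
}
gene_clusters = {"GeneA": "bit_1", "GeneB": "bit_1", "GeneC": "bit_2"}
bits = cell_bit_matrix(expression, gene_clusters)
```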
evaluate_gene_centric_panel
(only for gene-centric panel evaluation)
Function returns a DataFrame containing the signal gain for each gene and the cumulative signal gain for genes in the same cluster. The signal threshold hyperparameter was used for differentiating signal cells from background noise.
The tutorial for the analysis of FISHnCHIPs image data is provided as a Jupyter notebook file, FISHnCHIPs data analysis tutorial.ipynb
in the FISHnCHIPs_DataAnalysis_Tutorial folder.
In the tutorial, various functions were called from the FISHnCHIPS_DataAnalysis.py
package (to be packaged; it contains FISHnCHIPsImages.py, registerFunction.py and segmentationFunction.py) in the scripts folder.
FISHnCHIPsImages.py
segment_dapi_one_fov
Function saves three files in the output path folder: a DAPI image with the overlapping border removed (TIF format), an image of the dilated cell masks (TIF format) and an overlay of the cell masks on the DAPI image (JPG format).
subtract_one_image
Function returns the bleach-subtracted Hyb images in TIF format.
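Bleach subtraction can be sketched as a pixel-wise difference between a Hyb image and its bleached counterpart. The example assumes the two images are already aligned and that negative differences are clipped to zero; both are assumptions, and the function name subtract_bleach and the toy pixel values are hypothetical.

```python
def subtract_bleach(hyb, bleached):
    """Pixel-wise hyb minus bleached, clipped at zero so intensities stay non-negative."""
    return [
        [max(h - b, 0) for h, b in zip(hyb_row, bleached_row)]
        for hyb_row, bleached_row in zip(hyb, bleached)
    ]


# Toy 2x2 images as nested lists (hypothetical pixel values).
hyb = [[10, 5], [3, 8]]
bleached = [[4, 6], [3, 1]]
subtracted = subtract_bleach(hyb, bleached)
```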
segment_one_fov
An all-in-one function that performs the functions of segment_dapi_one_fov
and subtract_one_image
when the DAPI, Hyb and bleached Hyb images are all ready. It also returns a list of all cell masks for spatial and mask intensity analysis.
segmentationFunctions.py
get_centroids
Function returns the coordinates of the cell masks of each FOV as a DataFrame.
get_mask_positions_inFuse
Function returns the coordinates of the cell masks in the context where all FOVs are fused as a DataFrame.
get_mask_intensity_matrix
Function takes in the list of all cell masks from the segment_one_fov
function as input and returns 4 separate DataFrames with the mean, median, maximum and summation of mask intensity of each cell for each Hyb.
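The four per-mask statistics can be illustrated for a single image as below. This is a sketch under assumptions: masks are represented as an integer label image with 0 as background, and the function name mask_intensity_stats and the toy arrays are hypothetical rather than the representation used by get_mask_intensity_matrix.

```python
from statistics import mean, median


def mask_intensity_stats(image, labels):
    """Return mean, median, max and sum of pixel intensity for each mask label.

    labels has the same shape as image; 0 marks background.
    """
    pixels = {}
    for image_row, label_row in zip(image, labels):
        for value, label in zip(image_row, label_row):
            if label != 0:
                pixels.setdefault(label, []).append(value)
    return {
        label: {"mean": mean(vals), "median": median(vals),
                "max": max(vals), "sum": sum(vals)}
        for label, vals in pixels.items()
    }


# Toy 2x3 Hyb image and matching label image (hypothetical values).
image = [[1, 2, 9], [3, 4, 7]]
labels = [[1, 1, 0], [1, 2, 2]]
stats = mask_intensity_stats(image, labels)
```

Repeating this for every Hyb and collecting each statistic separately would give one cell-by-Hyb table per statistic.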
get_mask_info
Function takes in the list of all cell masks from the segment_one_fov
function as input and returns a DataFrame of cell mask information, including the FOVs analysed, the cell types identified, mask intensity, mask area and mask spatial positions.
get_mask_positions_inFOV
Function takes in the cell mask information from the get_mask_info
function as input and returns a tuple of FOV, x-coordinate and y-coordinate for each cell position within its FOV, which can be fed into the get_mask_positions_inFuse
function to obtain the spatial position of cell masks in the context where all FOVs are fused.
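The conversion from per-FOV to fused coordinates can be sketched as a simple offset calculation. The layout assumed here (zero-based FOV indices arranged row by row on a regular grid, with equal tile sizes and no overlap between tiles) is an assumption, as are the function name fov_to_fused and its parameters; the actual acquisition order and stitching may differ.

```python
def fov_to_fused(fov, x, y, n_cols, tile_width, tile_height):
    """Map a position inside one FOV to coordinates in the fused image."""
    # Row-major grid: FOV 0 is top-left, indices increase left to right.
    row, col = divmod(fov, n_cols)
    return col * tile_width + x, row * tile_height + y


# A cell at (10, 20) in FOV 5 of a grid with 3 columns and 100x80 tiles.
fused_x, fused_y = fov_to_fused(5, 10, 20, n_cols=3, tile_width=100, tile_height=80)
```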
Figures 3 and 4 are produced using R data visualization packages, while Figures 5 and 6 are produced with Python visualization tools. For Figures 5 and 6, the packages used are the same as in the tutorial, so simply run the Jupyter notebook to reproduce the figures. For Figures 3 and 4, functions from the standalone package capFISHImage
were used. Please install the package provided in the package folder using the following command and run the script provided accordingly:
install.packages('./package/capFISHImage_0.1.0.zip', repos=NULL, type='source')
An explanation of each of the parameters used in Gene panel design tutorial.ipynb
and FISHnCHIPs data analysis tutorial.ipynb
.
The raw scRNA-seq count matrix data was preprocessed using the Seurat pipeline to produce the cell-scaled gene-cell matrix used in this tutorial. The clustermaps shown in Figure 3c-j were also clustered using functions from the Seurat package.
Hao and Hao et al. Integrated analysis of multimodal single-cell data. Cell (2021) [Seurat V4]
The functions provided in the tutorials use the following packages that are included in the standard Anaconda distribution:
See full license here.