icbi-lab / infercnvpy

Infer copy number variation (CNV) from scRNA-seq data. Plays nicely with Scanpy.
https://infercnvpy.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License
138 stars 27 forks source link

How to adjust the scale bar of chromosome_heatmap #38

Closed hyjforesight closed 2 years ago

hyjforesight commented 2 years ago

Hello inferCNVpy, It seems vmin and vmax don't work for cnv.pl.chromosome_heatmap(). Appreciate it if you could tell us How to adjust the scale bar of chromosome_heatmap. Thanks! Best, YJ

cnv.pl.chromosome_heatmap(adata, groupby="batch", dendrogram=False, figsize=(20, 10), vmin=-0.25, vmax=0.25, save='_inferCNVpy_ACT_ACI_WI.svg')
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_27724/3683230475.py in <module>
----> 1 cnv.pl.chromosome_heatmap(adata, groupby="batch", dendrogram=False, figsize=(20, 10), vmin=-0.25, vmax=0.25, save='_inferCNVpy_ACT_ACI_WI.pdf')

~\anaconda3\envs\HYJ_py38\lib\site-packages\infercnvpy\pl\_chromosome_heatmap.py in chromosome_heatmap(adata, groupby, use_rep, cmap, figsize, show, save, **kwargs)
     71     var_group_positions = list(zip(chr_pos, chr_pos[1:] + [tmp_adata.shape[1]]))
     72 
---> 73     return_ax_dic = sc.pl.heatmap(
     74         tmp_adata,
     75         var_names=tmp_adata.var.index.values,

~\anaconda3\envs\HYJ_py38\lib\site-packages\scanpy\plotting\_anndata.py in heatmap(adata, var_names, groupby, use_raw, log, num_categories, dendrogram, gene_symbols, var_group_positions, var_group_labels, var_group_rotation, layer, standard_scale, swap_axes, show_gene_labels, show, save, figsize, vmin, vmax, vcenter, norm, **kwds)
   1119 
   1120     colorbar_width = 0.2
-> 1121     norm = check_colornorm(vmin, vmax, vcenter, norm)
   1122 
   1123     if not swap_axes:

~\anaconda3\envs\HYJ_py38\lib\site-packages\scanpy\plotting\_utils.py in check_colornorm(vmin, vmax, vcenter, norm)
   1199     if norm is not None:
   1200         if (vmin is not None) or (vmax is not None) or (vcenter is not None):
-> 1201             raise ValueError('Passing both norm and vmin/vmax/vcenter is not allowed.')
   1202     else:
   1203         if vcenter is not None:

ValueError: Passing both norm and vmin/vmax/vcenter is not allowed.
cnv.pl.chromosome_heatmap(adata, groupby="batch", dendrogram=False, figsize=(20, 10), vmin=-0.25, vmax=0.25, norm=None, save='_inferCNVpy_ACT_ACI_WI.pdf')
TypeError                                 Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_27724/1442302865.py in <module>
----> 1 cnv.pl.chromosome_heatmap(adata, groupby="batch", dendrogram=False, figsize=(20, 10), vmin=-0.25, vmax=0.25, norm=None, save='_inferCNVpy_ACT_ACI_WI.pdf')

~\anaconda3\envs\HYJ_py38\lib\site-packages\infercnvpy\pl\_chromosome_heatmap.py in chromosome_heatmap(adata, groupby, use_rep, cmap, figsize, show, save, **kwargs)
     71     var_group_positions = list(zip(chr_pos, chr_pos[1:] + [tmp_adata.shape[1]]))
     72 
---> 73     return_ax_dic = sc.pl.heatmap(
     74         tmp_adata,
     75         var_names=tmp_adata.var.index.values,

TypeError: heatmap() got multiple values for keyword argument 'norm'
grst commented 2 years ago

Thanks for reporting this. The issue is that I manually specify a TwoSlopeNorm to make sure the colorbar is centered: https://github.com/icbi-lab/infercnvpy/blob/1884397fa5ef927bbe0ca8342b91f13f5fd11afe/infercnvpy/pl/_chromosome_heatmap.py#L69

Unfortunately, this parameter can currently not be overriden. I should fix that, so please let's keep the issue open until I find time (or feel free to send a PR).

As a quick & dirty workaround, you can use np.clip on adata.obsm["X_cnv"]:

# didn't test it, but i'd expect this to work like this: 
adata.obsm["X_cnv"] = np.clip(adata.obsm["X_cnv"], -0.25, 0.25)
hyjforesight commented 2 years ago

Hello @grst Thanks for the response. Please check the error below. The reason why I want to change the scale bar is that I want to make the red and blue colors look more obvious, which now looks very weak. The readers cannot tell the difference between the 2 groups.

image

In the tutorial, red and blue colors are very obvious. image

Best, YJ

adata.obsm["X_cnv"] = np.clip(adata.obsm["X_cnv"], -0.25, 0.25)
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_21184/386795464.py in <module>
----> 1 adata.obsm["X_cnv"] = np.clip(adata.obsm["X_cnv"], -0.25, 0.25)

<__array_function__ internals> in clip(*args, **kwargs)

~\anaconda3\envs\HYJ_py38\lib\site-packages\numpy\core\fromnumeric.py in clip(a, a_min, a_max, out, **kwargs)
   2113 
   2114     """
-> 2115     return _wrapfunc(a, 'clip', a_min, a_max, out=out, **kwargs)
   2116 
   2117 

~\anaconda3\envs\HYJ_py38\lib\site-packages\numpy\core\fromnumeric.py in _wrapfunc(obj, method, *args, **kwds)
     52     bound = getattr(obj, method, None)
     53     if bound is None:
---> 54         return _wrapit(obj, method, *args, **kwds)
     55 
     56     try:

~\anaconda3\envs\HYJ_py38\lib\site-packages\numpy\core\fromnumeric.py in _wrapit(obj, method, *args, **kwds)
     41     except AttributeError:
     42         wrap = None
---> 43     result = getattr(asarray(obj), method)(*args, **kwds)
     44     if wrap:
     45         if not isinstance(result, mu.ndarray):

~\anaconda3\envs\HYJ_py38\lib\site-packages\numpy\core\_methods.py in _clip(a, min, max, out, casting, **kwargs)
    157             um.maximum, a, min, out=out, casting=casting, **kwargs)
    158     else:
--> 159         return _clip_dep_invoke_with_casting(
    160             um.clip, a, min, max, out=out, casting=casting, **kwargs)
    161 

~\anaconda3\envs\HYJ_py38\lib\site-packages\numpy\core\_methods.py in _clip_dep_invoke_with_casting(ufunc, out, casting, *args, **kwargs)
    111     # try to deal with broken casting rules
    112     try:
--> 113         return ufunc(*args, out=out, **kwargs)
    114     except _exceptions._UFuncOutputCastingError as e:
    115         # Numpy 1.17.0, 2019-02-24

~\anaconda3\envs\HYJ_py38\lib\site-packages\scipy\sparse\base.py in __bool__(self)
    281             return self.nnz != 0
    282         else:
--> 283             raise ValueError("The truth value of an array with more than one "
    284                              "element is ambiguous. Use a.any() or a.all().")
    285     __nonzero__ = __bool__

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all().
grst commented 2 years ago

looks like clip doesn't work on sparse matrices.

Can you try

adata.obsm["X_cnv"].data = np.clip(adata.obsm["X_cnv"].data, -0.25, 0.25)

?

Apart from that, I really don't see any real pattern in your example. Are you sure to expect one? Maybe something else went wrong, like swapped sample labels or errors during preprocessing?

hyjforesight commented 2 years ago

Hello @grst Thanks for the response. adata.obsm["X_cnv"].data = np.clip(adata.obsm["X_cnv"].data, -0.25, 0.25) works well and I got a pattern like this. image

Does it look like a false positive? We've no idea whether we should get a pattern or not, because we did a gene knock out on mice and we don't know whether this gene can induce CNV. So we use your package, trying to address this question.

I paste all my codes here. Not sure whether I did something wrong. Appreciate it if you could help us to check. Because our aim is to compare whether these 2 samples have the same pattern of CNV, so no reference was included.

adata = sc.read('C:/Users/Park_Lab/Documents/ACTWT_ACTKO_adata_sub.h5ad')
AnnData object with n_obs × n_vars = 25246 × 5000
    obs: 'predicted_doublets', 'batch', 'leiden', 'n_genes', 'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'total_counts_rpl', 'pct_counts_rpl', 'total_counts_rps', 'pct_counts_rps', 'cell_type'
    var: 'gene_ids', 'feature_types', 'n_cells', 'mt', 'rpl', 'rps', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'mean', 'std'
    uns: 'batch_colors', 'cell_type_colors', 'hvg', 'leiden', 'leiden_colors', 'neighbors', 'pca', 'rank_genes_groups', 'umap'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'
    obsp: 'connectivities', 'distances'
cnv.io.genomic_position_from_gtf(gtf_file='C:/Users/Park_Lab/Documents/genes.gtf', adata=adata, gtf_gene_id='gene_name', inplace=True)
adata.var.loc[:, ["chromosome", "start", "end"]].head()
cnv.tl.infercnv(adata, reference_key='batch', window_size = 100, step = 10, exclude_chromosomes=('chrMT', 'chrX', 'chrY'), n_jobs=16)
adata.obsm["X_cnv"].data = np.clip(adata.obsm["X_cnv"].data, -0.25, 0.25)
cnv.pl.chromosome_heatmap(adata, groupby="batch", dendrogram=False, figsize=(10, 4), save='_inferCNVpy_ACT_ACI_WI.svg')
cnv.tl.pca(adata, svd_solver='arpack')
cnv.pp.neighbors(adata, n_neighbors=15, n_pcs=50, knn=True)
cnv.tl.leiden(adata, resolution=0.5)
cnv.tl.umap(adata)
cnv.tl.cnv_score(adata)
cnv.pl.umap(adata[adata.obs['batch']=='ACTWT'], color="cnv_leiden", use_raw=True, frameon=False, legend_loc='right margin')
cnv.pl.umap(adata[adata.obs['batch']=='ACTKO'], color="cnv_leiden", use_raw=True, frameon=False, legend_loc='right margin')

image image

In another case, we included wild-type tissue as a reference for calculating CNV and we got heatmap like below. Does it mean that tumor has no CNV? Why wild-type also has red and blue colors?

adata = ACT.concatenate(ACI, WI, batch_categories=['ACT', 'ACI', 'WI'], index_unique=None)    # WI is wild type. ACT is tumor.
cnv.io.genomic_position_from_gtf(gtf_file='C:/Users/Park_Lab/Documents/genes.gtf', adata=adata, gtf_gene_id='gene_name', inplace=True)
cnv.tl.infercnv(adata, reference_key='batch', reference_cat='WI', n_jobs=16)
cnv.pl.chromosome_heatmap(adata, groupby="batch", dendrogram=True, save='_inferCNVpy_ACT_ACI_WI.svg')

image

Thanks! Best, YJ

grst commented 2 years ago

is adata.X normalized and log-transformed?

Other than that nothing seems wrong with your code.

Does it mean that tumor has no CNV? Why wild-type also has red and blue colors?

some random background noise is to be expected, since the CNV estimates are essentially just smoothed gene expression by genomic position, and the gene expression data is inherently noisy. See https://icbi-lab.github.io/infercnvpy/infercnv.html#computation-steps for a more detailed description of the method.

hyjforesight commented 2 years ago

Hello @grst Yes, the input data was normalized and log-transformed by Scanpy, following your instruction (https://icbi-lab.github.io/infercnvpy/infercnv.html#computation-steps). Thanks!