dpeerlab / Palantir

Single cell trajectory detection
https://palantir.readthedocs.io
GNU General Public License v2.0

ValueError: Bin edges must be unique: #56

brianpenghe closed this issue 3 years ago

brianpenghe commented 3 years ago

I ran this code and got an error:

sc.pp.normalize_per_cell(C2)
palantir.preprocess.log_transform(C2)
sc.pp.highly_variable_genes(C2, n_top_genes=1000, flavor='cell_ranger')

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-30-8125ec15df52> in <module>
----> 1 sc.pp.highly_variable_genes(C2, n_top_genes=1000, flavor='cell_ranger')

/opt/conda/lib/python3.8/site-packages/scanpy/preprocessing/_highly_variable_genes.py in highly_variable_genes(adata, layer, n_top_genes, min_disp, max_disp, min_mean, max_mean, span, n_bins, flavor, subset, inplace, batch_key)
    424 
    425     if batch_key is None:
--> 426         df = _highly_variable_genes_single_batch(
    427             adata,
    428             layer=layer,

/opt/conda/lib/python3.8/site-packages/scanpy/preprocessing/_highly_variable_genes.py in _highly_variable_genes_single_batch(adata, layer, min_disp, max_disp, min_mean, max_mean, n_top_genes, n_bins, flavor)
    242         from statsmodels import robust
    243 
--> 244         df['mean_bin'] = pd.cut(
    245             df['means'],
    246             np.r_[-np.inf, np.percentile(df['means'], np.arange(10, 105, 5)), np.inf],

/opt/conda/lib/python3.8/site-packages/pandas/core/reshape/tile.py in cut(x, bins, right, labels, retbins, precision, include_lowest, duplicates, ordered)
    271             raise ValueError("bins must increase monotonically.")
    272 
--> 273     fac, bins = _bins_to_cuts(
    274         x,
    275         bins,

/opt/conda/lib/python3.8/site-packages/pandas/core/reshape/tile.py in _bins_to_cuts(x, bins, right, labels, precision, include_lowest, dtype, duplicates, ordered)
    397     if len(unique_bins) < len(bins) and len(bins) != 2:
    398         if duplicates == "raise":
--> 399             raise ValueError(
    400                 f"Bin edges must be unique: {repr(bins)}.\n"
    401                 f"You can drop duplicate edges by setting the 'duplicates' kwarg"

ValueError: Bin edges must be unique: array([          -inf, 1.00000000e-12, 1.00000000e-12, 5.40105948e-04,
       1.00438703e-03, 1.97046941e-03, 3.51440884e-03, 6.46164417e-03,
       1.23855204e-02, 2.44226694e-02, 5.05443168e-02, 9.71072304e-02,
       1.71948684e-01, 2.66962457e-01, 3.84024115e-01, 5.42443170e-01,
       7.61077264e-01, 1.11170306e+00, 1.84908313e+00, 9.83359179e+00,
                  inf]).
You can drop duplicate edges by setting the 'duplicates' kwarg

brianpenghe commented 3 years ago

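I found a solution: filter out genes that are not expressed in any cell before calling sc.pp.highly_variable_genes. The repeated 1.00000000e-12 in the bin edges above appears to come from those genes: they all share the same minimal mean, so several percentiles collapse to one value and pd.cut sees duplicate edges.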
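
A minimal sketch of why this fix works, using only numpy and pandas. The `mean_bin_edges` helper is hypothetical and just mirrors the `np.r_[...]` expression shown in the traceback; the per-gene means are synthetic (200 unexpressed genes pinned at 1e-12 plus 800 expressed genes), not data from this issue.

```python
import numpy as np
import pandas as pd


def mean_bin_edges(means):
    """Percentile-based bin edges, mirroring the expression in the traceback."""
    return np.r_[-np.inf, np.percentile(means, np.arange(10, 105, 5)), np.inf]


rng = np.random.default_rng(0)

# Synthetic per-gene means (assumption): 20% of genes are unexpressed and
# share the same tiny value, the rest have distinct positive means.
means = pd.Series(
    np.concatenate([np.full(200, 1e-12), rng.gamma(0.5, 1.0, size=800) + 1e-3])
)

# With the unexpressed genes included, the 10th and 15th percentiles are both
# 1e-12, so pd.cut is handed duplicate edges and raises ValueError.
try:
    pd.cut(means, mean_bin_edges(means))
    raised = False
except ValueError:
    raised = True

# Dropping the unexpressed genes makes every percentile distinct again,
# so the same binning call succeeds.
expressed = means[means > 1e-12]
bins = pd.cut(expressed, mean_bin_edges(expressed))
```

On an AnnData object such as `C2`, one way to apply the same filter is `sc.pp.filter_genes(C2, min_cells=1)`, run before `sc.pp.highly_variable_genes`. The threshold of one cell is an assumption; a stricter cutoff may suit some datasets better.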