kharchenkolab / Baysor

Bayesian Segmentation of Spatial Transcriptomics Data
https://kharchenkolab.github.io/Baysor/
MIT License

LoadError: pca must have at least 3 components #86

Open ankitbioinfo opened 1 year ago

ankitbioinfo commented 1 year ago

Hello developers,

Many thanks for creating Baysor for the spatial transcriptomics field. I am using Baysor for 2D cell segmentation of smFISH data with only a handful of genes. I get the following error when I run Baysor:

 ~/.julia/bin/baysor run -m 5 --n-clusters=1 -s 10 --scale-std=50%   --prior-segmentation-confidence=0.5  --save-polygons --count-matrix-format mRNA_coords_raw_counting.csv segment.tif
[12:41:04] Info: Run Rcdb7acc01
[12:41:04] Info: (2023-08-01) Run Baysor v0.6.1
[12:41:04] Info: Loading data...
[12:41:05] Info: Loaded 11427 transcripts
[12:41:10] Info: Loading segmentation mask...
[12:41:10] Warning: Minimum transcript coordinates are < 1: (0, 0). Filling it with 0.
└ Baysor.DataLoading /Users/agrawal/.julia/packages/Baysor/L94P7/src/data_loading/prior_segmentation.jl:30
[12:41:11] Info: Done
[12:41:11] Info: Estimating noise level
[12:41:13] Info: Done

r{Float32, Int64}, SparseArrays.SparseVector{Float32, Int64}, SparseArrays.SparseVector{Float32, Int64}).
This might be caused by recursion over very long tuples or argument lists.
ERROR: LoadError: pca must have at least 3 components
Stacktrace:
  [1] error(s::String)
    @ Base ~/.julia/scratchspaces/cc9f9468-1fbe-11e9-0acf-e9460511877c/sysimg/libbaysor.dylib:-1
  [2] gene_composition_color_embedding(pca::Matrix{Float32}, confidence::Vector{Float64}; normalize::Bool, sample_size::Int64, seed::Int64, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Baysor.Processing ~/.julia/packages/Baysor/L94P7/src/processing/data_processing/neighborhood_composition.jl:124
  [3] gene_composition_color_embedding(pca::Matrix{Float32}, confidence::Vector{Float64})
    @ Baysor.Processing ~/.julia/packages/Baysor/L94P7/src/processing/data_processing/neighborhood_composition.jl:116
  [4] gene_composition_colors(df_spatial::DataFrames.DataFrame, k::Int64; method::Symbol, n_pcs::Int64, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Baysor.Processing ~/.julia/packages/Baysor/L94P7/src/processing/data_processing/neighborhood_composition.jl:165
  [5] gene_composition_colors
    @ ~/.julia/packages/Baysor/L94P7/src/processing/data_processing/neighborhood_composition.jl:160 [inlined]
  [6] run_segmentation(df_spatial::DataFrames.DataFrame, gene_names::Vector{String}, opts::Baysor.Utils.SegmentationOptions; plot_opts::Baysor.Utils.PlottingOptions, min_molecules_per_cell::Int64, estimate_ncvs::Bool, plot::Bool, save_polygons::Bool, run_id::String)
    @ Baysor.Processing ~/.julia/packages/Baysor/L94P7/src/processing/utils/cli_wrappers.jl:72
  [7] run(coordinates::String, prior_segmentation::String; config::Baysor.Utils.RunOptions, x_column::String, y_column::String, z_column::String, gene_column::String, min_molecules_per_cell::Int64, scale::Float64, scale_std::String, n_clusters::Int64, prior_segmentation_confidence::Float64, output::String, plot::Bool, save_polygons::String, no_ncv_estimation::Bool, count_matrix_format::String)
    @ Baysor.CommandLine ~/.julia/scratchspaces/cc9f9468-1fbe-11e9-0acf-e9460511877c/sysimg/libbaysor.dylib:-1
  [8] command_main(ARGS::Vector{String})
    @ Baysor.CommandLine ~/.julia/scratchspaces/cc9f9468-1fbe-11e9-0acf-e9460511877c/sysimg/libbaysor.dylib:-1
  [9] command_main()
    @ Baysor.CommandLine ~/.julia/packages/Comonicon/HDhA6/src/codegen/julia.jl:90
 [10] command_main(; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Baysor ~/.julia/packages/Baysor/L94P7/src/Baysor.jl:41
 [11] command_main()
    @ Baysor ~/.julia/packages/Baysor/L94P7/src/Baysor.jl:41
 [12] top-level scope
    @ ~/.julia/bin/baysor:15
 [13] include_string(mapexpr::typeof(identity), mod::Module, code::String, filename::String)
    @ Base ~/.julia/scratchspaces/cc9f9468-1fbe-11e9-0acf-e9460511877c/sysimg/libbaysor.dylib:-1
 [14] _include(mapexpr::Function, mod::Module, _path::String)
    @ Base ~/.julia/scratchspaces/cc9f9468-1fbe-11e9-0acf-e9460511877c/sysimg/libbaysor.dylib:-1
 [15] include(mod::Module, _path::String)
    @ Base ~/.julia/scratchspaces/cc9f9468-1fbe-11e9-0acf-e9460511877c/sysimg/libbaysor.dylib:-1
 [16] exec_options(opts::Base.JLOptions)
    @ Base ~/.julia/scratchspaces/cc9f9468-1fbe-11e9-0acf-e9460511877c/sysimg/libbaysor.dylib:-1
 [17] _start()
    @ Base ~/.julia/scratchspaces/cc9f9468-1fbe-11e9-0acf-e9460511877c/sysimg/libbaysor.dylib:-1
in expression starting at /Users/agrawal/.julia/bin/baysor:15

What could be causing this error? Thank you.

VPetukhov commented 1 year ago

Hi @ankitbioinfo, could you please provide more details on your setup? How many genes is a "handful"? The error suggests that you actually measured fewer than 4 genes. If that's the case, the current NCV method wouldn't work properly, so --no-ncv-estimation would be a quick fix. If you actually have this use case, with multiple molecules per cell but very few genes, I could modify the method to process such cases correctly.
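
To check up front whether a dataset falls into the "fewer than 4 genes" case described above, one option is to count the distinct genes in the transcript CSV before running Baysor. The following is only a minimal sketch and not part of Baysor: the helper names, the `gene` column name, and the threshold of 4 (taken from the comment above rather than from Baysor's source) are all assumptions.

```python
import csv
import io

# Assumption: threshold taken from the "< 4 genes" remark in this thread,
# not from Baysor's source code.
MIN_GENES_FOR_NCV = 4


def count_distinct_genes(csv_file, gene_column="gene"):
    """Count distinct non-empty values in the gene column of an open CSV file.

    `gene_column` is a hypothetical default; use whatever column holds gene
    names in your own transcript table.
    """
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or gene_column not in reader.fieldnames:
        raise KeyError(f"column {gene_column!r} not found in CSV header")
    # Skip rows where the gene field is missing or empty.
    return len({row[gene_column] for row in reader if row[gene_column]})


def should_skip_ncv(n_genes, min_genes=MIN_GENES_FOR_NCV):
    """Return True when there are too few genes for the NCV step (by assumption)."""
    return n_genes < min_genes


if __name__ == "__main__":
    # Tiny in-memory table standing in for a real transcript CSV.
    sample = io.StringIO("x,y,gene\n1.0,2.0,Actb\n3.0,4.0,Gapdh\n5.0,6.0,Actb\n")
    n = count_distinct_genes(sample)
    print(n, should_skip_ncv(n))
```

If `should_skip_ncv` returns True for your data, adding --no-ncv-estimation to the `baysor run` command is the quick fix suggested above.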