Katsevich-Lab / sceptre

An R package for single-cell CRISPR screen data analysis emphasizing statistical rigor, massive scalability, and ease of use.
https://katsevich-lab.github.io/sceptre/
GNU General Public License v3.0
26 stars 8 forks source link

Error in eigen(Zt_wZ, symmetric = TRUE) #135

Open Yuanca9 opened 4 months ago

Yuanca9 commented 4 months ago

Hello,

I encountered the following issue while running the run_calibration_check() function:

sceptre_object <- run_calibration_check(
  sceptre_object = sceptre_object,
  parallel = FALSE
)
Note: If you are on a Mac laptop or desktop, consider setting `parallel = TRUE` to improve speed. Otherwise, keep `parallel = FALSE`.

Constructing negative control pairs. ✓
Note: Unable to generate the number of negative control pairs (991975) requested. Generating as many negative control pairs (11761) as possible.
Generating permutation resamples. ✓
Analyzing pairs containing response ENSMUSG00000053644 (1 of 11761)
Error in eigen(Zt_wZ, symmetric = TRUE) : 'x'里有无穷值或遗漏值。

Sorry about the Chinese characters in the output. It seems that some infinite values or missing values were generated while processing my discovery pairs. However, my data has already passed the QC steps and generated valid discovery pairs. Does anyone have any ideas on how to resolve this issue?

Here's my sceptre object:

An object of class sceptre_object.

Attributes of the data:
    • 67836 cells (6242 after cellwise QC)
    • 32287 responses
    • Low multiplicity-of-infection 
    • 268 targeting gRNAs (distributed across 139 targets) 
    • 1 non-targeting gRNAs 
    • 7 covariates (cell_type, grna_n_nonzero, grna_n_umis, n_egfp, response_n_nonzero, response_n_umis, response_p_mito)

Analysis status:
    ✓ import_data()
    ✓ set_analysis_parameters()
    ✓ assign_grnas()
    ✓ run_qc()
    ✗ run_calibration_check()
    ✗ run_power_check()
    ✗ run_discovery_analysis()

Analysis parameters: 
    • Discovery pairs: data frame with 4487754 pairs (991975 after pairwise QC)
    • Positive control pairs: data frame with 139 pairs (79 after pairwise QC)
    • Sidedness of test: both
    • Control group: complement set
    • Resampling mechanism: permutations
    • gRNA integration strategy: singleton
    • Resampling approximation: skew normal
    • Multiple testing adjustment: BH at level 0.1
    • N nonzero treatment cells threshold: 7
    • N nonzero control cells threshold: 7
    • Formula object: log(response_n_nonzero) + log(response_n_umis) + response_p_mito + cell_type

gRNA-to-cell assignment information:
    • Assignment method: mixture
    • Mean N cells per gRNA: 26.83
    • Mean N gRNAs per cell (MOI): 0.11 
    • gRNA assignment formula object: log(response_n_nonzero) + log(response_n_umis) + log(n_egfp + 1) + response_p_mito + cell_type + log(grna_n_nonzero + 1) + log(grna_n_umis + 1)
timothy-barry commented 4 months ago

Hi,

Thanks for the report. I don't think I've seen this before. Something is going wrong in the differential expression testing step. Might it be possible for you to send me a subset of your data so that I can reproduce this issue on my machine? If so, I'll provide instructions for how to send the data.

All the best, Tim

Yuanca9 commented 4 months ago

Hi Tim,

Thank you for your response.

Yes I can provide a subset of my data. Please let me know how to send it to you.

Best regards, Tsai

timothy-barry commented 4 months ago

Hi Tsai,

Please send me an R list (in .rds format) containing response_matrix, grna_matrix, grna_target_data_frame, moi, extra_covariates, and response_names. My email is tbarry@hsph.harvard.edu.

Tim

timothy-barry commented 3 months ago

Hi Tsai (continuing our conversation here),

Apologies for the delay. I had a chance to take a look into the bug. Thanks for reporting this.

For some reason, the regression of gene expression onto the covariates produced a fitted value of "NA" for cell type. When I removed cell type from the formula object, the code seemed to run correctly. Here is the updated analysis code:

lowmoi_example_data <- readRDS("~/Desktop/subset.rds")
sceptre_object <- import_data(
  response_matrix = lowmoi_example_data$response_matrix,
  grna_matrix = lowmoi_example_data$grna_matrix,
  extra_covariates = lowmoi_example_data$extra_covariates,
  grna_target_data_frame = lowmoi_example_data$grna_target_data_frame,
  moi = "low"
)
print(sceptre_object)

positive_control_pairs <- construct_positive_control_pairs(sceptre_object)
discovery_pairs <- construct_trans_pairs(
  sceptre_object = sceptre_object,
  positive_control_pairs = positive_control_pairs,
  pairs_to_exclude = "pc_pairs"
)

sceptre_object <- set_analysis_parameters(
  sceptre_object = sceptre_object,
  discovery_pairs = discovery_pairs,
  positive_control_pairs = positive_control_pairs,
  formula_object = formula(~ grna_n_nonzero + grna_n_umis + n_egfp + response_n_nonzero + response_n_umis)
)
print(sceptre_object)

# 3. assign grnas
plot_grna_count_distributions(sceptre_object)
sceptre_object <- sceptre_object |> assign_grnas(method = "mixture", parallel = TRUE)
plot(sceptre_object)
print(sceptre_object)
grna_assignment_matrix <- get_grna_assignments(sceptre_object)

# 4. run qc
plot_covariates(sceptre_object, p_mito_threshold = 0.075)
sceptre_object <- sceptre_object |> run_qc(p_mito_threshold = 0.075)
plot(sceptre_object)
print(sceptre_object)

# 6. run power check
sceptre_object <- run_power_check(sceptre_object)
plot(sceptre_object)

# etc

Note that you cannot run a calibration check in low MOI when there is only a single NT gRNA.

Finally, I pushed a small update to the package to check for the presence of NAs in the vector of fitted regression coefficients. Therefore, you may want to download and install the latest version of the package to obtain more informative error messages.