Closed gilgolan73 closed 2 months ago
You probably ran out of RAM in the "Get TSS profile for fragments" step. Try reserving more RAM (several tens of GB) for your job and run pycistopic QC again:
pycistopic qc \
--fragments data/fragments.tsv.gz \
--regions outs/consensus_peak_calling/consensus_regions.bed \
--tss outs/qc/tss.bed \
--output outs/qc/10x_multiome_brain
After the line:
2024-08-20 12:05:59,830 207s - INFO - pycisTopic.qc:compute_qc_stats - Get TSS profile for fragments.
you should see the following output:
2024-07-16 09:55:38,677 41s - INFO - pycisTopic.qc:compute_qc_stats - Add TSS enrichment to fragments statistics per cell barcode.
2024-07-16 09:55:38,681 41s - INFO - pycisTopic.qc:compute_qc_stats - Calculate KDE for log10 unique fragments in peaks vs TSS enrichment.
2024-07-16 09:55:41,186 44s - INFO - pycisTopic.qc:compute_qc_stats - Calculate KDE for log10 unique fragments in peaks vs fractions of fragments in peaks.
2024-07-16 09:55:43,130 45s - INFO - pycisTopic.qc:compute_qc_stats - Calculate KDE for log10 unique fragments in peaks vs duplication ratio.
2024-07-16 09:55:44,046 46s - INFO - pycisTopic.qc:compute_qc_stats - Add probability density function (PDF) values to fragments statistics per cell barcode.
2024-07-16 09:55:44,046 46s - INFO - pycisTopic.cli.subcommand.qc:qc - Writing "outs/qc/10x_multiome_brain.fragments_stats_per_cb.parquet".
2024-07-16 09:55:44,096 46s - INFO - pycisTopic.cli.subcommand.qc:qc - Writing "outs/qc/10x_multiome_brain4.fragments_insert_size_dist.parquet".
2024-07-16 09:55:44,097 46s - INFO - pycisTopic.cli.subcommand.qc:qc - Writing "outs/qc/10x_multiome_brain.tss_norm_matrix_sample.parquet".
2024-07-16 09:55:44,098 46s - INFO - pycisTopic.cli.subcommand.qc:qc - Writing "outs/qc/10x_multiome_brain.tss_norm_matrix_per_cb.parquet".
2024-07-16 09:55:51,346 54s - INFO - pycisTopic.cli.subcommand.qc:qc - Calculating Otsu thresholds.
2024-07-16 09:55:51,356 54s - INFO - pycisTopic.cli.subcommand.qc:qc - Writing "outs/qc/10x_multiome_brain.fragments_stats_per_cb_for_otsu_thresholds.parquet".
2024-07-16 09:55:51,373 54s - INFO - pycisTopic.cli.subcommand.qc:qc - Writing "outs/qc/10x_multiome_brain.fragments_stats_per_cb_for_otsu_thresholds.tsv".
2024-07-16 09:55:51,375 54s - INFO - pycisTopic.cli.subcommand.qc:qc - Writing "outs/qc/10x_multiome_brain.cbs_for_otsu_thresholds.tsv".
2024-07-16 09:55:51,376 54s - INFO - pycisTopic.cli.subcommand.qc:qc - Writing "outs/qc/10x_multiome_brain.otsu_thresholds.tsv".
2024-07-16 09:55:51,376 54s - INFO - pycisTopic.cli.subcommand.qc:qc - pycisTopic QC finished.
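If the log stops early, a quick way to see which QC files were never written is to check for them directly. The sketch below is only an illustration: the suffix list is copied from the "Writing ..." lines in the log above, and the helper name missing_qc_outputs is hypothetical, not part of pycisTopic.

```python
from pathlib import Path

# File suffixes taken from the "Writing ..." lines of the expected log output.
EXPECTED_SUFFIXES = [
    ".fragments_stats_per_cb.parquet",
    ".fragments_insert_size_dist.parquet",
    ".tss_norm_matrix_sample.parquet",
    ".tss_norm_matrix_per_cb.parquet",
    ".fragments_stats_per_cb_for_otsu_thresholds.parquet",
    ".fragments_stats_per_cb_for_otsu_thresholds.tsv",
    ".cbs_for_otsu_thresholds.tsv",
    ".otsu_thresholds.tsv",
]


def missing_qc_outputs(prefix):
    """Return the expected QC output paths that do not exist for this --output prefix."""
    candidates = [f"{prefix}{suffix}" for suffix in EXPECTED_SUFFIXES]
    return [path for path in candidates if not Path(path).is_file()]


# Example with the prefix used in this thread:
# print(missing_qc_outputs("outs/qc/10x_multiome_brain"))
```

An empty list means every file named in the log is present. If all of them are missing, the run most likely died before the first "Writing" step.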
Hi, I am currently using a Linux virtual machine with ~24 GB of RAM allocated (the maximum I can allocate). I understand that I need to buy more RAM for my computer, but how much will be enough to complete the entire scenicplus pipeline?
Thank you, Gil
The amount of memory needed depends on your dataset. With a bit of luck, less RAM will be needed in the future: Polars is currently working on a streaming engine that should be able to handle larger-than-memory dataframe operations.
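Since the requirement depends on the dataset, one rough way to estimate it is to measure the peak memory of a run yourself, for example on a machine or job where it does finish. The following is a minimal sketch using only the Python standard library on Linux or macOS. The helper name peak_memory_gb is hypothetical, and the value it reports is the peak of the largest child process, not a guarantee for other datasets.

```python
import resource
import subprocess
import sys


def peak_memory_gb(command):
    """Run `command` and return (exit code, peak resident memory in GB).

    The peak is the largest resident set size among the child processes
    this Python process has waited for so far.
    """
    completed = subprocess.run(command, check=False)
    max_rss = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    bytes_used = max_rss if sys.platform == "darwin" else max_rss * 1024
    return completed.returncode, bytes_used / 1024**3


# Example with the command from this thread:
# code, gb = peak_memory_gb([
#     "pycistopic", "qc",
#     "--fragments", "data/fragments.tsv.gz",
#     "--regions", "outs/consensus_peak_calling/consensus_regions.bed",
#     "--tss", "outs/qc/tss.bed",
#     "--output", "outs/qc/10x_multiome_brain",
# ])
# print(f"exit code {code}, peak memory {gb:.1f} GB")
```

If the job is killed for running out of memory, the reported peak only shows how far it got before the kill, so treat it as a lower bound.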
You can try increasing the min_fragments_per_cb parameter from 10 to a higher value to reduce memory usage. (Try not to increase it too much, as otherwise the generated QC plots will be less interesting: values for cell barcodes that appear fewer times than that threshold will not be calculated or plotted.)
pycistopic qc ... --min_fragments_per_cb 20
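To get a feeling for how much a higher threshold would cut down the number of barcodes, you can count fragments per barcode in the fragments file before rerunning QC. This is only a sketch, not pycisTopic's own implementation. It assumes the usual 10x fragments layout with the cell barcode in the fourth tab-separated column, counts one fragment per line, and treats the threshold as inclusive. Both function names are hypothetical.

```python
import gzip
from collections import Counter


def fragments_per_barcode(fragments_path):
    """Stream a gzipped fragments TSV and count lines per cell barcode."""
    counts = Counter()
    with gzip.open(fragments_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip header/comment lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 4:
                continue  # skip malformed lines
            counts[fields[3]] += 1  # assumption: barcode is column 4
    return counts


def barcodes_kept(counts, thresholds=(10, 20, 50)):
    """For each threshold, count barcodes with at least that many fragments."""
    return {t: sum(1 for n in counts.values() if n >= t) for t in thresholds}


# Example with the fragments file from this thread:
# counts = fragments_per_barcode("data/fragments.tsv.gz")
# print(barcodes_kept(counts))
```

The file is read line by line, so only the per-barcode counts are held in memory, which should fit comfortably even on a small machine.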
Hi, thank you for the help. I bought more RAM (now have 128 GB) and the QC seems to be working well.
Describe the bug I am trying to run the pycistopic tutorial (using 10x brain multiome data) on a CentOS 9 virtual machine. In the QC step, the relevant QC files are not created, and the downstream commands don't work (error messages say that files are missing). Please help me fix this issue. Thank you, Gil
Expected behavior Sample level statistic graphs should appear.
Additional context the output of the previous commands in the QC section: