ShouWenWang-Lab / snakemake_DARLIN

Enable max_cells and max_molecules functionality in snakemake_DARLIN pipelines? #14

Open jzussman opened 3 weeks ago

jzussman commented 3 weeks ago

Hi ShouWen, is there a simple way to add max_cells or max_molecules to the config files used for the MATLAB- or Python-based DARLIN pipelines? Tuning the read thresholds is useful, but grounding the UMI and/or CB denoising in the actual number of cells or barcode sequences available would add confidence. For single-cell experiments, would a custom config that uses only the cell barcodes associated with high-quality cells, rather than all possible barcodes, achieve the same thing? Thank you very much!

ShouWenWang commented 3 weeks ago

We do this kind of quality control in downstream analysis, on the final cell-allele table, rather than in the config.

―― Shou-Wen Wang, PhD Principal Investigator School of Life Sciences | School of Sciences Westlake University Shilongshan ST #18, Xihu, Hangzhou, Zhejiang https://www.shouwenwang-lab.com/
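
To illustrate the kind of downstream filtering described above, here is a minimal sketch in Python. It is not taken from the snakemake_DARLIN code base: the row layout (`cell_bc`, `allele`, `umi_count`), the function name `filter_cell_allele_table`, and the `min_umi` threshold are all assumptions made for the example, and the real cell-allele table may use different column names.

```python
from collections import namedtuple

# Hypothetical row layout; the actual cell-allele table may name columns differently.
Row = namedtuple("Row", ["cell_bc", "allele", "umi_count"])


def filter_cell_allele_table(rows, good_cells, min_umi=1):
    """Keep rows whose cell barcode is in `good_cells` and whose
    UMI support is at least `min_umi`."""
    good = set(good_cells)
    return [r for r in rows if r.cell_bc in good and r.umi_count >= min_umi]


rows = [
    Row("AAAC", "alleleA", 5),
    Row("AAAG", "alleleB", 1),   # high-quality cell, but weak UMI support
    Row("TTTG", "alleleA", 7),   # barcode not among the high-quality cells
]

# `good_cells` would typically come from the transcriptome QC of the same cells.
kept = filter_cell_allele_table(rows, good_cells={"AAAC", "AAAG"}, min_umi=2)
print(kept)
```

Filtering after the pipeline has run keeps the config unchanged and lets the whitelist of high-quality cells be revised without reprocessing the reads.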


jzussman commented 3 weeks ago

For single-cell analyses, that makes sense. What about adding max_molecules functionality for bulk experiments to the snakemake_DARLIN pipelines, in either the MATLAB or Python version?
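
Until such an option exists, a cap of this kind could be approximated after the pipeline has run. The sketch below assumes that max_molecules means "keep only the N best-supported UMIs, ranked by read count"; that reading, the function name `cap_molecules`, and the input format (a mapping from UMI sequence to read count) are assumptions for illustration, not the pipeline's own behavior.

```python
def cap_molecules(umi_reads, max_molecules):
    """Return the `max_molecules` UMIs with the highest read counts.

    `umi_reads` maps a UMI sequence to its read count (hypothetical format).
    Ties are broken by UMI sequence so the result is deterministic.
    """
    if max_molecules < 0:
        raise ValueError("max_molecules must be non-negative")
    # Sort by descending read count, then by UMI sequence.
    ranked = sorted(umi_reads.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:max_molecules])


umi_reads = {"ACGT": 120, "TTGA": 3, "GGCA": 45, "CATG": 1}
top = cap_molecules(umi_reads, max_molecules=2)
print(top)
```

Setting the cap from an independent estimate of the number of input molecules would serve the same purpose as a read threshold, but anchored to the expected sample complexity rather than to sequencing depth.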