Open peipp410 opened 3 weeks ago
snakemake_DARLIN offers two ways to analyze CARLIN data. 1. The MATLAB way, which relies on the MATLAB CARLIN pipeline. To modify the reference sequence, you need to change it here: https://github.com/ShouWenWang-Lab/Custom_CARLIN/tree/main/%40CARLIN_def https://github.com/ShouWenWang-Lab/Custom_CARLIN/tree/main/cfg This change is large and challenging.
2. There is also a Python-based method. This method, however, does not call mutations in the sequence the way the MATLAB pipeline does. It only identifies the DARLIN sequence and uses different sequences to call different clones. We implemented a single-cell version, but adapting it to bulk data should also be easy. https://github.com/ShouWenWang-Lab/snakemake_DARLIN/blob/master/QC/single_cell_DARLIN-10x.ipynb
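To illustrate the idea of calling clones from distinct sequences without calling mutations, here is a minimal sketch for bulk data. It is not the code from snakemake_DARLIN or MosaicLineage: the function name `call_clones_by_sequence` and the `min_reads` threshold are hypothetical, and it assumes the DARLIN sequence has already been extracted from each read.

```python
from collections import Counter


def call_clones_by_sequence(reads, min_reads=2):
    """Treat each distinct sequence as one clone (hypothetical helper).

    `reads` is an iterable of already-extracted DARLIN sequences.
    No alignment or mutation calling is performed.
    """
    # Normalize case and whitespace, then count identical sequences.
    counts = Counter(seq.strip().upper() for seq in reads if seq.strip())
    # Drop sequences with little read support (assumed threshold).
    kept = {seq: n for seq, n in counts.items() if n >= min_reads}
    # Number clones by decreasing read support; ties broken by sequence.
    ranked = sorted(kept.items(), key=lambda item: (-item[1], item[0]))
    return {
        seq: {"clone_id": f"clone_{i}", "reads": n}
        for i, (seq, n) in enumerate(ranked, start=1)
    }


if __name__ == "__main__":
    toy_reads = ["ACGTAC", "ACGTAC", "ACGTAC", "ACGAAC", "ACGAAC", "TTGTAC"]
    for seq, info in call_clones_by_sequence(toy_reads).items():
        print(info["clone_id"], seq, info["reads"])
```

Exact matching means a single sequencing error creates a new candidate sequence, which is why the sketch filters by read support; a real pipeline would likely need a more careful error-handling step.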
-- Shou-Wen Wang, Ph.D. Principal Investigator School of Life Sciences | School of Science Westlake University Shilongshan ST #18, Xihu, Hangzhou, Zhejiang https://www.shouwenwang-lab.com/
From: Jiazheng Pei. Date: Friday, June 14, 2024 at 10:45 AM. To: ShouWenWang-Lab/snakemake_DARLIN. Subject: [ShouWenWang-Lab/snakemake_DARLIN] Does the python pipeline include sequence alignment and allele calling functions? (Issue #8)
Hi! Thanks for providing a Python-based version of the CARLIN pipeline. We want to modify the arguments so we can run it on our own sequencing data and reference, which come from a different experimental protocol. However, I can't locate the function for sequence alignment. It seems that MosaicLineage directly takes the MATLAB object from the original CARLIN pipeline as input? Thanks!
Thank you! I'll try that.
Hello, I encountered another problem. Is this pipeline suitable for large files? I have paired-end sequencing files, each about 2 GB. Even when running the script on a 64-core server, the following error still occurs. 2024-06-28.18.08.17.png (view on web) https://github.com/ShouWenWang-Lab/snakemake_DARLIN/assets/59289157/a3845fb0-901c-4106-ad71-50734ccb898d
I have not encountered this problem before, but the preprocessing can be resource-intensive. You could split your data into several parts, process each part separately, and merge the final results.
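The split-process-merge suggestion above could be sketched as follows. This is an assumption-laden illustration, not part of snakemake_DARLIN: the helper names `split_paired_fastq` and `merge_counts`, the chunk file naming, and the idea that each chunk yields a per-sequence read-count table are all hypothetical. It assumes standard 4-line FASTQ records and that R1 and R2 list the reads in the same order.

```python
import gzip
import itertools
from collections import Counter
from pathlib import Path


def _open_text(path):
    """Open a plain or gzip-compressed FASTQ file for reading as text."""
    path = str(path)
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


def split_paired_fastq(r1_path, r2_path, out_dir, reads_per_chunk=1_000_000):
    """Split R1/R2 in lockstep so mates stay in the same chunk (hypothetical)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    lines_per_chunk = 4 * reads_per_chunk  # 4 lines per FASTQ record
    chunks = []
    with _open_text(r1_path) as r1, _open_text(r2_path) as r2:
        for index in itertools.count():
            block1 = list(itertools.islice(r1, lines_per_chunk))
            block2 = list(itertools.islice(r2, lines_per_chunk))
            if not block1 and not block2:
                break
            if len(block1) != len(block2):
                raise ValueError("R1 and R2 contain different numbers of lines")
            out1 = out_dir / f"chunk{index:04d}_R1.fastq"
            out2 = out_dir / f"chunk{index:04d}_R2.fastq"
            out1.write_text("".join(block1))
            out2.write_text("".join(block2))
            chunks.append((out1, out2))
    return chunks


def merge_counts(per_chunk_counts):
    """Sum per-sequence read counts from separately processed chunks."""
    total = Counter()
    for counts in per_chunk_counts:
        total.update(counts)
    return total
```

Each `(R1, R2)` pair returned by `split_paired_fastq` can then be run through the pipeline on its own. Summing counts is only valid if the per-chunk output really is additive (for example, read counts per sequence); any filtering by read support should be applied after merging, not per chunk.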
OK! Thank you!
Would it be possible for you to provide a bulk version of the python method for allele calling? Alternatively, if you don't have the bandwidth at the moment, could you instruct me on how I could most easily adapt the existing python pipeline to bulk sequence data?
In addition, would adapting the existing python-based single-cell pipeline to the newer 10xV4 chemistry be as simple as adding in the 10xV4 barcode list to /reference? What else would be required to complete this configuration? I have successfully generated this config using the MATLAB CARLIN pipeline version, but am not sure how I would do so in the python version.