milaboratory / mixcr

MiXCR is an ultimate software platform for analysis of Next-Generation Sequencing (NGS) data for immune profiling.
https://mixcr.com

Preset for long-read RNAseq with cell barcode #1682

Closed by Euchiz 5 months ago

Euchiz commented 5 months ago

Discussed in https://github.com/milaboratory/mixcr/discussions/1681

Originally posted by **Euchiz** May 31, 2024

Hi everyone,

I am currently working with single-cell Oxford Nanopore RNAseq data that contains both a 10x cell barcode and a UMI at the beginning of each read. Unfortunately, there seems to be no existing preset for this combination of single-cell barcoding and long-read sequencing (or could anyone point out which one I should use?). I have reviewed the long-read RNAseq presets on the website, but they do not include options for handling single-cell barcodes.

I attempted to modify the YAML configuration files to create a custom preset that combines the 10x single-cell workflow with long-read sequencing parameters. I have also reviewed a previous discussion on a related topic ([Discussion #1024](https://github.com/milaboratory/mixcr/discussions/1024)), but the guidance there was based on outdated YAML configurations.

Could you please provide assistance or guidance on creating a custom preset for this specific use case? FYI, here is the read structure I am working with:

```
"^(CELL:N{16})(UMI:N{12})(R1:*)"
```

Any help with the correct configuration or a new preset would be greatly appreciated.

By the way, I have already used [wf-single-cell](https://github.com/epi2me-labs/wf-single-cell) to refine my nanopore reads, and the resulting BAM file carries the cell tag CB and the molecule tag UB. Does this mean I can skip the refinement step and use the BAM file from wf-single-cell directly for alignment and assembly?

Thank you for your support and for developing such a powerful tool!
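For readers unfamiliar with the read structure quoted above, here is a minimal Python sketch of what that layout describes when read literally: a 16-base cell barcode, then a 12-base UMI, then the remainder of the read as R1. It is a plain regular-expression analogue and not MiXCR's own pattern parser. The function name `split_read`, the example sequence, and the assumption that `N` means any single base are all illustrative.

```python
import re

# Plain-regex analogue of "^(CELL:N{16})(UMI:N{12})(R1:*)".
# Assumption: N stands for any single base (A, C, G, T or N).
READ_LAYOUT = re.compile(
    r"^(?P<CELL>[ACGTN]{16})(?P<UMI>[ACGTN]{12})(?P<R1>.*)$"
)


def split_read(seq):
    """Return (cell_barcode, umi, insert), or None if the read does not fit."""
    match = READ_LAYOUT.match(seq.upper())
    if match is None:
        return None
    return match.group("CELL"), match.group("UMI"), match.group("R1")


# Hypothetical read: 16 nt barcode + 12 nt UMI + transcript sequence.
example = "ACGTACGTACGTACGT" + "TTTTGGGGCCCC" + "GATTACAGATTACA"
print(split_read(example))
```

A read shorter than 28 bases, or one with unexpected characters in the first 28 positions, yields `None` here. How MiXCR itself treats such reads is not covered by this sketch.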
mizraelson commented 5 months ago

Answered in #1681