bvaldebenitom / SoloTE

GNU General Public License v3.0

Parameters for generating inputs for SoloTE #44

Closed hgiles5 closed 2 weeks ago

hgiles5 commented 1 month ago

Hi,

Thanks for all your work on SoloTE. I have a question about generating the input BAM file. Could you post an example of the parameters you used with STAR to generate the BAM files before running SoloTE? Also, beyond the three multimapping parameters mentioned in the paper, do you recommend using STARsolo? I'm asking in the context of scRNA-seq or GEX reads from scMultiome, with data generated on the 10X platform.

Thank you!

bvaldebenitom commented 1 month ago

Hi @hgiles5 !

Thanks for the kind words.

Here are the recommended options to use in STAR:

--outSAMattributes NH HI nM AS CR UR CY UY CB UB GX GN sS sQ sM
--winAnchorMultimapNmax 100
--outFilterMultimapNmax 100
--outSAMmultNmax 1

Yes, I recommend using STARsolo. Recent versions of CellRanger assign intronic reads to genes by default, and those reads can often be associated with TEs.

I haven't tested SoloTE with scMultiome data. Are you by any chance re-analyzing public data? If so, you can send me the accession or a link so I can test the pipeline. I will also check whether there are public multiome datasets on the 10X website.

hgiles5 commented 1 month ago

Thanks. Could you provide an example of an entire STAR command you would use? Unfortunately I can't share my dataset, but there is a public scMultiome PBMC dataset from 10X at https://support.10xgenomics.com/single-cell-multiome-atac-gex/datasets/1.0.0/pbmc_granulocyte_sorted_10k

bvaldebenitom commented 1 month ago

Thanks @hgiles5. I will check the scMultiome PBMC data and let you know!

Regarding the STAR command, it would change depending on your input. Would you be doing a realignment from BAM or FastQ files?

hgiles5 commented 1 month ago

An alignment from the FASTQ files.

bvaldebenitom commented 1 month ago

This is how the entire command would look:

STAR \
  --outSAMattributes NH HI nM AS CR UR CY UY CB UB GX GN sS sQ sM \
  --outSAMtype BAM SortedByCoordinate \
  --outFilterMultimapNmax 100 \
  --winAnchorMultimapNmax 100 \
  --outMultimapperOrder Random \
  --outSAMmultNmax 1 \
  --runRNGseed 777 \
  --runThreadN 21 \
  --soloType CB_UMI_Simple \
  --soloCBwhitelist BARCODES_WHITELIST \
  --readFilesIn FASTQ2 FASTQ1 \
  --genomeDir STAR_GENOMEDIR

You need to replace the placeholders BARCODES_WHITELIST, FASTQ2, FASTQ1 and STAR_GENOMEDIR with your own files and paths. Hope this helps!
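For readers who want to script this, the command above could be wrapped roughly as follows. This is only a sketch, not the author's script. The variable names (WHITELIST, FASTQ_CDNA, FASTQ_BARCODE, GENOME_DIR, THREADS, RUN_STAR) are hypothetical, and their defaults are the placeholders from the command, not real paths. The comments describing each flag are a reading of the STAR manual, and treating FASTQ2 as the cDNA read and FASTQ1 as the barcode/UMI read is an assumption based on the order STARsolo expects in --readFilesIn.

```shell
#!/bin/sh
set -eu

# Hypothetical variables; the defaults are the placeholders from the thread.
WHITELIST="${WHITELIST:-BARCODES_WHITELIST}"
FASTQ_CDNA="${FASTQ_CDNA:-FASTQ2}"        # assumed: read with the cDNA sequence
FASTQ_BARCODE="${FASTQ_BARCODE:-FASTQ1}"  # assumed: read with cell barcode + UMI
GENOME_DIR="${GENOME_DIR:-STAR_GENOMEDIR}"
THREADS="${THREADS:-21}"

# Build the argument list step by step so each group can carry a comment.
set -- STAR
# SAM tags to write, including the barcode/UMI tags (CR, UR, CB, UB).
set -- "$@" --outSAMattributes NH HI nM AS CR UR CY UY CB UB GX GN sS sQ sM
set -- "$@" --outSAMtype BAM SortedByCoordinate
# Keep reads that map to up to 100 loci.
set -- "$@" --outFilterMultimapNmax 100
set -- "$@" --winAnchorMultimapNmax 100
# Report a single alignment per multimapping read, in random order,
# with a fixed seed so repeated runs use the same seed.
set -- "$@" --outMultimapperOrder Random
set -- "$@" --outSAMmultNmax 1
set -- "$@" --runRNGseed 777
set -- "$@" --runThreadN "$THREADS"
# STARsolo settings.
set -- "$@" --soloType CB_UMI_Simple
set -- "$@" --soloCBwhitelist "$WHITELIST"
set -- "$@" --readFilesIn "$FASTQ_CDNA" "$FASTQ_BARCODE"
set -- "$@" --genomeDir "$GENOME_DIR"

# Print the command for inspection; run it only when RUN_STAR=1 is set.
printf '%s ' "$@"
echo
if [ "${RUN_STAR:-0}" = "1" ]; then
  "$@"
fi
```

With no variables set, the script only prints the assembled command, which makes it easy to check against the one posted above. Setting RUN_STAR=1 together with real values for the other variables would run STAR.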

github-actions[bot] commented 3 weeks ago

This issue is stale because it has been open for 10 days with no activity.

github-actions[bot] commented 2 weeks ago

This issue was closed because it has been inactive for 14 days since being marked as stale.