cortes-ciriano-lab / SComatic

A tool for detecting somatic variants in single cell data

chromosome name difference, e.g., chr1 vs 1 #27

Closed robinycfang closed 1 year ago

robinycfang commented 1 year ago

Hi,

My scRNA-seq data were aligned to a FASTA reference whose chromosome names lack the "chr" prefix, so the reads look like this:

D00353:300:HNLF3BCX2:2:2116:12760:71863 256 1 11201 0 98M * 0 0 GCTTGCTCACGGTGCTGTGCCAGGGCGCCCCCTGCTGGCGACTAGGGCAACTGCAGGGCTCTCTTGCTTAGACTGGTGGCCAGCGCCCCCTGCTGGCG DCCDCIIIIHIHI@HHEHEEHHFHGHHHCDECHEHGIIDDFIHIIHHGIHIIHHIIIHHHHGHIFIIIIIIHIHHEHGHHHHHF?EHHDHDHHHEFH? NH:i:5 HI:i:2 AS:i:94 nM:i:1 RE:A:I li:i:0 BC:Z:CTGCAAGC QT:Z:DDDDDIII CR:Z:TATCTCAAGATGCGAC CY:Z:DDDDDGHIIIIIIIHI CB:Z:TATCTCAAGATGCGAC-1 UR:Z:ATTCTACATG UY:Z:IGIIIIIIHI UB:Z:ATTCTACATG RG:Z:T_100070:0:1:HNLF3BCX2:2

However, I am worried that because of this difference, the PoN and editing site files, which have "chr" in their chromosome names, might not actually work as filters, since the CHROM values would not match. Does your program convert between the different chromosome naming conventions? If not, it would be good to add such a feature, as realigning the reads to another version of the FASTA would be time-consuming.

Thanks.

Francesc-Muyas commented 1 year ago

Dear user, thanks for bringing up this topic. For now, the tool does not detect whether the chromosome names have the "chr" prefix, so you have to use the Hg38 reference with prefixed chromosome names. We might include this update in the future.

In any case, we are preparing (and will soon add) versions of the PoN and editing files without the "chr" prefix, so that users can use whichever they prefer.

Cheers, Fran

Francesc-Muyas commented 1 year ago

These files have been added to the PoNs folder.

Thanks, Fran
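
For readers who have their own filter files in only one naming convention, the renaming discussed in this thread can be sketched in a few lines of Python. This is a minimal illustration and not part of SComatic: the function names (`strip_chr_prefix`, `add_chr_prefix`, `convert_line`) are hypothetical, and it assumes a tab-separated file with the chromosome name in the first column and header lines starting with `#`. It only handles the prefix and the chrM/MT pair; unplaced or alternate contigs are often named differently between conventions and would need an explicit mapping.

```python
def strip_chr_prefix(chrom: str) -> str:
    """Map a 'chr'-prefixed name (chr1, chrX, chrM) to the unprefixed style (1, X, MT)."""
    if chrom == "chrM":
        return "MT"
    if chrom.startswith("chr"):
        return chrom[3:]
    return chrom  # already unprefixed


def add_chr_prefix(chrom: str) -> str:
    """Map an unprefixed name (1, X, MT) to the 'chr'-prefixed style (chr1, chrX, chrM)."""
    if chrom == "MT":
        return "chrM"
    if chrom.startswith("chr"):
        return chrom  # already prefixed
    return "chr" + chrom


def convert_line(line: str, rename) -> str:
    """Rename the first (chromosome) column of one tab-separated line.

    Assumption: header lines start with '#' and are passed through unchanged,
    as are blank lines.
    """
    if line.startswith("#") or not line.strip():
        return line
    body = line.rstrip("\r\n")
    newline = line[len(body):]  # preserve the original line ending
    fields = body.split("\t")
    fields[0] = rename(fields[0])
    return "\t".join(fields) + newline
```

To convert a whole file, apply `convert_line(line, strip_chr_prefix)` to each line read from the input and write the result to a new file. It is worth checking afterwards that the chromosome names in the output match those in the BAM header exactly.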