xjtu-omics / msisensor-pro

Microsatellite Instability (MSI) detection using high-throughput sequencing data.

discriminative microsatellite site (DMS) from paper #32

Open syan1 opened 2 years ago

syan1 commented 2 years ago

This module scans the reference genome to get microsatellite information. You need to input (-d) a reference file (.fa or .fasta), and you will get a microsatellite file (-o) for the following analysis. If you use GRCh38.d1.vd1, you can download the file from our GitHub directly.

@PengJia6 Do you have this file somewhere in the project? I cannot seem to locate it. I assume the file referred to in the wiki is the ~7700 sites DMS from the paper? Thank you.

PengJia6 commented 2 years ago

You can download from the supplementary information of the paper.

syan1 commented 2 years ago

Thank you, but I cannot build a microsatellite file (scan output) from that table without more information.

What do the "_binary" columns indicate, and how should I replicate them from the sites in your supplementary table?

Thank you so much for your help!

PengJia6 commented 2 years ago

Hi, it is the binary encoding of the sequence.

Here, we encode A, C, G, and T as the binary numbers 00, 01, 10, and 11, respectively. For example, the sequence GT is encoded as 1011 (binary), which is 11 in decimal.
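
For anyone trying to reproduce the "_binary" values from the sequences in the supplementary table, here is a minimal Python sketch of the encoding described above. The function name `encode_binary` is hypothetical and not part of msisensor-pro. How the tool treats characters other than A, C, G, and T is not stated in this thread, so the sketch simply raises an error for them.

```python
# 2-bit code per base, as described above: A=00, C=01, G=10, T=11.
BASE_CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def encode_binary(seq: str) -> int:
    """Return the decimal value of a DNA sequence under the 2-bit encoding.

    Hypothetical helper for illustration; not taken from msisensor-pro.
    """
    value = 0
    for base in seq.upper():
        if base not in BASE_CODE:
            # Assumption: bases such as N are rejected, since the thread
            # does not say how they should be encoded.
            raise ValueError(f"unsupported base: {base!r}")
        # Shift the bits accumulated so far left by two, then append this base.
        value = (value << 2) | BASE_CODE[base]
    return value


print(encode_binary("GT"))  # 11, i.e. binary 1011
```

The leftmost base ends up in the most significant bits, which matches the GT example (10 followed by 11).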

mital14 commented 1 year ago

How can we download the MSI reference list file?