GWW / scsnv

scSNV Mapping tool for 10X Single Cell Data
MIT License

question about the output file #10

Closed earn97 closed 2 years ago

earn97 commented 2 years ago

Hi. I have some questions about the output files.

I got the final output files (snv_barcode_bases.txt.gz, snv_base_counts.txt.gz, snv_edges.txt.gz, snv_map.txt.gz, snv_reads.txt.gz). I need the ref and alt alleles of each point mutation, so I want to use the snv_edges.txt.gz file.


snv_idx_1   snv_idx_2   chrom   pos_1   pos_2   ref_1   alt_1   ref_2   alt_2   strand  RR  AA  RA  AR
3   4   chr1    944295  944306  G   A   T   C   -   0   93  1   0
3   5   chr1    944295  944311  G   A   G   A   -   0   0   93  0
4   5   chr1    944306  944311  T   C   G   A   -   1   1   117 0
6   7   chr1    958250  958338  A   G   G   A   -   0   24  0   0
6   8   chr1    958250  958447  A   G   C   T   -   0   0   7   0
7   8   chr1    958338  958447  G   A   C   T   -   0   1   36  0
9   10  chr1    965591  965642  T   G   T   C   +   0   6   0   0
13  14  chr1    1014273 1014529 A   G   G   A   +   0   0   17  0
13  15  chr1    1014273 1014534 A   G   G   A   +   0   1   11  0
14  15  chr1    1014529 1014534 G   A   G   A   +   85  0   1   2
21  22  chr1    1280259 1280336 C   T   G   A   +   0   6   0   0
24  25  chr1    1318755 1319055 G   A   A   G   -   0   2   0   0

But I don't understand all of the columns, especially why pos_1 and pos_2 are together in the same row and why the same positions appear repeatedly. I would also like to know the meaning of the RR, AA, RA, and AR columns.

Please explain the output file's columns in detail.

GWW commented 2 years ago

Hi.

This file is for SNV co-expression analysis. It reports how many collapsed reads support two different SNVs at once, so each row describes a pair of SNVs. The RR, AA, RA, AR columns are the molecule counts for reads supporting the RR (ref/ref), AA (alt/alt), etc. genotypes.

For your analyses, you would want to use the annotate command instead. It gives you files with all of the SNV calling information you would like.
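
For anyone who still wants to look inside snv_edges.txt.gz, here is a minimal Python sketch of how the pair rows could be read. It is not part of scSNV. It assumes the file is whitespace-delimited with exactly the header shown in the question, and it assumes that in the two-letter count columns the first letter refers to the first SNV of the pair and the second letter to the second SNV (the thread only spells out RR and AA). The function names `read_snv_edges`, `load_edges`, and `unique_snvs` are made up for this illustration.

```python
import gzip
import io

# Count columns from the header shown in the question.
COUNT_COLS = ("RR", "AA", "RA", "AR")
INT_COLS = ("snv_idx_1", "snv_idx_2", "pos_1", "pos_2") + COUNT_COLS


def read_snv_edges(handle):
    """Yield one dict per SNV pair from an open text handle.

    Assumes a whitespace-delimited table whose first line is the header.
    """
    header = handle.readline().split()
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        row = dict(zip(header, fields))
        for col in INT_COLS:
            row[col] = int(row[col])
        yield row


def load_edges(path):
    """Read a gzipped edges file, e.g. load_edges("snv_edges.txt.gz")."""
    with gzip.open(path, "rt") as handle:
        return list(read_snv_edges(handle))


def unique_snvs(rows):
    """Collect each SNV once, keyed by its index.

    An SNV shows up in several rows because every row is a pair.
    """
    snvs = {}
    for r in rows:
        snvs[r["snv_idx_1"]] = (r["chrom"], r["pos_1"], r["ref_1"], r["alt_1"])
        snvs[r["snv_idx_2"]] = (r["chrom"], r["pos_2"], r["ref_2"], r["alt_2"])
    return snvs


# First three data rows from the question, used as an in-memory sample.
sample = """snv_idx_1 snv_idx_2 chrom pos_1 pos_2 ref_1 alt_1 ref_2 alt_2 strand RR AA RA AR
3 4 chr1 944295 944306 G A T C - 0 93 1 0
3 5 chr1 944295 944311 G A G A - 0 0 93 0
4 5 chr1 944306 944311 T C G A - 1 1 117 0
"""

rows = list(read_snv_edges(io.StringIO(sample)))
snvs = unique_snvs(rows)

for idx in sorted(snvs):
    print(idx, *snvs[idx])
```

In the sample, SNVs 3, 4, and 5 each appear in two rows, once for every pair they belong to, which is why the positions repeat. This only recovers the ref and alt alleles of SNVs that happen to be in at least one pair, so the output of the annotate command is still the better source for per-SNV information.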