How do I... - Githubissues

Hi,

I do have a subcommand for scsnvmisc that converts the annotated pileup h5 file to reference and alternative counts in Matrix Market format, plus a vcf file. I don't use R very much, but I am sure there are libraries that can parse the vcf and Matrix Market files.

The pileup annotate tool also writes an annotated text file with information about each SNV (pileup_passed_snvs.txt.gz). The conversion subcommand does require a tab-separated chromosome lengths text file for the vcf header. You can generate this file from a samtools faidx index:

samtools faidx genome.fa
cut -f 1,2,3 genome.fa.fai > chrom_lengths.txt

For example, this will write all sites that do not overlap annotated RNA edits:

scsnvmisc snv2vcfmtx -r chrom_lengths.txt -f genome.fa -o output_folder -e -c pileup_annotated.h5

This will produce:

output_folder/barcodes.txt # list of barcodes
output_folder/snvs.vcf # basic SNV vcf file
output_folder/refs.mtx # reference counts, Matrix Market file
output_folder/alts.mtx # alternative counts, Matrix Market file
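
If you want a quick sanity check on these files before loading them elsewhere, here is a minimal shell sketch. It assumes the .mtx files follow the standard Matrix Market coordinate layout ('%' header and comment lines, then a "rows cols nonzeros" size line) and that barcodes.txt holds one barcode per line. The helper names mtx_dims and count_lines are made up for this example and are not part of scsnv.

```shell
# Print "rows cols nonzeros" from a Matrix Market coordinate file.
# The size line is the first non-empty line that does not start with '%'.
mtx_dims() {
    awk '!/^%/ && NF { print $1, $2, $3; exit }' "$1"
}

# Count the non-empty lines in a text file (e.g. barcodes.txt).
count_lines() {
    awk 'NF { n++ } END { print n + 0 }' "$1"
}

# Example usage, assuming output_folder from the command above exists:
#   mtx_dims output_folder/refs.mtx
#   mtx_dims output_folder/alts.mtx
#   count_lines output_folder/barcodes.txt
```

The two matrices should report the same rows and cols, and I would expect one of those two dimensions to equal the barcode count. Which one is the barcode axis is not stated here, so check it against your own output.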

Unfortunately, I have not done much work clustering mutations.

GWW / scsnv

How do I... #21