caseywdunn / sharkmer


in silico rad seq #5

Open caseywdunn opened 7 months ago

caseywdunn commented 7 months ago

Work on this is in branch rad.

Need to decide on an output format. This corresponds to the step 5 output of ipyrad: https://ipyrad.readthedocs.io/en/latest/7-outline.html#consensus-base-calling-and-filtering

caseywdunn commented 7 months ago

From https://ipyrad.readthedocs.io/en/latest/tutorial_intro_cli.html:

"You can see that all loci within each sample have been reduced to one consensus sequence. Heterozygous sites are represented by IUPAC ambiguity codes (find the K in sequence 1A_0_1), and all other sites are homozygous."

So for each combination of start and end kmer, I should export a single consensus sequence in which SNP bubbles have been collapsed to IUPAC ambiguity codes.
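
A minimal sketch of that collapse step, in Python for illustration only (it is not code from the rad branch). It assumes each bubble has already been resolved into equal-length allele sequences spanning the same start and end kmer, so only substitutions are handled, not indels. The names `IUPAC` and `collapse_bubble` are hypothetical.

```python
# Standard IUPAC nucleotide codes, keyed by the set of bases seen at a site.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def collapse_bubble(alleles):
    """Collapse equal-length allele sequences into one consensus string.

    Sites where all alleles agree keep their base; sites where they differ
    get the matching IUPAC ambiguity code. Anything unrecognized becomes N.
    """
    if not alleles:
        raise ValueError("need at least one allele")
    length = len(alleles[0])
    if any(len(a) != length for a in alleles):
        # Assumption: SNP bubbles only, so all paths have the same length.
        raise ValueError("alleles must all have the same length")
    columns = zip(*(a.upper() for a in alleles))
    return "".join(IUPAC.get(frozenset(col), "N") for col in columns)


if __name__ == "__main__":
    # Two paths through a bubble that differ at one G/T site.
    print(collapse_bubble(["ACGTA", "ACTTA"]))
```

With the two example alleles, the G/T site collapses to K, the same code the ipyrad tutorial points out in its consensus output.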