The attached file contains an HTML page with a prototype of the genome table design. The table contains columns displaying:
The sample name
The haplotype name
The haplotype abundance within that sample
The genotype/strain name
A link to the GenBank record for that genotype
The annotated sequence of the haplotype
The sequence is color-coded by base, with any variant positions in each haplotype sequence highlighted. It also scrolls sideways.
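To make the sequence column concrete, here is a minimal sketch of how one aligned sequence could be rendered as color-coded HTML with variant positions marked. The palette, the `render_sequence` name, and the `base`/`variant` CSS classes are assumptions for illustration and are not taken from the prototype.

```python
from html import escape

# Hypothetical palette; the prototype's actual colors are not stated in this issue.
BASE_COLORS = {"A": "#5cb85c", "C": "#5bc0de", "G": "#f0ad4e", "T": "#d9534f"}


def render_sequence(seq: str, reference: str) -> str:
    """Return HTML spans for seq, marking positions that differ from reference."""
    if len(seq) != len(reference):
        raise ValueError("sequence and reference must be the same aligned length")
    spans = []
    for base, ref_base in zip(seq.upper(), reference.upper()):
        # Gaps and ambiguity codes fall back to grey.
        color = BASE_COLORS.get(base, "#cccccc")
        css_class = "base variant" if base != ref_base else "base"
        spans.append(
            f'<span class="{css_class}" style="background:{color}">{escape(base)}</span>'
        )
    # overflow-x:auto lets a long sequence scroll sideways inside its table cell.
    return '<div style="overflow-x:auto;white-space:nowrap">' + "".join(spans) + "</div>"
```

A stylesheet could then outline or bold anything with the `variant` class so differences from the reference stand out.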
This graph should go front-and-center on the home page of The Visualizer.
More Info
Context
Dr. Palinski wanted easy-to-read genome calls. It took me a while to figure out the best place to put them. Bill and Dana like pretty graphs. They couldn't really tell me what the graphs should look like, so I guessed and came up with this. It is very information-dense and should please everyone.
Possible implementation
To pull this off, we will need to:
Convert all haplotype YAMLs into haplotype fastas, while maintaining frequency data
haplotyping:HAPLINK_FASTA currently does the conversion, while SIMULATED_READS:HAPLOTYPE_DEPTH calculates depth from single-haplotype YAML files. This split will need to be rethought.
Concatenate the following for all samples:
Haplotype fastas + frequencies
Consensus sequences
Perform alignment of each sequence to the reference genome of params.genome
Each of these aligned sequences needs to be exactly the same length, so a multi-alignment using MAFFT might be the best option
In the case of multi-alignment, conversion into a metadata-rich format like Nexus might be useful for maintaining frequency data
Take every one of those sequences, associate it back to its sample, and print it to an HTML table
We could make this on-the-fly in Node.js, but I think it would be far better to create the table on pipeline run, then <iframe> or include it in The Visualizer statically.
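The last two steps could be sketched roughly as follows, assuming the aligned sequences arrive as one FASTA file. The `sample|haplotype|frequency` header convention, the `Record` type, and all function names are hypothetical; the pipeline does not currently define them, and frequency could equally be carried in a Nexus file or a sidecar table.

```python
from html import escape
from typing import NamedTuple


class Record(NamedTuple):
    sample: str
    haplotype: str
    frequency: float
    sequence: str


def parse_aligned_fasta(text: str) -> list[Record]:
    """Parse FASTA whose headers follow the assumed 'sample|haplotype|frequency' form."""
    records = []
    header, chunks = None, []
    for line in text.splitlines() + [">"]:  # trailing sentinel flushes the last record
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                sample, haplotype, freq = header.split("|")
                records.append(Record(sample, haplotype, float(freq), "".join(chunks)))
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    return records


def check_equal_length(records: list[Record]) -> int:
    """Raise unless every aligned sequence has the same length; return that length."""
    lengths = {len(r.sequence) for r in records}
    if len(lengths) != 1:
        raise ValueError(f"aligned sequences differ in length: {sorted(lengths)}")
    return lengths.pop()


def build_table(records: list[Record]) -> str:
    """Render one static HTML table row per sequence, tied back to its sample."""
    check_equal_length(records)
    rows = [
        "<tr>"
        f"<td>{escape(r.sample)}</td>"
        f"<td>{escape(r.haplotype)}</td>"
        f"<td>{r.frequency:.1%}</td>"
        f"<td><code>{escape(r.sequence)}</code></td>"
        "</tr>"
        for r in records
    ]
    head = "<tr><th>Sample</th><th>Haplotype</th><th>Abundance</th><th>Sequence</th></tr>"
    return "<table>\n" + "\n".join([head] + rows) + "\n</table>"
```

Writing the string returned by `build_table(parse_aligned_fasta(...))` to a file during the pipeline run would give The Visualizer a static page to embed, which matches the preference above for not building the table on the fly.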
Summary
Place a table on the front page of The Visualizer showing which genotype/strain each sample's consensus sequence and haplotypes BLAST toward.
Added Features
Additional processes
This feature may need to be implemented in Python/R/Julia within a new process block. It should not require any new tools, however.
Additional visualizer section
seq graph.zip