Open ArtPoon opened 3 years ago
Some unit tests/synthetic data started in 9eeaf51
Eventually I think we'll need to simulate data for a comparison of pipelines against some ground truth
Proposal:
Use seqtk subseq to extract from sequences.fasta (doesn't work with sequences.fasta.xz, must unzip first). data/get-sequences-with-pangolineage.sh takes over an hour to run on Rei because of unzipping/zipping.
A computationally faster method might be to calculate all mutations from sequences_pangolin (encode_diffs) ahead of time, sample amplicon regions, simulate coverage, then reconstruct the sequence within coverage regions from the reference.
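A rough sketch of the faster method, to make the steps concrete. Everything here is an assumption for illustration: the diff tuple format (`('~', pos, alt)` for a substitution, `('-', pos, length)` for a deletion, 0-based reference coordinates) is hypothetical and may not match what encode_diffs actually produces, and `sample_amplicons` and `reconstruct` are made-up helper names. Insertions are ignored to keep coordinates simple.

```python
import random


def sample_amplicons(ref_len, amplicon_len, n, rng):
    """Sample n amplicon intervals as half-open (start, end) pairs,
    with starts drawn uniformly along the reference (hypothetical helper)."""
    starts = [rng.randrange(0, ref_len - amplicon_len + 1) for _ in range(n)]
    return [(s, s + amplicon_len) for s in starts]


def reconstruct(reference, diffs, regions):
    """Rebuild a sequence from the reference plus precomputed diffs,
    reporting bases only inside covered regions and 'N' elsewhere.

    diffs: assumed list of ('~', pos, alt) or ('-', pos, length) tuples.
    regions: list of half-open (start, end) intervals with coverage.
    """
    # Mark which reference positions fall inside at least one region.
    covered = [False] * len(reference)
    for start, end in regions:
        for i in range(start, end):
            covered[i] = True

    # Apply the diffs in reference coordinates.
    seq = list(reference)
    for kind, pos, value in diffs:
        if kind == '~':
            seq[pos] = value
        elif kind == '-':
            for i in range(pos, pos + value):
                seq[i] = '-'

    # Mask everything outside coverage.
    return ''.join(b if covered[i] else 'N' for i, b in enumerate(seq))


if __name__ == "__main__":
    rng = random.Random(1)
    ref = "ACGTACGTAC"
    regions = sample_amplicons(len(ref), amplicon_len=4, n=2, rng=rng)
    print(regions)
    print(reconstruct(ref, [('~', 2, 'T'), ('-', 5, 2)], regions))
```

Because the diffs are computed once, each simulated sample only needs a pass over the reference, with no unzipping or sequence extraction per sample. Simulating read depth within each amplicon would be a separate step on top of this.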
Unit test needed for #46
Please focus on minimap2.py and estimate-freqs.R
@SandeepThokala thanks
@SandeepThokala to post coverage of unit tests
We need 'em