jbloom closed 4 years ago
In the notebook process_ccs.ipynb, the schematics of the amplicon constructs appear to have some bugs; perhaps this is past where you did the troubleshooting?
PacBio_amplicons.gb file and they appear correct there.
@tylernstarr, thanks for catching these. The redundant README lines are now removed in e42d90d.
As for the site mis-labeling in the images, this is actually a bug in dna_features_viewer. The sites are not actually wrong; the tick labels are just rendered incorrectly. I've submitted a pull request to dna_features_viewer (see here) to fix that, so once they merge that request we can fix the numbering in the images.
I'm pretty sure the lengths are correct; it's just that the labeling is not very clear. The labels are above the images, so GD-Pangolin is actually the second image and HKU3-1 is the third, and once you notice this, GD-Pangolin is longer than HKU3-1 as expected. I agree the titles are not ideally located and the title is missing for the last image, but I think they are all correct, just badly formatted.
If OK with you, I'd suggest we merge this even with the problematic image formatting, and then when the numbering is fixed by my dna_features_viewer pull request, I can work on re-formatting the titles too. But it should not matter for actual analyses.
The ./data/PacBio_amplicons.gb file now contains all the different potential amplicons with appropriate names.
The process_ccs.ipynb notebook reads in this full set of amplicons as potential targets.
The ./data/README.md has been updated to better describe this and the other input data.
Note: the full pipeline is not yet set up to handle multiple amplicons, so it will break somewhere midway through process_ccs.ipynb.
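For anyone who wants to sanity-check the names and lengths of the amplicons in a multi-record GenBank file without relying on the rendered images, here is a minimal sketch. It is not how process_ccs.ipynb reads the file; it is a standard-library-only illustration. The function name `list_genbank_records` and the toy records are made up for the example, and it assumes each record starts with a `LOCUS` line, has an `ORIGIN` section, and ends with `//`.

```python
import re


def list_genbank_records(text):
    """Return (name, sequence_length) for each record in GenBank-format text.

    Hypothetical helper for illustration only: it reads the record name
    from the LOCUS line and counts sequence letters in the ORIGIN section.
    """
    records = []
    name = None
    in_origin = False
    seq_len = 0
    for line in text.splitlines():
        if line.startswith("LOCUS"):
            # Second whitespace-separated field of the LOCUS line is the name.
            name = line.split()[1]
            in_origin = False
            seq_len = 0
        elif line.startswith("ORIGIN"):
            in_origin = True
        elif line.startswith("//"):
            # "//" terminates a record.
            if name is not None:
                records.append((name, seq_len))
            name = None
            in_origin = False
        elif in_origin:
            # Sequence lines look like "        1 atgcatgcat gc";
            # count letters only, ignoring position numbers and spaces.
            seq_len += len(re.sub(r"[^A-Za-z]", "", line))
    return records


# Toy two-record input (made-up names and sequences, not the real amplicons).
example = """\
LOCUS       ampliconA    12 bp    DNA    linear
ORIGIN
        1 atgcatgcat gc
//
LOCUS       ampliconB     8 bp    DNA    linear
ORIGIN
        1 atgcatgc
//
"""

for name, length in list_genbank_records(example):
    print(name, length)
```

To run it on the real file, pass the contents of ./data/PacBio_amplicons.gb as `text`. For anything beyond a quick check, a full parser such as Biopython's `Bio.SeqIO.parse(path, "genbank")` is likely the more robust choice.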