Status: Open. spond opened 8 years ago
2016-11-16 update: qfilt, bealign, and tn93 have been added to the test toolshed hosted by the core Galaxy team at PSU. The tools currently reside in the veg fork of the tools-iuc repo and are installed on local Galaxy instances in the VEG group. All planemo tests pass, and bioext's Python packaging bug (users installing through pip would hit "header not found" errors) has been resolved.
@stevenweaver top!!!
@bgruening this progress wouldn't have been possible without @davebx
That's what makes Galaxy so great: community! @davebx thanks a bunch! And let me know if you need help with the pending conda package.
2/20 Meeting Notes @nekrut @davebx
NEXT MEETING FRI @ 3 (March 2)
@nekrut: here's the paper on PRIMER ID processing: http://jvi.asm.org/content/89/16/8540.full.pdf+html
By my estimate, bealign is IUC ready; TN93 and qfilt need a few best-practices tweaks.
Examples of HIV processing pipeline
An example paper is Gianella et al. 2016
Another example, which now uses full-length genome data, is Zanini et al. 2015.
The basic workflow
Meeting 5/30
Meeting 6/6
Meeting June 20:
Meeting July 18:
Following our discussion with @nekrut, I am creating an issue to outline essential components for creating a barebones HIV (or other short RNA virus) genomic analysis pipeline.
Key steps for amplicon data (a very preliminary outline: many steps still need to be fleshed out, and the initial focus is on wrapping existing tools that we have worked with in the past)
Initial QC on NGS data
(If present in run) random tag (PRIMER ID) processing
Map to reference (amplicon data), FASTQ => BAM
Filter errors (amplicon data). Need more options here
Basic post processing based on TN93
Basic phylogenetics (need to pull out some code from https://github.com/veg/HIV-NGS)
HIV clustering
A few references to existing (published) pipelines
WRAP DATAMONKEY INTO GALAXY
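To make the "post processing based on TN93" and "HIV clustering" steps in the outline above a bit more concrete, here is a minimal sketch of one common approach: link any two sequences whose pairwise distance is at or below a threshold, then report the connected components as clusters. Everything named here is an assumption for illustration, not what the wrapped tools do: the function `cluster_by_distance` is hypothetical, the input is assumed to be `(id1, id2, distance)` tuples already parsed from whatever the distance step emits, and the `0.015` threshold is just a placeholder value.

```python
from collections import defaultdict


def cluster_by_distance(pairs, threshold=0.015):
    """Group sequence IDs into clusters (hypothetical helper).

    `pairs` is an iterable of (id1, id2, distance) tuples. Two sequences
    are linked when their distance is <= `threshold`; clusters are the
    connected components of the resulting graph.
    """
    parent = {}

    def find(x):
        # Register unseen IDs as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for id1, id2, dist in pairs:
        root1, root2 = find(id1), find(id2)
        if dist <= threshold and root1 != root2:
            parent[root2] = root1  # merge the two components

    clusters = defaultdict(set)
    for seq_id in list(parent):
        clusters[find(seq_id)].add(seq_id)

    # Largest clusters first, ties broken alphabetically, for stable output.
    return sorted((sorted(c) for c in clusters.values()),
                  key=lambda c: (-len(c), c))


# Made-up distances, purely to show the call.
example_pairs = [
    ("s1", "s2", 0.010),
    ("s2", "s3", 0.012),
    ("s1", "s3", 0.020),
    ("s4", "s5", 0.050),
]
example_clusters = cluster_by_distance(example_pairs)
```

Note that s1 and s3 end up in the same cluster even though their direct distance exceeds the threshold, because both link to s2. That single-linkage behavior is a design choice worth discussing before we settle on a wrapper.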