riboviz / example-datasets

Example datasets to run with RiboViz
Apache License 2.0

Add new dataset Waltii #121

Open acope3 opened 1 year ago

acope3 commented 1 year ago

Thanks for starting to add a new dataset to example-datasets! This issue template includes the key steps; see add-new-dataset.md. Please edit as needed for your dataset.

acope3 commented 1 year ago

@davbunn1 @HannahMaroof I have also added the annotations for L. waltii (branch cope-waltii-121)

davbunn1 commented 1 year ago

Working in branch: "cope-waltii-121"
Genus folder within example-datasets: "waltii"
Strain: Lachancea waltii Y-8285 (L. thermotolerans contaminants data already exist in example-datasets)
Data source: EIRNA BIO in 2023 (James Keane, Darren Fenton and others)
Transcriptome annotation courtesy of Alex Cope @acope3
Contaminants fasta file created from NCBI data as detailed in the provenance file
UMIs (N) and barcodes (B) used:

Read structure: NNNN - rpf sequence - NNNNN - BBBBB - Adapter
Barcodes: Rep1 - ATCGT, Rep2 - AGCTA, Rep3 - CGTAA
Adapter sequence: AGATCGGAAGAGCACACGTCTGAA
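
To make the read structure above concrete, here is a minimal Python sketch of how a single read could be split into its parts. It is only an illustration, not how RiboViz does it: the function name `parse_read` and the dictionary keys are hypothetical, it assumes the full adapter is present with no mismatches, and it assumes the barcode matches exactly. Real reads with partial adapters or sequencing errors would need a proper trimming and demultiplexing tool.

```python
import re

# Values taken from the read structure described above.
ADAPTER = "AGATCGGAAGAGCACACGTCTGAA"
BARCODES = {"ATCGT": "Rep1", "AGCTA": "Rep2", "CGTAA": "Rep3"}

# Layout of the adapter-trimmed read:
# 4 nt UMI, footprint (rpf), 5 nt UMI, 5 nt barcode.
READ_PATTERN = re.compile(
    r"^(?P<umi5>[ACGTN]{4})"
    r"(?P<rpf>[ACGTN]+)"
    r"(?P<umi3>[ACGTN]{5})"
    r"(?P<barcode>[ACGTN]{5})$"
)


def parse_read(seq):
    """Split one read into sample, UMI and footprint (hypothetical helper).

    Returns None if the full adapter is missing, the trimmed read is too
    short for the layout, or the barcode is not one of the three known ones.
    """
    adapter_start = seq.find(ADAPTER)  # assumption: exact, full-length adapter
    if adapter_start == -1:
        return None
    insert = seq[:adapter_start]
    match = READ_PATTERN.match(insert)
    if match is None:
        return None
    sample = BARCODES.get(match["barcode"])  # assumption: exact barcode match
    if sample is None:
        return None
    return {
        "sample": sample,
        "umi": match["umi5"] + match["umi3"],  # 9 nt UMI in total
        "rpf": match["rpf"],
    }


if __name__ == "__main__":
    # Made-up read: 4 nt UMI + 28 nt footprint + 5 nt UMI + Rep2 barcode + adapter
    example = "ACGT" + "TTTTGGGGCCCCAAAATTTTGGGGCCCC" + "GATTA" + "AGCTA" + ADAPTER
    print(parse_read(example))
```

The example read at the bottom is invented for illustration; with it, the sketch should assign the read to Rep2 with the combined UMI ACGTGATTA and a 28 nt footprint.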

davbunn1 commented 1 year ago

The test run was successful: all output files were created. The results are not great, though: periodicity is poorly defined, phasing is not as biased as expected, and read lengths are on average a bit shorter than expected. See attached:
read_counts_by_length.pdf
frame_proportions_per_ORF.pdf
metagene_start_stop_read_counts.pdf