Open acope3 opened 1 year ago
@davbunn1 @HannahMaroof I have also added the annotations for L. waltii (branch cope-waltii-121).
Working in branch: "cope-waltii-121"
Genus folder within example-datasets: "waltii"
Strain: Lachancea waltii Y-8285 (existing L. thermotolerans contaminants data on example-datasets already)
Data source: EIRNA BIO in 2023 (James Keane, Darren Fenton and others)
Transcriptome annotation: courtesy of Alex Cope @acope3
Contaminants fasta file: created from NCBI data as detailed in the provenance file
UMIs (N) and Barcodes (B) used:
Read structure: NNNN - rpf sequence - NNNNN - BBBBB - Adapter
Barcodes: Rep1 - ATCGT, Rep2 - AGCTA, Rep3 - CGTAA
Adapter sequence: AGATCGGAAGAGCACACGTCTGAA
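To make the read layout concrete, here is a minimal Python sketch of how one raw read could be split according to the structure above. It is not what the pipeline actually runs: the function name `parse_read` and the exact-match handling of the adapter and barcodes are assumptions for illustration only, and real trimming/demultiplexing tools usually tolerate mismatches.

```python
# Illustrative only: exact-match parsing of the read structure
#   NNNN - rpf sequence - NNNNN - BBBBB - Adapter
# The constants come from this issue; the function itself is hypothetical.

ADAPTER = "AGATCGGAAGAGCACACGTCTGAA"
BARCODES = {"ATCGT": "Rep1", "AGCTA": "Rep2", "CGTAA": "Rep3"}
UMI5_LEN = 4      # NNNN at the 5' end
UMI3_LEN = 5      # NNNNN after the footprint
BARCODE_LEN = 5   # BBBBB just before the adapter


def parse_read(read):
    """Return (sample, umi, rpf) for a raw read, or None if it does not fit."""
    adapter_start = read.find(ADAPTER)
    if adapter_start == -1:
        return None  # adapter not found (exact match only)

    insert = read[:adapter_start]
    # Need both UMIs, the barcode, and at least one footprint base.
    if len(insert) < UMI5_LEN + UMI3_LEN + BARCODE_LEN + 1:
        return None

    barcode = insert[-BARCODE_LEN:]
    sample = BARCODES.get(barcode)
    if sample is None:
        return None  # unknown barcode (no mismatches allowed here)

    umi5 = insert[:UMI5_LEN]
    umi3 = insert[-(BARCODE_LEN + UMI3_LEN):-BARCODE_LEN]
    rpf = insert[UMI5_LEN:-(BARCODE_LEN + UMI3_LEN)]
    # Concatenate both UMI halves into one 9-nt identifier.
    return sample, umi5 + umi3, rpf
```

For example, a read built as `"ACGT" + footprint + "GGCCA" + "ATCGT" + ADAPTER` would be assigned to Rep1 with UMI `ACGTGGCCA`.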
Test run was successful: all output files were created. Results are not great, though: periodicity is poorly defined, phasing is not as biased as expected, and read lengths are on average a bit shorter than expected. See attached:
read_counts_by_length.pdf
frame_proportions_per_ORF.pdf
metagene_start_stop_read_counts.pdf
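For readers less familiar with the "phasing" comment, the sketch below shows one simple way to quantify frame bias. It assumes you already have read positions expressed relative to the first base of each ORF's start codon. The function name `frame_proportions` and the input format are hypothetical, not taken from the pipeline. With strong three-nucleotide periodicity one frame dominates; with no bias each frame sits near 1/3.

```python
from collections import Counter


def frame_proportions(positions):
    """Fraction of reads in frames 0, 1, 2.

    positions: iterable of integer read positions relative to the first
    base of the start codon (0 = first base). How reads are assigned to a
    position (5' end, A-site offset, ...) is left to the caller.
    """
    counts = Counter(pos % 3 for pos in positions)
    total = sum(counts.values())
    if total == 0:
        return (0.0, 0.0, 0.0)
    return tuple(counts[frame] / total for frame in range(3))


# Toy data: four reads in frame 0, one each in frames 1 and 2.
props = frame_proportions([0, 3, 6, 9, 1, 2])
```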
Thanks for starting to add a new dataset to example-datasets! This issue template includes the key steps, see add-new-dataset.md. Please edit as needed for your dataset.
cheng-entamoeba-123: the branch name if the dataset were generated by Dr. Cheng, from entamoeba, and the new issue ticket is number 123.
check_fasta_gff