LoReAn is an automated pipeline for eukaryotic genome annotation. It builds on established annotation rationale and programs; its key improvement is the incorporation of single-molecule cDNA sequencing data, such as those produced by Oxford Nanopore and PacBio platforms. We find this significantly improves automated annotations and reduces the need for time-consuming manual annotation.
We are working to improve the LoReAn documentation. Meanwhile, more information about LoReAn can be found at bioRxiv (earlier version) or Plant Physiology (peer reviewed). For those familiar with the annotation process and with Docker, there should be enough information to run the program. If you have problems, please open an issue.
This is how LoReAn works: (figure: LoReAn schematic view)
LoReAn requires three mandatory files:
To install the software, please see the installation instructions for details.
After installation, the software can be run with:
lorean -pr protein.fasta -sp species genome.fasta
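Because an annotation run can take a long time, it may help to check the inputs before launching it. The following is a minimal sketch, not part of LoReAn: `lorean_cmd` is a hypothetical helper that verifies that the two FASTA files exist and are non-empty, then prints the command shown above instead of running it. It assumes the value given to `-sp` is the species name and that the paths contain no spaces.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with LoReAn): validate inputs, then print
# the lorean command so it can be inspected before it is actually run.
lorean_cmd() {
    local proteins="$1" species="$2" genome="$3"

    # Fail early if either FASTA file is missing or empty.
    for f in "$proteins" "$genome"; do
        if [ ! -s "$f" ]; then
            echo "error: missing or empty file: $f" >&2
            return 1
        fi
    done

    # Only the flags shown in this README (-pr, -sp) are used here.
    printf 'lorean -pr %s -sp %s %s\n' "$proteins" "$species" "$genome"
}
```

Calling `lorean_cmd protein.fasta species genome.fasta` prints the command line if both files are present; the printed line can then be copied into the shell where LoReAn is installed.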
The full list of options can be found in the option instructions or by running:
lorean --help
LoReAn can run BRAKER2 to improve Augustus gene prediction. To do so, short-read or long-read RNA-seq data must be provided.
We have made two datasets available for testing LoReAn. The first consists of Nanopore data from Verticillium dahliae strain JR2, and the second of PacBio data from Plicaturopsis crispa. Both datasets can be downloaded from LoReAn Examples.