labgm / EasySSR

EasySSR: a User-Friendly web application for large-scale batch microsatellite prediction and sample comparison

Read the Paper: https://www.frontiersin.org/articles/10.3389/fgene.2023.1228552/full

Web tool available at: https://computationalbiology.ufpa.br/easyssr/

Tutorials and guides: https://github.com/engbiopct/EasySSR/

Introduction

Microsatellites

Microsatellites, also known as Simple Sequence Repeats (SSRs) or Short Tandem Repeats (STRs), are polymorphic DNA regions consisting of tandem repetitions of a nucleotide motif 1 to 6 bp long, referred to as mono-, di-, tri-, tetra-, penta-, and hexanucleotide repeats, respectively (Pinheiro 2022). They can be categorized as perfect, imperfect, or compound, and are found in both coding and non-coding regions of eukaryotes, prokaryotes, and viruses (Mudunuri 2007, Beier 2017). SSRs have various clinical implications and a broad range of applications in fields such as conservation and evolutionary studies, comparative genomics, molecular biology, biotechnology, oncology, and forensics (Laskar 2022, Pinheiro 2022).

How to use the web server: Quick Tutorial

Input

Enter a project name and upload your FASTA files. Optionally, provide an email address and GenBank files (each must have the same name as its corresponding FASTA file). Keep the default parameters and click the Upload & Run button.

The default parameters are:
- Minimum Repeat Number: Mono-12, Di-6, Tri-4, Tetra-3, Penta-3, and Hexa-3;
- Maximum imperfection: 10%.

Two sample datasets are available for download.
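To illustrate what the minimum repeat numbers mean, the sketch below scans a sequence for perfect SSRs using the default thresholds listed above. It is only an illustration and not EasySSR's own implementation: the function name `find_perfect_ssrs` is hypothetical, only perfect repeats are detected, and the imperfection percentage is not modeled.

```python
import re

# Default minimum repeat numbers from the tutorial, keyed by motif length in bp.
MIN_REPEATS = {1: 12, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def find_perfect_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeat_count) tuples for perfect SSRs.

    Coordinates are 0-based and half-open. Hypothetical helper for illustration.
    """
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least (min_rep - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA" or "ATAT"),
            # since those tracts belong to the shorter motif class.
            is_redundant = any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            )
            if is_redundant:
                continue
            repeat_count = len(match.group(0)) // motif_len
            hits.append((match.start(), match.end(), motif, repeat_count))
    return sorted(hits)


if __name__ == "__main__":
    example = "GG" + "A" * 12 + "CC" + "AT" * 6 + "CC" + "ACG" * 4 + "TT"
    for start, end, motif, count in find_perfect_ssrs(example):
        print(f"({motif}){count} at {start}-{end}")
```

With these thresholds, a run of 11 adenines is not reported, while a run of 12 is reported as a mononucleotide SSR.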

Execution

You are free to invest your time in something useful while waiting for your results.

Outputs

Done!

How to use the web server: Detailed Input tutorial

User Information

Input:

E.g., a user with the project name 'Alves'.

Input files

Note: If you choose to analyze coding/non-coding regions, a GenBank annotation file will be requested.
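Because each GenBank file must have the same name as its corresponding FASTA file, it can help to check the pairing locally before uploading. The following is a minimal sketch under stated assumptions: "same name" is taken to mean the same base name ignoring the extension, the extension sets are guesses rather than a list confirmed by EasySSR, and `unmatched_fasta_files` is a hypothetical helper.

```python
from pathlib import Path

# Assumed file extensions; adjust them to match your own files.
FASTA_EXTS = {".fasta", ".fa", ".fna"}
GENBANK_EXTS = {".gb", ".gbk", ".gbff"}


def unmatched_fasta_files(filenames):
    """Return FASTA file names that have no GenBank file with the same base name."""
    paths = [Path(name) for name in filenames]
    genbank_stems = {p.stem for p in paths if p.suffix.lower() in GENBANK_EXTS}
    return sorted(
        p.name
        for p in paths
        if p.suffix.lower() in FASTA_EXTS and p.stem not in genbank_stems
    )


if __name__ == "__main__":
    files = ["sampleA.fasta", "sampleA.gbk", "sampleB.fasta", "sampleC.gb"]
    print(unmatched_fasta_files(files))
```

An empty result means every FASTA file in the list has a GenBank file with a matching base name.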

Default Parameters

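The default parameters are based on Pinheiro (2022): minimum repeat numbers of Mono-12, Di-6, Tri-4, Tetra-3, Penta-3, and Hexa-3, with a maximum imperfection of 10%.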

To keep most of the default values but change some of them, use the custom parameters instead.

Custom parameters

Help text for each parameter can be accessed by clicking the (i) button.

How to run in Misa-mode

FAQ

Input file not working

Problems with fasta file

Please verify:

Problems with genbank file

If you selected the analysis of coding/non-coding regions and are having problems with the GenBank file, please verify:

Output: Issues with graphs

Graph for Imperfect SSR empty:

EasySSR is stuck in Step1, Step2 or Step3.

Execution time depends on the size, complexity, and number of genomes queued, as well as on the computational availability of the server where the data is processed. It is normal for an analysis to take seconds, minutes, or hours, depending on the input. We recommend first testing EasySSR with a single genome (faster execution) to check that the tool is working properly. If it is, it is advisable to wait until the analysis is done or an error message appears. Refreshing the page will stop the processing, and you will have to restart from the data upload.