Program call and Python/R script
Input: FASTA file with the sequences
Output: CSV file with the sequence identifier and the percentage of each sequence that is low complexity. Use the program dustmasker for this, with the score threshold for subwindows set to 15 (option -level 15). With FASTA output selected (option -outfmt fasta), dustmasker writes the low-complexity regions in lower case. The number of lower-case letters is then divided by the length of the sequence. This calculation should be done in an R or Python function.
Dustmasker can be installed using conda (it ships with the BLAST+ package, available as blast on the bioconda channel).
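The steps above could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a reference implementation: the function names, the CSV column names, the rounding to two decimals, and the choice to take the first word of the FASTA header as the identifier are all illustrative. It also assumes that "percentage" means the lower-case fraction multiplied by 100, and that dustmasker is on the PATH and is called with -outfmt fasta so that masked regions appear in lower case.

```python
import csv
import subprocess


def run_dustmasker(fasta_in, fasta_out, level=15):
    """Call dustmasker so that low-complexity regions are written in lower case.

    Assumes the dustmasker executable (part of BLAST+) is on the PATH.
    """
    subprocess.run(
        ["dustmasker", "-in", fasta_in, "-out", fasta_out,
         "-outfmt", "fasta", "-level", str(level)],
        check=True,
    )


def read_fasta(path):
    """Yield (identifier, sequence) pairs from a FASTA file."""
    identifier, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if identifier is not None:
                    yield identifier, "".join(chunks)
                # Assumption: the identifier is the first word of the header.
                parts = line[1:].split()
                identifier, chunks = (parts[0] if parts else ""), []
            else:
                chunks.append(line)
    if identifier is not None:
        yield identifier, "".join(chunks)


def low_complexity_percentage(sequence):
    """Return the share of lower-case (masked) letters, in percent."""
    if not sequence:
        return 0.0
    lower = sum(1 for base in sequence if base.islower())
    return 100.0 * lower / len(sequence)


def write_low_complexity_csv(masked_fasta, csv_out):
    """Write one CSV row per sequence: identifier and masked percentage."""
    with open(csv_out, "w", newline="") as handle:
        writer = csv.writer(handle)
        # Column names are illustrative, not prescribed by the task.
        writer.writerow(["sequence_id", "low_complexity_percent"])
        for identifier, sequence in read_fasta(masked_fasta):
            percent = round(low_complexity_percentage(sequence), 2)
            writer.writerow([identifier, percent])
```

A typical run would call run_dustmasker("input.fasta", "masked.fasta") followed by write_low_complexity_csv("masked.fasta", "low_complexity.csv"); the file names here are placeholders. If the plain fraction is wanted instead of a percentage, the factor of 100 can be dropped.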
Source Paper: HuntMi: an efficient and taxon-specific approach in pre-miRNA identification