Closed by kguynes 2 weeks ago
Hi,
It looks like there might be an issue with the FASTA headers that faswap.py can't deal with. In this particular case, it seems some non-UTF-8 characters are being detected. What do the headers look like? Are there any unusual characters in them?
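For anyone wanting to check their own file before replying, here is a minimal sketch of one way to list headers that might be worth a closer look. It is not part of EarlGrey or faswap.py; the function name `find_suspect_headers` and the three checks (invalid UTF-8, non-ASCII characters, whitespace) are assumptions chosen for illustration, not a statement of what faswap.py actually rejects.

```python
def find_suspect_headers(path):
    """Return (line_number, reason, raw_header) tuples for FASTA headers
    that may be worth inspecting. Hypothetical helper, not an EarlGrey API."""
    suspects = []
    # Read in binary mode so invalid bytes do not raise while iterating.
    with open(path, "rb") as handle:
        for line_number, raw in enumerate(handle, start=1):
            if not raw.startswith(b">"):
                continue  # sequence line, skip
            raw = raw.rstrip(b"\r\n")
            try:
                text = raw.decode("utf-8")
            except UnicodeDecodeError:
                suspects.append((line_number, "not valid UTF-8", raw))
                continue
            if not text.isascii():
                suspects.append((line_number, "non-ASCII character", raw))
            elif any(ch.isspace() for ch in text):
                suspects.append((line_number, "whitespace in header", raw))
    return suspects


if __name__ == "__main__":
    import sys

    for line_number, reason, raw in find_suspect_headers(sys.argv[1]):
        print(f"line {line_number}: {reason}: {raw!r}")
```

Run it as `python check_headers.py genome.fa` (both file names are placeholders). An empty result only means none of these three checks fired; it does not guarantee the headers are acceptable to the pipeline.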
Dear @TobyBaril,
Thank you for your response. I've copied an example of the header below. Could it be the space and the length parameter included in the header?
>Cang_2012_03_13_00002 length=637461
TAATATTGAATTATAGCTGTCCACGTATTTATGGCGCACCCTGTAATGTGGTGATTGCAA
TACCAAATGTTGAAAATACAGTATGATAGATTTGAAATGTCAAATCCATTGTAATTAGAA
GATTGAGAATGATAGATTGTGGGAAGAGCTTCCCCTTATTTAATATGTTATGATTTATGT
TCAAAAACAAAAAGTACAAAACTACAAAACGATATGATGTTCAGCCTTTTTTCCACCAAA
AATATATAATGCAAAAATTATAGCGGTATCAAGCTTAAGGTATTTGAATAAAAATCAGAA
CTGGTTGTGAAAAGTACTATTTCTGAGGGTGAGTACAAGAATCAAACAAGTCTATTTTTC
TGTAATTCTCTGGAGATTGTTAATTTAGATGCAAAATATTATTCTAAAAGAATTTCATTT
As per your suggestion, I have amended the FASTA header and the pipeline seems to run just fine. I will report back if anything else goes wrong. Many thanks for your help!
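The thread does not record exactly how the header was amended. One common approach, sketched here purely as an assumption, is to keep only the sequence ID before the first whitespace, so that `>Cang_2012_03_13_00002 length=637461` becomes `>Cang_2012_03_13_00002`. The function name `simplify_fasta_headers` is hypothetical and not part of EarlGrey.

```python
def simplify_fasta_headers(src_path, dst_path):
    """Copy a FASTA file, keeping only the first whitespace-delimited token
    of each header and dropping any non-ASCII bytes from it.
    Returns the number of sequences written. Hypothetical helper."""
    seen = set()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for raw in src:
            if raw.startswith(b">"):
                fields = raw[1:].split()  # splits on ASCII whitespace
                if not fields:
                    raise ValueError("empty FASTA header")
                seq_id = fields[0].decode("ascii", errors="ignore")
                # Refuse to continue if trimming made an ID empty or non-unique.
                if not seq_id or seq_id in seen:
                    raise ValueError(
                        f"empty or duplicate ID after simplifying: {seq_id!r}"
                    )
                seen.add(seq_id)
                dst.write(b">" + seq_id.encode("ascii") + b"\n")
            else:
                dst.write(raw)  # sequence lines are copied unchanged
    return len(seen)
```

Writing to a new file rather than editing in place keeps the original download intact, and the duplicate check guards against two headers that differed only in the part that was trimmed.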
Hi,
Many thanks for creating a wonderful tool. Unfortunately, I have attempted to run this pipeline several times using the genomic FASTA file I acquired from https://parasite.wormbase.org/Caenorhabditis_angaria_prjna51225/Info/Index, to no avail.
I'll copy the error messages here for clarity:
I ran the RepeatModeler tool to build a database and compute TE annotation, to test whether this is indeed an issue with the FASTA file, but it seems to run without an issue. I'm not sure how to mitigate the issue I'm running into with the EarlGrey tool. Any help/pointers would be greatly appreciated.
Thank you very much in advance!