Align and Call Consensus for Curation

I am trying to curate RepeatModeler output using the guidelines in this paper (https://currentprotocols.onlinelibrary.wiley.com/doi/full/10.1002/cpz1.154). In particular, I am focused on curating LINE elements in a newly assembled corn snake genome to ensure that each consensus is fully extended to the bounds of the TE. We are studying recombination events in snakes, and snake LINE elements have been active fairly recently, so it is important that we identify LINE regions in our genome as accurately as possible.

So far, I have run RepeatModeler on the corn snake genome and then run RepeatMasker with those results. I am attempting curation on the LINE family that has the most hits in the RepeatMasker run. I have a file with the consensus sequence of this family generated by RepeatModeler (head of file):

>ltr-1_family-4#LINE/CR1 [ Type=LTR, Final Multiple Alignment Size = 5 ]
ATGCTTTCAGTCTGCTGAGCTACCAGGCCTGTTGCCACAAAGAAGAGCGG
GTCAACTTATTTTCCAAACCACCAGAAGGGCAGGCCATGAAACAATGGAT
GGATGGAAACTAATTGAGGAGAGAAGCAACCTGGAATTAAGGAGAAACTT
CCTAACAGTGAGGACAATTTACCAGTGGAACGGCTTGCCATGAGNAGNTG
TGGGCGCTCCATCACTGGAGGCTTTTAAGAAGAGACTGGACAGCCNCCCG
TCTGAAACGGTACAGGNTCTCCTGCTTGAGCGGGGGGCTGGACTAGAAGA
CCTCCAAGGTCCCTTCCGNCTCGTCCATTCTGTANCACACACGCACCCCC
ACAGATGGCCCGGAGTTAATAAGCCACACTACAAAACTCTTTGAAGATAA
AGCTAGCAGCAACCCAGTGGCTAGCTGCCAATTCAGACTTTACTCACACA

and a file with all instances of this family from the RepeatMasker output (head of file):

>Super_scaffold_1:24374-24588
ctgcatttggactaatccttgtattgcggaaactttgcctgctttatcggaatgcttgcagtctaatctttgttttgtgtgagtaaagtctgaattggcagctagccactgggttgctgccagctttatcttcaaagattttgtcgtgtggcttattaactctgggccgtctgtgggggtgcgtgtgtgggacggggacaaaacaggtccttggg
>Super_scaffold_1:27356-27776
catgatggcgaacctatggcatgcgtgccacaggtggcacgcggagccatatcagtaggcacgcaagctcagctctggcacacatgcgcgcaccagccagctgattttcaggcctttcaggcccactggaagtcggcaaacaggctatttccggccttcggagagcctctagggagctggagaaggtcattttcgccctccccaggctcctagaaaggctctggagcctggggagggcgaaaaacgggcctaccggggccaccatgccatcgcgtgccaaaagtggggggagtgcagggggggcggtcacgcacacatgcacggggtgcattgaattatgggtgtgggcacacacccaagcgaccccgctgcgctcctcccgcttttggcacgtgatggcaaaaaggttagccatcactgt

I ran alignAndCallConsensus.pl -c family4_con.fa -e family4_elements.fa -int in the TETools Singularity container, and the program has been running for over an hour without producing a suggested refinement for the first iteration. When I ran the same program with the example files provided in the paper, everything worked perfectly.
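Since this is the family with the most RepeatMasker hits, one thing I can check on my side is how many instances are in the elements file and how long they are. Below is a minimal sketch of that check in plain Python, assuming the elements file is ordinary FASTA. The helper names (read_fasta, summarize, subsample) are my own and are not part of RepeatModeler or TETools, and whether the instance count is actually what slows down alignAndCallConsensus.pl is only my guess.

```python
import random
from statistics import median


def read_fasta(lines):
    """Parse an iterable of FASTA text lines into (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def summarize(records):
    """Return the number of records and basic length statistics."""
    lengths = [len(seq) for _, seq in records]
    return {
        "count": len(lengths),
        "min": min(lengths),
        "median": median(lengths),
        "max": max(lengths),
    }


def subsample(records, n, seed=0):
    """Return at most n records, chosen reproducibly (hypothetical helper)."""
    if len(records) <= n:
        return list(records)
    return random.Random(seed).sample(records, n)


# Tiny in-memory demo; the headers and sequences here are made up.
demo = """>a:1-8
acgtacgt
>b:1-4
ac
gt
>c:1-6
acgtac
""".splitlines()
demo_records = read_fasta(demo)
demo_summary = summarize(demo_records)

# On the real data this would be something like:
# with open("family4_elements.fa") as fh:
#     print(summarize(read_fasta(fh)))
```

If the count turns out to be very large, subsample() could write out a smaller test set to see whether the run time changes, though I do not know if subsetting is the recommended approach for this tool.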

So to my questions:

1) Am I understanding the necessity of curation correctly here?
2) Why might alignAndCallConsensus be taking so long on my LINE family?
3) Are there any special characteristics of LINE families that I might be missing that are relevant here?

I am quite new to TE biology and identification, so I hope I am not missing anything obvious here!

Thank you in advance!
