Closed by isgilman 5 years ago
This shouldn't have anything to do with your upgrade from Dfam 2.0 to 3.0, although I will address that at the end. Refiner is the component of RepeatModeler that is handed pre-clustered instances of a single TE family and is responsible for aligning them and refining that alignment. At the end it should call a consensus from the final alignment. It does not use RepeatMasker or any of its libraries for this task.
I think the problem precedes this step. Something really glaring in the log you posted is that you reached Round #4 of RepeatModeler without accumulating any consensi capable of masking the sample for this round ( "Masked: 0.00 %" ). I would like to look into this further. Could you send me ( rhubley@systemsbiology.org ) your full log output and, ideally, the "RM_###*" directory, tar'd and gzipped?
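For anyone hitting the same symptom, here is a rough sketch of how one might check a saved log for rounds that masked nothing and then bundle the working directory for sending. The function names and the idea of a saved log file are assumptions for illustration and are not part of RepeatModeler; only the "Masked: 0.00 %" string and the RM_ directory prefix come from this thread, and the exact spacing in a real log may differ.

```shell
#!/usr/bin/env bash
# Sketch only: these helpers are hypothetical, not RepeatModeler tools.

# Print the number of log lines reporting that a round masked 0.00 %.
count_unmasked_rounds() {
    local log_file="$1"
    # -c counts matching lines; the pattern tolerates varying whitespace.
    # grep exits non-zero when nothing matches, so "|| true" keeps this quiet.
    grep -cE 'Masked:[[:space:]]*0\.00[[:space:]]*%' "$log_file" || true
}

# Pack a RepeatModeler working directory (the one named RM_...) into a
# gzipped tarball. The output name defaults to <directory>.tar.gz.
pack_run_dir() {
    local run_dir="$1"
    local out="${2:-${run_dir%/}.tar.gz}"
    tar -czf "$out" "$run_dir"
}
```

After sourcing the file, running count_unmasked_rounds on the saved log prints how many rounds reported no masking, and running pack_run_dir on the RM_ directory from the run writes a .tar.gz next to it.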
As for upgrading from Dfam 2.0 to 3.0, I would recommend not simply replacing the Dfam.hmm file but rather downloading the latest RepeatMasker package, 4.0.9-p2 ( which includes it ). Dfam 3.0 is not compatible with previous versions of RepeatMasker. FYI, RepeatModeler only uses RepeatMasker for a few steps. It reuses some utility modules for masking tandem repeats in genome samples and for running RMBlast/ABBlast all-vs-all searches, and it uses the RepeatMasker libraries in the RepeatClassifier step at the very end of a RepeatModeler run.
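Since Dfam 3.0 needs a newer RepeatMasker, a quick guard before dropping in a new library might look like the sketch below. The helper version_at_least is hypothetical, and it relies on sort -V (available in GNU coreutils, and assumed to be present on your system). How you obtain the installed version is left to you; RepeatMasker -v is assumed here to report it, so check that against your own install.

```shell
#!/usr/bin/env bash
# Sketch only: version_at_least is a hypothetical helper, not a RepeatMasker tool.

# Succeed when version $1 is greater than or equal to version $2.
# Intended for plain dotted versions such as 4.0.8 or 4.0.9.
version_at_least() {
    local have="$1" need="$2"
    # sort -V orders version strings; if the required version sorts first
    # (or the two are equal), the installed version is new enough.
    [ "$(printf '%s\n%s\n' "$need" "$have" | sort -V | head -n 1)" = "$need" ]
}
```

For example, version_at_least 4.0.8 4.0.9 fails, so a wrapper script could stop there and ask for a RepeatMasker upgrade rather than unpacking the new Dfam.hmm into an older install.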
Thanks for the quick reply! It's great to know how the tools call each other. I'll get a tarball over to you and work on installing 4.0.9-p2. I've been using 4.0.8 as part of a conda install of FUNannotate, which I know is a double-edged sword, especially when the developers aren't building the module.
Hello, I'm trying to run the RepeatModeler + RepeatMasker process after updating Dfam from 2.0 to 3.0. I hit no issues running with Dfam 2.0, but after downloading the new Dfam.hmm.gz and unzipping it in the RepeatMasker/Libraries directory I'm having an issue generating a consensus file. BuildDatabase ran fine, but when I reran my old batch file that had previously worked (RepeatModeler -pa 36 -engine ncbi -database Portulaca_amilis), Refiner could not find a family for any of the families found by RECON.
The first change from 2.0 to 3.0 noted in the Dfam 3.0 documentation seems relevant, because it appears we're finding sequences but failing to assign them consensus sequences for output. I noticed some commits related to the updated database on the RepeatMasker repo, but I'm not sure what the status is for the pipeline overall. Am I interpreting this correctly, or is this another issue with RepeatMasker or Refiner?
Thanks, Ian